NC Bench
Run: 2026-02-16-01KHM5S1WV72CAN8PKA5Q4DZAR
Run Details
Model: DeepSeek V3 (2025-03-24)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 14.1s
Cost: $0.000544
Provider: AtlasCloud
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The dim glow of a single bulb flickered above Molly and Bill as they hunched over a hand-drawn map of the suburban cul-de-sac. The air smelled of stale coffee and desperation.
“So let me get this straight,” Bill said, rubbing his temples. “We’re breaking into a six-year-old’s birthday party to steal presents?”
Molly smirked, twirling a lock of her tangled red hair. “Not just any presents. Premium loot. Have you seen the toy aisle lately? Legos? Those little robot dogs? We’re talking a gold mine, Billy-boy.”
“We’ve hit banks, art galleries—”
“And how’d that work out?” Molly snapped. “Last time, we barely escaped with a painting of some sad clown.”
“It was a Picasso!”
“It was depressing!” She jabbed a finger at the map. “This is foolproof. No guards, no lasers, just a bunch of sugared-up kids and a stressed-out mom. Cake and grab, easy.”
Bill sighed, scratching the scruff on his chin. “What’s the play, then?”
Molly grinned. “You’re the clown.”
“What?”
She tossed a neon wig at him. “You’ve got the face for it. You’ll entertain the kids—juggle, make balloon animals, whatever—while I sneak off and load up the getaway wagon.”
“The *station wagon*.”
“Yes, the *station wagon*,” Molly said, rolling her eyes. “It’s a full-blown kids’ party, Bill. You think we’re rolling up in a black van? We need plausible deniability. ‘Oh no, officer, we’re just *very* enthusiastic party volunteers.’”
Bill squinted at the wig like it had personally offended him. “I don’t juggle.”
“Then learn. By tomorrow.”
He tossed the wig back. “This is a low, Molly. Even for us.”
“Oh, please,” she scoffed. “We’re thieves, not philosophers. Are you in or are you out?”
Silence stretched between them. Outside, a distant ice cream truck played a jaunty tune, mocking their moral quandary.
Finally, Bill groaned. “Fine. But if one of those little monsters cries on me, I’m bolting.”
“Deal.” Molly clapped him on the back. “And hey—if we pull this off, we’ll be rolling in toy resale cash. No more sad clowns.”
Bill nodded grimly, adjusting the wig. “You owe me big.”
“I always do.”
With that, they returned to their plans, the weight of their impending heist settling over them—along with the faint but undeniable smell of impending birthday cake disaster.
created_at
"2026-02-16T21:30:35.72382+00:00"
2/16/2026, 9:30:35 PM
model
"deepseek/deepseek-chat-v3-0324"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1053
moderation_latency
(null)
generation_time
14046
tokens_prompt
77
tokens_completion
665
native_tokens_prompt
67
native_tokens_completion
602
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.000544088
router
(null)
provider_responses
0
id
"a42a5c730d1a45c0ab69bd64694712af"
status
200
is_byok
false
latency
1053
endpoint_id
"5bb598f5-fd97-4f8b-91e4-ceab3743a03b"
provider_name
"AtlasCloud"
model_permaslug
"deepseek/deepseek-chat-v3-0324"
api_type
"completions"
id
"gen-1771277420-MnS59sbQEHsORqIAaXDd"
upstream_id
"a42a5c730d1a45c0ab69bd64694712af"
total_cost
0.000544088
cache_discount
1.44e-7
upstream_inference_cost
0
provider_name
"AtlasCloud"
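The metadata above can be turned into a rough throughput figure. The following is a minimal sketch that assumes `generation_time` is in milliseconds (which would be consistent with the reported 14.1s duration) and that dividing completion tokens by generation time is a fair proxy for decoding speed; the function name `tokens_per_second` is hypothetical and not part of any API shown here.

```python
def tokens_per_second(completion_tokens: int, generation_time_ms: int) -> float:
    """Return completion tokens generated per second.

    Assumes generation_time_ms is a duration in milliseconds.
    """
    if generation_time_ms <= 0:
        raise ValueError("generation_time_ms must be positive")
    return completion_tokens / (generation_time_ms / 1000.0)


# Values copied from the run metadata above.
native_rate = tokens_per_second(602, 14046)      # native_tokens_completion
normalized_rate = tokens_per_second(665, 14046)  # tokens_completion

print(f"native: {native_rate:.2f} tok/s, normalized: {normalized_rate:.2f} tok/s")
```

The two rates differ only because the record reports two completion token counts (`native_tokens_completion` and `tokens_completion`); which one is the better basis depends on how the provider tokenizes, which this record does not say.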
Evaluation details
Result | Evaluator | Details | Meta Data
0.0000% | Matches word count | n/a | neededClean: false, words: 390
1.4117% | Dialogue to Total Word Ratio | Ratio: 59.23%, Deviation: 9.23% | neededClean: false, wordsTotal: 390, wordsDialogue: 231
0.7059%
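For readers who want to see how a dialogue-to-total word ratio such as the one reported above might be computed, here is a minimal sketch. It is not the benchmark's evaluator, whose tokenization and quote handling are not documented in this record. It assumes that dialogue is any text between curly double quotes, that words are whitespace-separated, and that the target ratio is 50%. The function names are hypothetical.

```python
import re

# Text between an opening and a closing curly double quote counts as dialogue.
# Nested single quotes inside a double-quoted span are left untouched.
DIALOGUE_SPAN = re.compile(r"“([^”]*)”")


def dialogue_stats(text: str) -> tuple[int, int, float]:
    """Return (total_words, dialogue_words, ratio) for a prose passage."""
    total_words = len(text.split())
    dialogue_words = sum(len(span.split()) for span in DIALOGUE_SPAN.findall(text))
    ratio = dialogue_words / total_words if total_words else 0.0
    return total_words, dialogue_words, ratio


def deviation_from_target(ratio: float, target: float = 0.5) -> float:
    """Absolute distance between the observed ratio and the target ratio."""
    return abs(ratio - target)


sample = "“Fine,” Bill said. “Let's go now.”"
total, dialogue, ratio = dialogue_stats(sample)
print(total, dialogue, round(ratio, 4), round(deviation_from_target(ratio), 4))

# The reported figures are internally consistent: 231 / 390 is about 59.23%,
# which is about 9.23 percentage points away from the 50% target.
reported_ratio = 231 / 390
print(round(reported_ratio * 100, 2), round(deviation_from_target(reported_ratio) * 100, 2))
```

Running this sketch on the full response may not reproduce the reported counts of 390 and 231 exactly, since punctuation, dashes, and markdown emphasis could be tokenized differently by the actual evaluator.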