NC Bench
Run: 2026-02-16-01KHM5HQ0363TV4WH5VPC26B42
Run Details
Model:
DeepSeek V3 (2025-03-24)
Test:
Dialogue tags
Scenario:
Write 500 words with 50% dialogue
Duration:
18.7s
Cost:
$0.001164
Provider:
Crusoe
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The moon hung low over the quiet suburban neighborhood, casting long shadows across the manicured lawns. Molly adjusted her black ski mask—decorated with tiny unicorns because, as she put it, “just because we’re criminals doesn’t mean we can’t have *flair*”—and turned to Bill, who was busy untangling himself from a string of party streamers he’d somehow gotten wrapped up in. “Bill,” Molly whispered, “we’re professionals. Act like it.” “Professionals?” Bill snorted, finally ripping the streamers free. “We once stole a golf cart and crashed it into a sand trap. That doesn’t scream ‘elite heist team’ to me.” “Semantics,” Molly said, waving a gloved hand. “The plan is simple. We sneak in, swap out the presents for identical-looking boxes full of socks and educational workbooks—disappointing *and* practical—then waltz out before anyone notices.” Bill squinted at the brightly lit house, where squeals of laughter and the faint thump of kid-friendly pop music spilled from the windows. “You think a house full of sugared-up six-year-olds isn’t gonna notice two adults in unicorn masks stealing their loot?” “They’ll be too busy with the *real* distraction,” Molly said, grinning beneath her mask. “The clown?” “The clown.” Bill groaned. “We agreed *no clowns* after the incident at the—” “Clown’s already in position,” she interrupted, tapping her earpiece. “Right, Jerry?” A muffled voice crackled through the earpiece. *”Fear me, for I am the bringer of balloon animals and existential dread!”* Bill pinched the bridge of his nose. “You hired *Jerry* again? The guy who gets stage fright and starts reciting Shakespeare?” “He’s got commitment issues, not skill issues,” Molly said defensively. Then, after a pause: “And if all else fails, we just grab the presents and *run*.” Bill sighed. “Fine. But if we end up in a high-speed chase with a minivan full of angry parents, I’m blaming you.” They crept toward the house, slipping in through the side gate. 
The backyard was a riot of confetti, a bouncy castle quivering ominously, and a waist-high mob of children swarming the poor clown, who was—true to form—dramatically monologuing. *”To bounce or not to bounce, that is the question—”* Molly gave Bill a thumbs-up. “Perfect. They won’t even—” “STRANGER DANGER!” a tiny voice shrieked. All heads turned. A kid in a dinosaur costume pointed directly at them. The entire party froze. Bill swallowed hard. “New plan?” “Yep. RUN.” They bolted, arms full of half-grabbed presents, weaving past inflatable slides and a very concerned piñata. Behind them, a chorus of grown-ups shouted, Jerry the Clown began soliloquizing *Macbeth*, and, inevitably, the first juice box was thrown. As they vaulted the fence, Bill wheezed, “Remind me—why do we keep doing this?” Molly grinned, clutching a glittery gift bag marked *For Emily, Age 6*. “Because, Bill—where else can you get this kind of adrenaline *and* a goodie bag?”
created_at
"2026-02-16T21:26:39.560944+00:00"
2/16/2026, 9:26:39 PM
model
"deepseek/deepseek-chat-v3-0324"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
422
moderation_latency
(null)
generation_time
18621
tokens_prompt
77
tokens_completion
841
native_tokens_prompt
67
native_tokens_completion
754
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.0011645
router
(null)
provider_responses
0
id
"cmpl-73605f94a628a9432dae2ff88b3d8305"
status
200
is_byok
false
latency
308
endpoint_id
"eb0625b2-17a1-492d-b3cc-e35091f1431d"
provider_name
"Crusoe"
model_permaslug
"deepseek/deepseek-chat-v3-0324"
api_type
"completions"
id
"gen-1771277179-j0ifB8U20kueshfQ9daw"
upstream_id
"cmpl-73605f94a628a9432dae2ff88b3d8305"
total_cost
0.0011645
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Crusoe"
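The timing and token fields above can be combined into a rough throughput figure. This is a minimal sketch, not part of the benchmark: it assumes `generation_time` is in milliseconds (the page does not state the unit, though 18621 is close to the 18.7s duration shown in Run Details), and the helper name `tokens_per_second` is hypothetical.

```python
# Values copied from the Meta fields above.
generation_time_ms = 18621       # ASSUMPTION: unit is milliseconds
native_tokens_completion = 754   # completion tokens as counted by the provider


def tokens_per_second(tokens: int, elapsed_ms: int) -> float:
    """Average completion throughput over the whole generation."""
    return tokens / (elapsed_ms / 1000.0)


throughput = tokens_per_second(native_tokens_completion, generation_time_ms)
print(f"{throughput:.1f} tokens/s")
```

Under that assumption the run averages roughly 40 native completion tokens per second. The figure excludes the separately reported `latency` field, whose exact meaning the page does not define.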
Evaluation details
Result | Evaluator | Details | Meta Data
43.3783% | Matches word count | n/a | neededClean: false; words: 483
63.7232% | Dialogue to Total Word Ratio | Ratio: 47.00%, Deviation: 3.00% | neededClean: false; wordsTotal: 483; wordsDialogue: 227
53.5508%
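The reported dialogue ratio and deviation can be reproduced from the `wordsDialogue` and `wordsTotal` values above. The sketch below is illustrative only. It assumes the deviation is the absolute distance from the scenario's 50% target and that the final percentage is the plain mean of the two evaluator results; both assumptions fit the numbers shown but are not confirmed by the page. The `count_quoted_words` helper is hypothetical: it shows one simple way to count dialogue words (text between curly double quotes), not the evaluator's actual rule, and it makes no attempt to reproduce the mapping from deviation to the 63.7232% score.

```python
import re


def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Percentage of all words that fall inside dialogue."""
    return 100.0 * words_dialogue / words_total


def count_quoted_words(text: str) -> int:
    """Count whitespace-separated words inside curly double quotes.

    ASSUMPTION: a simplified rule for illustration; the benchmark's own
    counting method is not shown in the source.
    """
    spans = re.findall(r"“([^”]*)”", text)
    return sum(len(span.split()) for span in spans)


# Values reported in the Meta Data column.
ratio = dialogue_ratio(227, 483)
deviation = abs(50.0 - ratio)  # ASSUMPTION: target is the scenario's 50%
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
# prints "Ratio: 47.00%, Deviation: 3.00%"

# ASSUMPTION: the final figure is the mean of the two evaluator results.
overall = (43.3783 + 63.7232) / 2
assert abs(overall - 53.5508) < 1e-4  # agrees to within rounding

# A short excerpt from the response: 1 + 5 quoted words.
sample = "“Bill,” Molly whispered, “we’re professionals. Act like it.”"
print(count_quoted_words(sample))  # prints 6
```

The two evaluator scores average to 53.55075, which matches the reported 53.5508% to within rounding.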