NC Bench
Run: 2026-04-25-01KQ2HWZTVER0CVHMZJC7EBHSE
Run Details
Model: DeepSeek V4 Pro (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 38.1s
Cost: $0.000978
Provider: DeepSeek
Evaluation summary
Score | Evaluator
0.00% | Matches word count
0.00% | Dialogue to Total Word Ratio
0.0004% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly chewed the end of her pen, frowning at the crayon-drawn blueprint on the napkin. “Alright, run me through it again. The bouncy castle is the primary distraction?” Bill, hunched behind the steering wheel of a van that still smelled faintly of old ice cream, nodded with the solemnity of a general. “The bouncy castle is the cornerstone of the whole operation, Moll. Absolute bedrock. I’ve zip-tied exactly eleven helium balloons to the primary intake valve. The moment that thing’s at full inflation, it’ll look like a sad, inflatable deity hoovering up all the children’s hopes and dreams.” “And the clown?” Molly asked, tapping the napkin. “You said you’d secure the clown.” “Handled. He’s not a real clown, he’s my cousin Kevin. I owe him forty quid. He thinks he’s been hired for balloon animals, but I’ve given him a specific script. Every poodle he twists will have a subtly menacing underbite. It sows unease. Parents get distracted trying to comfort little Timmy whose balloon dog looks like it’s judged his entire existence. That’s our window.” Molly squinted. “Our window to roll in under the guise of… party inspectors?” “Upgraded present-safety auditors,” Bill corrected, tapping his fake laminated badge. “Official government agency: Department of Gift Integrity. We’re there because of a supposed lead-paint scare in off-brand action figures. We’ll look stern, carry clipboards, and ‘quarantine’ every single present into our biohazard bags. Bags which will, coincidentally, be the exact same colour as the big hessian sacks we brought.” “It’s the narrative that feels flimsy,” Molly said, leaning back. “What’s stopping Aunt Carol from just swatting us with a cheese-plate and telling us to jog on?” Bill grinned, a wide, manic thing. “Cake. More specifically, the cake topper. I’ve swapped the fairy princess for a small, realistic-looking plastic rat. Complete with a tiny party hat. I’ll reveal it during the noise of the bouncy castle crisis. 
The collective shriek will provide audio cover. You pop up at the far side of the gift table, I’ll claim to be from the local health department responding to a rodent complaint, and we’ll both just start grabbing anything wrapped. Confusion is our co-conspirator.” Molly stared at him, a slow smile spreading. “You’ve thought of everything except the piñata.” “The piñata’s a brute-force asset. Filled with glitter and those little snapping-poppers that sound like gunfire. Kevin’s on strict instructions to point the birthday boy towards it the second he hears me shout the code word. Which will be ‘fondant’.” “Why fondant?” “Because no one likes it. It’s the perfect, universally understood cry of distress. A parent will hear that and think the cake community has finally turned feral. They’ll look away for a second, and that’s when we ghost. You with the princess castle, me with the Lego. We rendezvous at the ice-cream van two streets over, and we’re just two slightly festive entrepreneurs with a cargo of reclaimed birthday cheer.” Molly tapped the napkin. “And the magician booked for two o’clock?” Bill’s face fell. “There’s a magician?” “Yeah. A real one. Doves, rings, existential dread for the under-fives.” Bill was silent for a moment, then shrugged. “We work the rabbit. A real magician is just another professional doing a job. If he pulls a rabbit out of a hat, I’ll claim it’s contaminated and needs to be detained under our new pet-therapy cease-and-desist order. We adapt. We’re ghosts, Moll. Present ghosts.” “You have a cease-and-desist order for rabbits?” “I have a very versatile laminator and a deep suspicion of cute things.” He started the van. “Now, let’s go ruin a six-year-old’s capital accumulation.”
created_at: "2026-04-25T14:50:08.577Z" (4/25/2026, 2:50:08 PM)
model: "deepseek/deepseek-v4-pro-20260423"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 892
moderation_latency: (null)
generation_time: 32986
tokens_prompt: 77
tokens_completion: 1241
native_tokens_prompt: 67
native_tokens_completion: 1091
native_tokens_completion_images: (null)
native_tokens_reasoning: 194
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.000978315
router: (null)
provider_responses:
  0:
    endpoint_id: "d084170b-dd99-4e53-9ae1-b63881f3cfbf"
    id: "2694f4ee-9237-44b3-890a-dc9fdeab0197"
    is_byok: false
    latency: 273
    model_permaslug: "deepseek/deepseek-v4-pro-20260423"
    provider_name: "DeepSeek"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777128608-wbLWDsz27S9JsNscYNp5"
session_id: (null)
api_type: "completions"
id: "gen-1777128608-bJhitfJ6PaxpvNd5ET9L"
upstream_id: "2694f4ee-9237-44b3-890a-dc9fdeab0197"
total_cost: 0.000978315
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "DeepSeek"
response_cache_source_id: (null)
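The metadata above reports a generation time and a native completion token count, from which a rough throughput figure can be derived. The sketch below is illustrative only: it assumes generation_time is expressed in milliseconds (the page does not state the unit), and the function name tokens_per_second is hypothetical.

```python
def tokens_per_second(tokens: int, generation_time_ms: int) -> float:
    """Rough throughput, ASSUMING generation_time is in milliseconds."""
    if generation_time_ms <= 0:
        raise ValueError("generation_time_ms must be positive")
    return tokens / (generation_time_ms / 1000.0)


# Values copied from the run metadata.
native_tokens_completion = 1091
generation_time = 32986  # unit assumed to be ms

completion_tps = tokens_per_second(native_tokens_completion, generation_time)
print(f"{completion_tps:.1f} completion tokens/s")
```

If the unit assumption is wrong, the figure scales accordingly, so it should be read as an estimate and not as a reported result of the benchmark.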
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 599
0.00% | Dialogue to Total Word Ratio | Ratio: 85.25%, Deviation: 15.25% | neededClean: false, wordsTotal: 617, wordsDialogue: 526
0.0004% | Overall
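The reported ratio and deviation can be reproduced from the word counts in the evaluation meta data. The following is a minimal sketch, assuming the deviation is the absolute difference between the measured dialogue ratio and the 70% target from the scenario. The function names are hypothetical, and how the evaluator maps a deviation onto the final score is not shown on the page, so it is not modeled here.

```python
def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Share of dialogue words in the text, as a percentage."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return 100.0 * words_dialogue / words_total


def deviation_from_target(ratio_pct: float, target_pct: float) -> float:
    """Absolute gap between measured and target ratio (assumed definition)."""
    return abs(ratio_pct - target_pct)


# Counts copied from the evaluation meta data; 70% is the scenario target.
ratio = dialogue_ratio(words_dialogue=526, words_total=617)
deviation = deviation_from_target(ratio, target_pct=70.0)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")

# The word-count evaluator reports 599 words against a 500-word request.
overshoot_pct = 100.0 * (599 - 500) / 500
print(f"Word count overshoot: {overshoot_pct:.1f}%")
```

Note that the two evaluators report different totals (599 and 617 words), which suggests they tokenize the text differently. The sketch takes each count as given and does not attempt to recount the response.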