NC Bench
Run: 2026-04-25-01KQ2EXCMVK8SX16B3SSNCYKCM
Run Details
Model: DeepSeek V4 Flash (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 13.0s
Cost: $0.000252
Provider: DeepSeek
Evaluation summary
Score | Evaluator | Details Link
0.00% | Matches word count | (details)
0.00% | Dialogue to Total Word Ratio | (details)
0.0000% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly traced a finger along the blueprint spread across the greasy diner table. It was a crayon drawing on pink construction paper, complete with glitter stains and a lopsided castle. “The target is the Magical Unicorn Rainbow Grotto,” she said, tapping a smudge that might have been a moat.
Bill squinted at it. “That’s a bouncy castle.”
“It’s a perimeter defense. We go under it. Ropes and carabiners. Silent as ghosts.”
“Molly, the thing’s inflatable. We’d make it fart so loud the whole party hears us.”
She sighed and pulled out a second piece of paper, this one a proper floor plan of the Johnson residence backyard, stolen from the HVAC repair guy. “Fine. We use the diversion. I set off the confetti cannon at 2:47. While the kids are screaming with joy, you slip through the side gate, clear the gift table in ninety seconds, and dump everything into this.” She slid a pink My Little Pony backpack across the table.
Bill picked it up. It jingled. “It has a lunchbox inside.”
“For the smaller presents. The big ones we wrap in a blanket and drag through the hedge. We’re not animals.”
“We’re thieves. Stealing from six-year-olds.”
“Don’t judge the market, Bill. This is prime merchandise. Unicorn slippers. Glow-in-the-dark teething rings. A Sparkle Princess Magic Wand that shoots actual glitter. We sell that on eBay and we’re set for a month.”
Bill rubbed his face. “I feel greasy.”
“That’s the diner coffee.”
“No, I mean. Morally greasy. My niece is six.”
Molly fixed him with a stare. “Your niece’s birthday party was last month. You stole a cheese platter from it.”
“That was a charcuterie board, and it was for survival.”
“Same energy. Look, these kids get so many presents they won’t miss them. They’ll be drowning in Paw Patrol wrapping paper before the cake is even cut. We’re just… redistributing wealth. Late-stage capitalist reclamation.”
Bill stared at the crayon castle. “What if the birthday girl cries?”
“Then you cry louder. Claim you’re the party clown who’s emotionally overwhelmed by joy. The parents will give you a hug and a slice of cake, and you’ll have a clear path to the exit.”
He grunted. “And the escape route?”
“The neighbor’s dog has a dog door. It’s a big dog. We crawl through, into their garage, out the side, and onto the bus. We wear animal onesies. I’ll be a panda. You be a giraffe. No one remembers a panda and a giraffe.”
“Giraffes don’t steal presents.”
“They do today.” Molly folded the blueprint. “We go in at 2:45. The party games will be winding down. The parents will be zoning out on cheap wine. The piñata will be a broken, weeping carcass. It’s the perfect window.”
Bill chewed on a fry. “Okay. But if I have to eat any of that cake, I’m taking the Sparkle Princess Wand for myself.”
“Deal.” She held out her hand. He shook it, greasy and reluctant.
“One more thing,” he said. “What if the unicorn is real?”
Molly paused. “What?”
“The bouncy castle. Has a giant unicorn head. If it comes to life, we’re done.”
She stared at him for a long moment. “Then we ride it out of there. Giddyap.”
He almost smiled. Almost.
created_at
"2026-04-25T13:57:50.878Z"
4/25/2026, 1:57:50 PM
model
"deepseek/deepseek-v4-flash-20260423"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
540
moderation_latency
(null)
generation_time
12724
tokens_prompt
77
tokens_completion
939
native_tokens_prompt
67
native_tokens_completion
865
native_tokens_completion_images
(null)
native_tokens_reasoning
60
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.00025158
router
(null)
provider_responses
0
endpoint_id
"722e0746-2034-4d6e-a08f-e6271849ee9e"
id
"f7d62331-ed17-4379-b39b-0e2a64d4f077"
is_byok
false
latency
286
model_permaslug
"deepseek/deepseek-v4-flash-20260423"
provider_name
"DeepSeek"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1777125470-tzXylHvHwHy0bhAZNdur"
session_id
(null)
api_type
"completions"
id
"gen-1777125470-o3r2bwA1akDg524tcmAt"
upstream_id
"f7d62331-ed17-4379-b39b-0e2a64d4f077"
total_cost
0.00025158
cache_discount
(null)
upstream_inference_cost
0
provider_name
"DeepSeek"
response_cache_source_id
(null)
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false; words: 542
0.00% | Dialogue to Total Word Ratio | Ratio: 73.91%, Deviation: 43.91% | neededClean: false; wordsTotal: 548; wordsDialogue: 405
0.0000% | Overall
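The reported ratio and deviation can be reproduced from the wordsDialogue and wordsTotal values above. The following is a minimal sketch, not the benchmark's actual evaluator: the function names are hypothetical, the 30% target is taken from the scenario, and the rule that dialogue is any text between curly double quotes is an assumption made for illustration.

```python
import re


def ratio_and_deviation(words_dialogue: int, words_total: int,
                        target_ratio: float = 0.30) -> tuple[float, float]:
    """Return the dialogue share and its absolute distance from the target."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target_ratio)


def dialogue_stats(text: str, target_ratio: float = 0.30) -> dict:
    """Count whitespace-separated words inside curly double quotes.

    Assumption: dialogue is whatever sits between “ and ”. The real
    evaluator may tokenize or detect dialogue differently.
    """
    spoken = re.findall(r"“([^”]*)”", text)
    words_dialogue = sum(len(s.split()) for s in spoken)
    words_total = len(text.split())
    ratio = words_dialogue / words_total if words_total else 0.0
    return {
        "wordsTotal": words_total,
        "wordsDialogue": words_dialogue,
        "ratio": ratio,
        "deviation": abs(ratio - target_ratio),
    }


# Values reported in the evaluation details for this run.
ratio, deviation = ratio_and_deviation(405, 548)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 73.91%, Deviation: 43.91%"
```

With 405 dialogue words out of 548, the share is about 0.739, which is 43.91 percentage points above the 30% target. How the evaluator converts that deviation into the 0.00% score is not shown on this page, so the sketch stops at the deviation.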