NC Bench
Run: 2026-04-25-01KQ2FCW8A546R7JX812QSKP0K
Run Details
Model: DeepSeek V4 Flash (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 13.0s
Cost: $0.000252
Provider: DeepSeek
Evaluation summary
Score | Evaluator
14.30% | Matches word count
0.00% | Dialogue to Total Word Ratio
7.1507% | Overall
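The Overall figure is consistent with an unweighted mean of the two evaluator scores. That is an inference from the numbers shown here, not a documented formula, and `overall_score` below is a hypothetical helper used only for illustration.

```python
from statistics import mean


def overall_score(evaluator_scores):
    """Unweighted mean of per-evaluator scores, in percent.

    Assumption: the benchmark averages evaluators equally. The page does
    not state how the Overall value is aggregated.
    """
    return mean(evaluator_scores)


# Scores as displayed in the summary (already rounded to two decimals).
scores = [14.30, 0.00]
print(f"{overall_score(scores):.2f}%")
```

The small gap between this result and the displayed 7.1507% would be explained if the word-count score is stored with more precision than the two decimals shown.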
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
“I’m telling you, Bill, it’s a goldmine,” Molly said, jabbing a finger at the crumpled invitation on the table. “Eight-year-olds. Unsupervised. A bounce house. And a mountain of presents.” Bill squinted at the glittery unicorn on the card. “We’re stealing from a kid’s birthday party?” “We’re stealing the presents from the kid’s birthday party. There’s a difference. The kid gets the trauma of disappointment, which builds character. We get the haul.” “I don’t know, Moll. The last time you said ‘unsupervised’ we ended up in a petting zoo with a llama that coughed on my wig.” “That was a tactical error. This is tactical perfection. Look at the timeline. Parents drop the kids off at 2 PM. Party ends at 5. Between 3 and 4, the hired clown does his balloon-animal thing—distraction maximum. We slip in through the back gate, grab the gift table, and exit via the neighbor’s hedge.” Bill rubbed his chin. “What about the clown? Clowns see everything. They’re like… jester-eyed security cameras.” “His name is Bubbles the Buffoon. I’ve done my research. He’s seventy-three years old and has a hip replacement. If we move fast, his only reaction will be a slow, pained pivot.” “And the kids?” “What about them? They’re eight. They’ll be eating cake and punching each other over a glow-in-the-dark yo-yo. We’ll be ghosts.” Bill picked up the invitation, turning it over. “It says ‘No adults, just fun!’ What kind of parent writes that? That’s an invitation to get robbed.” “Exactly, it’s a cry for help. We’re answering it. With larceny.” “What’s our cover story? We can’t just stroll in.” Molly grinned. “We’re the replacement jugglers. I’ve got a van full of beanbags and a bad French accent.” “You don’t speak French.” “Doesn’t matter. They’re children. I’ll say ‘bonjour’ and throw things at them. They’ll love it.” Bill sighed, defeated. “Fine. But if I have to wear a propeller hat again, I’m out.” “No propeller hat. I’m thinking matching tracksuits. 
We’ll look like very disappointed gym teachers.” “And the presents? What do we do with a pile of Lego sets and rainbow slime?” “We sell them on the internet. There’s a niche market for gently used birthday loot. Call it ‘pre-loved party swag.’ Emotional markup.” Bill nodded slowly. “You know, there’s a part of me that thinks this is actually genius.” “That part is your brain finally working. The rest of you is still eating those gas-station hot dogs.” “They were on sale!” “Anyway, we strike Saturday. Three PM sharp. You bring the duffel bags, I’ll bring the cover story.” “And what if the kid’s grandma shows up? The one who knits judgmentally?” Molly threw her hands up. “Then we offer her a cut. Grandmas love a good hustle.” “Fine. But I’m not hugging any llamas.” “There won’t be llamas, Bill. There’s cake. And presents. And a seventy-three-year-old clown who’s about to have a very confusing afternoon.” Bill stared at the unicorn invitation one last time. “I can’t believe we’re stealing from children.” “Not stealing,” Molly corrected, snatching the card. “Liberating. Now go buy a red nose. We’ve got a party to crash.”
created_at: "2026-04-25T14:06:18.384Z" (4/25/2026, 2:06:18 PM)
model: "deepseek/deepseek-v4-flash-20260423"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 747
moderation_latency: (null)
generation_time: 12866
tokens_prompt: 77
tokens_completion: 965
native_tokens_prompt: 67
native_tokens_completion: 865
native_tokens_completion_images: (null)
native_tokens_reasoning: 50
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.00025158
router: (null)
provider_responses:
  0:
    endpoint_id: "722e0746-2034-4d6e-a08f-e6271849ee9e"
    id: "948782b9-ea4e-41c0-b6f9-7092ebf2bcd8"
    is_byok: false
    latency: 351
    model_permaslug: "deepseek/deepseek-v4-flash-20260423"
    provider_name: "DeepSeek"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777125978-NWJ8tsKYzYXur482Nguy"
session_id: (null)
api_type: "completions"
id: "gen-1777125978-IXSuQYFNk806R2bqEmnB"
upstream_id: "948782b9-ea4e-41c0-b6f9-7092ebf2bcd8"
total_cost: 0.00025158
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "DeepSeek"
response_cache_source_id: (null)
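The metadata above is enough to estimate generation throughput. The sketch below assumes that `generation_time` is reported in milliseconds, which the page does not state, and `tokens_per_second` is a hypothetical helper rather than part of any API.

```python
def tokens_per_second(completion_tokens: int, generation_time_ms: int) -> float:
    """Completion tokens divided by generation time in seconds.

    Assumption: generation_time is in milliseconds.
    """
    if generation_time_ms <= 0:
        raise ValueError("generation_time_ms must be positive")
    return completion_tokens / (generation_time_ms / 1000)


# native_tokens_completion and generation_time from the metadata above.
rate = tokens_per_second(865, 12866)
print(f"{rate:.1f} tokens/s")
```

Under that assumption the run comes out at roughly 67 native completion tokens per second.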
Evaluation details
Result | Evaluator | Details | Meta Data
14.30% | Matches word count | n/a | neededClean: false, words: 521
0.00% | Dialogue to Total Word Ratio | Ratio: 88.72%, Deviation: 18.72% | neededClean: false, wordsTotal: 532, wordsDialogue: 472
7.1507% | Overall
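The reported ratio and deviation can be reproduced from the evaluator's own counts: 472 dialogue words out of 532 total, measured against the 70% target in the prompt. The following is a minimal sketch, not the benchmark's actual evaluator. It assumes that dialogue means text inside double quotation marks and that words are whitespace-separated tokens, and all function names are hypothetical. How the evaluator turns an 18.72% deviation into a 0.00% score is not shown on the page, so that step is left out.

```python
import re

TARGET_RATIO = 0.70  # "70% words of dialogue" from the prompt


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (assumed word definition)."""
    return len(text.split())


def dialogue_word_count(text: str) -> int:
    """Count words inside curly or straight double quotes (assumed rule)."""
    spans = re.findall(r'[“"]([^“”"]*)[”"]', text)
    return sum(count_words(span) for span in spans)


def ratio_and_deviation(words_dialogue: int, words_total: int,
                        target: float = TARGET_RATIO):
    """Return (dialogue ratio, absolute deviation from the target)."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


# Counts taken from the evaluator's Meta Data column.
ratio, deviation = ratio_and_deviation(472, 532)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
```

Applied to raw text, `dialogue_word_count(scene)` and `count_words(scene)` would supply the two counts, though the result may differ from the evaluator's if it tokenizes or cleans the text differently.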