NC Bench
Run: 2026-04-25-01KQ2DW4YT3EW9H2C5JZK3N7BE
Run Details
Model: DeepSeek V4 Pro
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 31.3s
Cost: $0.000903
Provider: DeepSeek
Evaluation summary
| Score | Evaluator | Details Link |
| --- | --- | --- |
| 0.00% | Matches word count | (details) |
| 0.00% | Dialogue to Total Word Ratio | (details) |
| 0.0000% | Overall | |
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
“Bill, for the last time, you can’t just walk in dressed as a pirate and say you’re there to fix the skirting board.” “Why not? Children respect pirates and tradesmen.” “Children respect lactose-intolerant clowns before they respect a carpenter with an eye patch. You’ll be mobbed. They’ll want balloon swords. They’ll be disappointed when you pull out an actual chisel.” “Fine, what’s your grand plan? You want to be the magician? Pull a plush unicorn out of a hat and limp out with a hernia?” “Don’t be crude. I’m going as the cake designer. Professional, frazzled, carrying a large empty box for ‘structural repairs’ to the three-tier fondant nightmare. People see a person in distress holding a cardboard box, they open doors. And the box, Bill, is how the presents leave.” “A cake box? For forty wrapped gifts? What kind of geometry are you using?” “Nesting. It’s a series of progressively smaller boxes. Russian doll logic. The big one goes in first, then the medium, then I fill the gaps with loot. By the time I walk out, I’m just a sad woman with a collapsed cake infrastructure.” “And what am I? Your sous-chef? The idiot who scuffed the buttercream?” “You, my shaky-handed friend, are the entertainment. But not a pirate. Something low-impact. Something that encourages the adults to drink faster and the children to form a silent, orderly queue.” “Silent? At a five-year-old’s party? You’re planning a heist in a room full of tiny, unpredictable sociopaths who haven’t developed object permanence. You could grab a gift, put a napkin over it, and they’d forget it existed. We’re overcomplicating this.” “Except for the birthday boy’s mother, Brenda from accounting, who once did a weekend course in situational awareness and will absolutely remember a large man weeping into a napkin-covered Tonka truck.” “Fair point. She has the eyes of a hawk and the hospitality of a prison warden. So what’s my character?” “A children’s audio storyteller. Very on-trend. 
You sit on a small stool, you put on large headphones, hold a plush microphone, and you whisper ‘The Tale of Benjamin Bunny’ into an amplifier that isn’t plugged in.” “A silent storyteller. That’s your big idea. I mime books.” “You mime books while I, the grieving gateau specialist, slowly absorb the present table into a series of cardboard boxes. The beauty of the silent story is that the kids will sit there, cross-legged, intensely waiting for a sound that never comes. Their confusion buys us time. Adults will think it’s avant-garde Scandinavian pedagogy. They’ll zone out. Brenda will pour her first Sauvignon Blanc in weeks.” “And what if a child asks why the storyteller is sweating glitter?” “You blame the cake designer. Say I paid you in experimental sugar dust. We’re interconnected now, Bill. A fragile ecosystem of lies.” “So you scurry around with cardboard, I mute-Bunny the toddlers, and we just… stroll out with the goods?” “We don’t stroll out. We have the exit strategy. That’s why the cake box has a false bottom filled with actual, undeniably beautiful, professionally printed thank-you notes. If Brenda intercepts me and asks why my box looks suspiciously like a contraband sarcophagus, I fling it open and weep about a structural collapse. A sea of thank-you cards explodes everywhere. Her social gratitude reflex kicks in. While she’s fawning over the calligraphy, you follow the birthday child outside for the piñata shift.” “I’m not beating a paper mâché donkey with a stick. That’s the first sign of a man who’s lost control of his exit.” “No, you’re the safety supervisor. You don’t swing. You stand there holding a first-aid kit, softly narrating the brutal stick work in a calming ASMR tone. Then, you simply walk to our van and start the engine.” “While you’re inside, wrestling a forty-piece princess castle into a collapsing pastry illusion.” “Exactly. 
The key phrase is ‘tragic fondant accident.’ If things get spicy, you yell that phrase, and we abort to the bouncy castle.” “Tragic fondant accident. And what happens to the silent story microphone?” “I’ll wire it to actually play something. A single high-frequency tone. The dogs go nuts. Mass chaos. We’re ghosts, Bill. Well-dressed, emotionally fragile ghosts.”
created_at: "2026-04-25T13:39:41.661Z" (4/25/2026, 1:39:41 PM)
model: "deepseek/deepseek-v4-pro-20260423"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1126
moderation_latency: (null)
generation_time: 31206
tokens_prompt: 77
tokens_completion: 1169
native_tokens_prompt: 67
native_tokens_completion: 1004
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.000902625
router: (null)
provider_responses:
  0:
    endpoint_id: "d084170b-dd99-4e53-9ae1-b63881f3cfbf"
    id: "357ed000-2ab7-4b06-be03-22811700d4a3"
    is_byok: false
    latency: 276
    model_permaslug: "deepseek/deepseek-v4-pro-20260423"
    provider_name: "DeepSeek"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777124381-FlKutoRbOT3anPNFg5VL"
session_id: (null)
api_type: "completions"
id: "gen-1777124381-P8xEJJ4NxTFvLhWc3qwV"
upstream_id: "357ed000-2ab7-4b06-be03-22811700d4a3"
total_cost: 0.000902625
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "DeepSeek"
response_cache_source_id: (null)
Evaluation details
| Result | Evaluator | Details | Meta Data |
| --- | --- | --- | --- |
| 0.00% | Matches word count | n/a | neededClean: false, words: 697 |
| 0.00% | Dialogue to Total Word Ratio | Ratio: 100.00%, Deviation: 30.00% | neededClean: false, wordsTotal: 715, wordsDialogue: 715 |
| 0.0000% | | | |
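The dialogue figures above follow from simple arithmetic: 715 dialogue words out of 715 total words gives a ratio of 100%, which is 30 percentage points away from the 70% target in the scenario. The sketch below reproduces that arithmetic and shows one plausible way to count dialogue words. It is an illustration only: the function names `dialogue_stats` and `ratio_from_counts` are hypothetical, the assumption that dialogue means text between curly double quotes is ours, and the benchmark's actual evaluator (including how it tokenizes words and how it maps the deviation to a score) is not shown in this report.

```python
import re


def ratio_from_counts(words_dialogue: int, words_total: int,
                      target_ratio: float = 0.70) -> tuple[float, float]:
    """Return (ratio, absolute deviation from target), rounded to 4 places."""
    ratio = words_dialogue / words_total if words_total else 0.0
    return round(ratio, 4), round(abs(ratio - target_ratio), 4)


def dialogue_stats(text: str, target_ratio: float = 0.70) -> dict:
    """Count whitespace-separated words inside curly double quotes vs. all words.

    Assumption: dialogue is delimited by “ and ”. The real evaluator may
    tokenize differently, so counts need not match the report exactly.
    """
    words_total = len(text.split())
    # Each match is the text between an opening “ and the next closing ”.
    quoted_spans = re.findall(r"“([^”]*)”", text)
    words_dialogue = sum(len(span.split()) for span in quoted_spans)
    ratio, deviation = ratio_from_counts(words_dialogue, words_total, target_ratio)
    return {
        "wordsTotal": words_total,
        "wordsDialogue": words_dialogue,
        "ratio": ratio,
        "deviation": deviation,
    }


# Counts taken from the evaluation details of this run.
print(ratio_from_counts(715, 715))  # (1.0, 0.3)
```

Because the response contains no narration or dialogue tags at all, every word falls inside quotation marks, which is consistent with the reported 100.00% ratio.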