NC Bench
Run: 2026-04-25-01KQ2H30YMYMKD615SMTAE5HZ5
Run Details
Model:
DeepSeek V4 Flash
Test:
Dialogue tags
Scenario:
Write 500 words with 30% dialogue
Duration:
11.8s
Cost:
$0.000216
Provider:
DeepSeek
Evaluation summary
Score | Evaluator
100.00% | Matches word count
0.00% | Dialogue to Total Word Ratio
50.0000% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly traced a finger along the creased blueprint spread across the greasy diner table, her nail catching on the X she’d drawn in ballpoint pen. “The bounce house is the key. It creates a thirty-second window of pure, shrieking chaos.” Bill rubbed his temples, a stack of deflated party hats wobbling beside his coffee. “I’m just saying, a man dressed as a pirate, brandishing a cutlass made of cardboard, isn’t exactly subtle. Won’t the parents notice Captain Hook stealing the goodie bags?” “He’s not Captain Hook. He’s Captain Seamore, the Pirate Clown. The nose is red, the peg leg is a squeaky balloon. And no, they won’t notice, because you’ll be crying.” Molly stabbed a finger at the schedule. “Right when the magician starts pulling the rabbit out of the hat. You get hysterical. A full-blown meltdown. Sobbing about your missing treasure chest. The parents love that. It’s empathy, Bill. Pure theater.” “I can’t cry on command, Mol. I’ll be thinking about the balloon animal giraffe I’m supposed to steal. It’s hard to weep into a handlebar mustache.” “Practice. Think of your car. The one you lost in that poker game.” She leaned in, her voice dropping. “Picture the alternator. The bald tires. The faded ‘I Brake for Yard Sales’ sticker. Well up, Bill. Well up.” Bill’s eyes grew distant. He sniffled once. Twice. A single, improbable tear traced a path down his cheek, disappearing into the fake beard he hadn’t washed since a kids’ wedding gig in June. “Okay. I’m sad.” “Beautiful. That’s the distraction. While you’re weeping over a paper cutout of a parrot, I’m under the cake table. The presents? They’re all gift-bagged in that crinkly cellophane. Silent. We can be in the van before the mother can start a passive-aggressive email chain about the missing ‘Minecraft Dream House Deluxe.’ ” “But what about the piñata? It’s filled with toffee bombs. Those are worth more than the dollhouse.” “We skip the piñata. Too loud. Too sticky. Stick to the script, Bill. The crying. 
The pirate. The silent bag of a hundred dollars’ worth of plastic trinkets. We’re artists, not amateurs.” Bill wiped his nose on his sleeve and straightened his pirate hat, which flopped over one eye. “One thing. How do I get the parents to hand over their phones for the group photo?” Molly smiled, slapping the blueprint shut. “You tell them the flash is broken and you’ve got a better one. A clown with a Polaroid. They line up, say cheese, and while they’re smiling, you drop the camera and scream that you see a bird. They all look up. I take the loot. We leave via the back gate, past the sad pony.” “What sad pony?” “The one we’re not stealing. It’s for ambiance. Now, finish your coffee. We’ve got a two o’clock party at the community center. Theme is ‘Under the Sea.’ Bring your best crab-claw hand motion. And for God’s sake, stop crying. You look like a drowned rat in a bandana.”
created_at
"2026-04-25T14:35:52.664Z"
4/25/2026, 2:35:52 PM
model
"deepseek/deepseek-v4-flash-20260423"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
965
moderation_latency
(null)
generation_time
11185
tokens_prompt
77
tokens_completion
794
native_tokens_prompt
67
native_tokens_completion
738
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.00021602
router
(null)
provider_responses
0
endpoint_id
"722e0746-2034-4d6e-a08f-e6271849ee9e"
id
"2cd6b46d-4218-4778-8375-113b97f6f626"
is_byok
false
latency
281
model_permaslug
"deepseek/deepseek-v4-flash-20260423"
provider_name
"DeepSeek"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1777127752-l3J5RxIXF6Bx9BifdAwf"
session_id
(null)
api_type
"completions"
id
"gen-1777127752-OtuVqunvjryo2VXz71i6"
upstream_id
"2cd6b46d-4218-4778-8375-113b97f6f626"
total_cost
0.00021602
cache_discount
(null)
upstream_inference_cost
0
provider_name
"DeepSeek"
response_cache_source_id
(null)
Evaluation details
Result | Evaluator | Details | Meta Data
100.00% | Matches word count | n/a | neededClean: false, words: 500
0.00% | Dialogue to Total Word Ratio | Ratio: 79.60%, Deviation: 49.60% | neededClean: false, wordsTotal: 505, wordsDialogue: 402
50.0000% | Overall
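The reported ratio and deviation can be reproduced from the counts above. The sketch below is a rough illustration only, not the benchmark's actual evaluator: the function name `dialogue_ratio`, the whitespace-based word split, and the rule that dialogue is whatever sits between curly double quotes are all assumptions. The 30% target comes from the scenario. Treating the overall score as an unweighted mean of the two evaluator scores is also an assumption, although it agrees with the 50% shown.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio).

    Hypothetical helper: counts whitespace-separated words and treats
    anything between curly double quotes as dialogue.
    """
    total_words = len(text.split())
    quoted_spans = re.findall(r"“([^”]*)”", text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


# Counts reported in the evaluation details.
words_dialogue = 402
words_total = 505
target = 0.30  # from the scenario: 30% dialogue

ratio = words_dialogue / words_total
deviation = abs(ratio - target)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# Assumption: the overall score is the unweighted mean of the evaluator scores.
overall = (100.0 + 0.0) / 2
print(f"Overall: {overall:.4f}%")
```

Run on the reported counts, this prints a ratio of 79.60% and a deviation of 49.60%, matching the details row. How the evaluator maps that deviation to a 0.00% score is not shown on the page, so the sketch does not attempt it.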