NC Bench
Run: 2026-04-25-01KQ2E09QNAFH992K668QY4A4B
Run Details
Model: DeepSeek V4 Flash (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 13.1s
Cost: $0.000260
Provider: DeepSeek
Evaluation summary
Score | Evaluator | Details Link
0.00% | Matches word count | (details)
0.00% | Dialogue to Total Word Ratio | (details)
0.0000% | Overall |
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Bill thumbed the edge of the party invite like it was a winning lottery ticket. “You sure about this, Molly? Kids’ parties are… loud.” “Loud means distracted.” Molly adjusted the clown wig on her head, the red curls springing back like angry snakes. “We’re not robbing a bank, Bill. We’re robbing a bouncy castle.” “And the presents. You said presents.” “Obviously the presents. But we start with the castle. Get inside, blend in. I do balloon animals. You do face paint. We smile, we clap, we wait for the sugar crash.” Bill chewed his lip. “I don’t do face paint. Last time I tried a tiger, the kid looked like a chewed-up orange.” “Then do a dog. A simple dog. A nose, a dot, three whiskers. You can’t mess up a dog.” “I can. I drew a snail once. The mother cried.” Molly pinched the bridge of her nose, smearing the faux freckles she’d painted on. “Fine. You do the balloon animals. I’ll do faces. And you’re not making a sausage dog. You’re making a sword. Every kid wants a sword. Hand them a sword, they stop asking questions.” “What if they ask for a unicorn?” “Then you say the unicorn is hiding in the cake. They’ll believe you. They’re seven.” Bill shuffled his feet. The floor of their rented storage unit was cold concrete, covered in deflated pool toys and a suspicious stain. “And after the sugar crash? How do we get the presents out?” “We won’t be getting them out. The parents will.” Molly grinned, a little too wide. “I’ve got a script. During the piñata smash, I collapse. Fake seizure, very dramatic. Everyone panics. The mother runs for the phone. The father tries to catch the kid who just hit a unicorn with a stick. You, meanwhile, swap the gift bags.” “Swap them with what?” “With identical gift bags full of rocks and tissue paper. We grab the real ones from the trash bin out back after the party ends.” Bill squinted. “That’s… not terrible.” “Thank you.” “But what if the mother is a nurse?” Molly’s smile faltered. “A nurse?” “My sister’s a nurse. 
She’d see a fake seizure from across the room. She’d jab you with an EpiPen just to make sure.” Molly considered this. “Fine. You collapse. You do the dramatic fall. I’ll swap the bags.” “I don’t know how to fake a seizure.” “Just flop on the ground and shake. Kids love that. They’ll think it’s a dance.” Bill groaned. “And the bouncy castle? We’re really stealing a bouncy castle?” “Not the whole castle. Just the blower. The fan. Without it, the castle deflates. Everyone evacuates. Then I take the fan, sell it on eBay. Fifty bucks, easy.” “For a blower we risked a children’s party.” “It’s about the journey, Bill. The principle.” “The principle of stealing a blower from a unicorn-themed soiree?” Molly placed a hand on his shoulder. “We are professionals in a world that does not respect our craft. But a children’s party? It’s the perfect crime. Low security. High emotional payoff. And there’s cake.” Bill sighed. “I’m not eating that cake. It’s probably gluten-free.” “Then you’re missing out. Now help me practice my balloon rabbit.” She handed him a pump. He took it. “One condition. I get to keep the sword.” “Deal.”
created_at: "2026-04-25T13:41:57.625Z" (4/25/2026, 1:41:57 PM)
model: "deepseek/deepseek-v4-flash-20260423"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 734
moderation_latency: (null)
generation_time: 12953
tokens_prompt: 77
tokens_completion: 984
native_tokens_prompt: 67
native_tokens_completion: 894
native_tokens_completion_images: (null)
native_tokens_reasoning: 72
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0002597
router: (null)
provider_responses[0]:
  endpoint_id: "722e0746-2034-4d6e-a08f-e6271849ee9e"
  id: "c81e87fe-1a71-461b-891b-e2a282c12174"
  is_byok: false
  latency: 445
  model_permaslug: "deepseek/deepseek-v4-flash-20260423"
  provider_name: "DeepSeek"
  status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777124517-bm1Clxxq6777RhrTLqzu"
session_id: (null)
api_type: "completions"
id: "gen-1777124517-jNUJ85z5ggT4uZQNJpul"
upstream_id: "c81e87fe-1a71-461b-891b-e2a282c12174"
total_cost: 0.0002597
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "DeepSeek"
response_cache_source_id: (null)
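The metadata above can be turned into a rough throughput figure. The following is a minimal sketch, not part of the benchmark itself: it assumes generation_time is in milliseconds (which would be consistent with the 13.1s duration shown in the run details) and that native_tokens_completion counts the tokens produced in that time. The function names are hypothetical.

```python
# Values copied from the metadata listing above.
meta = {
    "generation_time": 12953,         # assumed to be milliseconds
    "native_tokens_completion": 894,
    "usage": 0.0002597,               # cost in USD as reported
}


def tokens_per_second(m: dict) -> float:
    """Completion tokens divided by generation time in seconds (assumed ms input)."""
    return m["native_tokens_completion"] / (m["generation_time"] / 1000)


def format_cost(usd: float) -> str:
    """Round a USD cost to six decimal places, as the run details display it."""
    return f"${usd:.6f}"


print(f"{tokens_per_second(meta):.1f} tokens/s")  # 69.0 tokens/s
print(format_cost(meta["usage"]))                 # $0.000260
```

Under these assumptions the rounded cost matches the $0.000260 shown in the run details.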
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false; words: 552
0.00% | Dialogue to Total Word Ratio | Ratio: 81.08%, Deviation: 31.08% | neededClean: false; wordsTotal: 555; wordsDialogue: 450
0.0000%
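The reported ratio and deviation follow directly from the word counts: 450 dialogue words out of 555 total, measured against the 50% target in the scenario. The sketch below reproduces that arithmetic and adds a hypothetical counter that treats text inside curly double quotes as dialogue and splits words on whitespace. That counting rule is an assumption, not the benchmark's evaluator; the two evaluators above report different totals (552 and 555), so their tokenization evidently differs. How the deviation maps to the 0.00% score is not shown on this page and is not modelled here.

```python
import re


def ratio_and_deviation(words_dialogue: int, words_total: int, target: float = 0.5):
    """Return (dialogue ratio, absolute deviation from the target ratio)."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


def count_dialogue_words(text: str):
    """Hypothetical counter: words inside curly double quotes count as dialogue."""
    total = len(text.split())
    quoted = re.findall(r"“([^”]*)”", text)
    dialogue = sum(len(q.split()) for q in quoted)
    return dialogue, total


# Figures taken from the evaluation details above.
ratio, deviation = ratio_and_deviation(450, 555)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# Ratio: 81.08%, Deviation: 31.08%

# Small illustration of the assumed counting rule.
sample = "Bill sighed. “I’m not eating that cake.” She nodded."
print(count_dialogue_words(sample))  # (5, 9)
```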