NC Bench
Run: 2026-04-25-01KQ2H73M2M2HV5SMWK8NWSPGY
Run Details
Model: DeepSeek V4 Flash (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 10m 32s
Cost: $0.000250
Provider: DeepInfra
Evaluation summary
Score: 0.00%, Evaluator: Matches word count
Score: 0.00%, Evaluator: Dialogue to Total Word Ratio
Score: 0.0000%, Overall
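The two evaluators score total word count and the share of words that are dialogue. A minimal sketch of how such counts might be taken from a response follows. It assumes that words are whitespace-separated tokens and that dialogue is whatever sits between curly double quotes, as in the response below. The function names are hypothetical and the actual NC Bench evaluator may tokenize differently.

```python
import re

# Assumption: dialogue is delimited by curly double quotes (“ ... ”).
# Any quoted text counts under this rule, including a quoted note or sign.
DIALOGUE_RE = re.compile(r"“([^”]*)”")


def count_words(text: str) -> int:
    """Count whitespace-separated tokens."""
    return len(text.split())


def count_dialogue_words(text: str) -> int:
    """Count tokens that fall inside curly double quotes."""
    return sum(count_words(span) for span in DIALOGUE_RE.findall(text))


sample = "Molly nodded. “You’re a genius, Bill.” Bill folded the map. “I know.”"
total = count_words(sample)
dialogue = count_dialogue_words(sample)
print(total, dialogue, f"{dialogue / total:.0%}")  # 12 6 50%
```

A quote-based rule like this is simple, but it cannot tell spoken lines from other quoted material, so counts from it should be read as approximate.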
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly adjusted the strap of her bag, a manila folder stuffed with party favors instead of blueprints. “Okay, Bill, run it by me again. Slowly. Like I’m a five-year-old.” Bill squinted at the hand-drawn map spread across the stained diner table. “We come in through the kitchen. The caterer’s a guy named Dave, and he always props the back door with a brick. We swap the brick for a different brick—one that’s exactly the same color but made of foam. Door won’t close, we have a silent entry.” “And then what? We waltz out with the goodie bags?” Molly picked a fleck of dried syrup off the table. “No, no. We hide behind the bouncy castle. There’s a seam in the fabric near the blower motor. I’ve already cut it from the inside. We stash the presents there, then we slip out during the piñata. Everyone’s distracted by a blindfolded kid swinging a stick at a paper donkey.” Molly tapped her chin. “What about the magician? That guy’s got doves. Doves are unpredictable. What if one lands on the bouncy castle and punctures it?” “It’s a reinforced vinyl castle. A dove’s not doing squat. But a kid with a lollipop stick could pop it, so we avoid the three-to-six demographic entirely. Stick to the perimeter.” Bill jabbed a finger at the map. “Here’s the gift table. It’s unguarded for exactly four minutes while the parents do the ‘happy birthday’ song. That’s our window.” “Four minutes to steal everything? That’s tight.” Molly chewed her lip. “What if somebody needs a diaper change? Parents always leave during diaper changes. Could be a wrinkle.” “We have a decoy.” Bill pulled a small, wrapped box from his pocket. It was crudely taped with a sticky note that read: “To: Birthday Boy, From: Your Secret Admirer.” “We plant this on the gift table. It’s empty. But it’s wrapped in the same paper as the rest. When they open it, they’ll think the real presents were never there. Just a practical joke from a weird uncle.” Molly stared at the box. “You spent money on wrapping paper? 
For a decoy?” “Dollar store. Used a coupon.” “Okay. Fine. But what if the birthday kid throws a tantrum? I’ve seen kids. They can lock down a whole party with a meltdown. Security hinges on cupcake distribution.” Bill sighed. “We’ve got backup snacks. I brought a bag of peanut-free gummy bears. If a kid starts screaming, I chuck a handful in the opposite direction. Chaos buys us time.” “And the parents? They’ll notice the presents are gone before we’re out the door.” “Not if we swap them.” Bill unzipped a duffel bag. Inside were identical-looking boxes wrapped in the same shiny paper as the ones on the gift table. “Empty boxes. We replace, we walk. They open them later, find nothing. Assume their kid already ripped them open in a frenzy and lost the toys. Standard birthday amnesia.” Molly nodded slowly. “You’ve put way too much thought into this. I’m almost proud.” “Almost?” “I’m still mad about the doves. They’re going to peck our eyes out.” Bill folded the map. “Fine. We bring a squirt gun. Doves hate water.” “You’re a genius, Bill.” “I know. Now pass me the syrup. I’m getting a pancake.”
created_at: "2026-04-25T14:38:06.471Z" (4/25/2026, 2:38:06 PM)
model: "deepseek/deepseek-v4-flash-20260423"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 94876
moderation_latency: (null)
generation_time: 622201
tokens_prompt: 77
tokens_completion: 936
native_tokens_prompt: 67
native_tokens_completion: 861
native_tokens_completion_images: (null)
native_tokens_reasoning: 69
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.00025046
router: (null)
provider_responses:
  0:
    endpoint_id: "722e0746-2034-4d6e-a08f-e6271849ee9e"
    is_byok: false
    latency: 281
    model_permaslug: "deepseek/deepseek-v4-flash-20260423"
    provider_name: "DeepSeek"
    status: (null)
  1:
    endpoint_id: "934a69f9-bd54-474b-beca-24560f721e12"
    id: "2ea00ad33c4a4374bc9218c3184ac11d"
    is_byok: false
    latency: 44
    model_permaslug: "deepseek/deepseek-v4-flash-20260423"
    provider_name: "DeepInfra"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777127886-RVEjHXVSqz0NTwn56sw6"
session_id: (null)
api_type: "completions"
id: "gen-1777127886-NLIwk2A0aP7JsjJAz9Yx"
upstream_id: "2ea00ad33c4a4374bc9218c3184ac11d"
total_cost: 0.00025046
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "DeepInfra"
response_cache_source_id: (null)
Evaluation details

Evaluator: Matches word count
Result: 0.00%
Details: n/a
Meta Data: neededClean: false, words: 546

Evaluator: Dialogue to Total Word Ratio
Result: 0.00%
Details: Ratio: 79.93%, Deviation: 29.93%
Meta Data: neededClean: false, wordsTotal: 553, wordsDialogue: 442

Overall: 0.0000%
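The reported ratio and deviation can be reproduced from the metadata values. A minimal sketch follows. It assumes the ratio is wordsDialogue divided by wordsTotal, and that the deviation is the absolute distance from the target given in the scenario (50% dialogue, 500 words). The function names are hypothetical. How NC Bench maps a deviation to a score is not shown on this page, so it is not modeled here.

```python
def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Share of all words that are dialogue."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return words_dialogue / words_total


def relative_deviation(actual: float, target: float) -> float:
    """Absolute distance from the target, as a fraction of the target."""
    return abs(actual - target) / target


# Values taken from the evaluation metadata in this run.
ratio = dialogue_ratio(words_dialogue=442, words_total=553)
ratio_deviation = abs(ratio - 0.50)  # assumed target: 50% dialogue
print(f"Ratio: {ratio:.2%}, Deviation: {ratio_deviation:.2%}")
# Ratio: 79.93%, Deviation: 29.93%

# Assumed target: 500 words; the word-count evaluator reported 546.
word_deviation = relative_deviation(actual=546, target=500)
print(f"Word count deviation: {word_deviation:.2%}")
# Word count deviation: 9.20%
```

The two evaluators report different totals for the same response (546 and 553 words), which suggests they count words separately.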