NC Bench
Run: 2026-04-25-01KQ2HZTWESKGG57GZ4S9QH250
Run Details
Model: DeepSeek V4 Flash (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 12.4s
Cost: $0.000238
Provider: DeepSeek
Evaluation summary
Matches word count: 0.00%
Dialogue to Total Word Ratio: 0.00%
Overall: 0.0000%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
“I’m telling you, Bill, the golden ticket is the pinata,” Molly said, pacing the length of the rusted van. “Not the presents. The pinata.” Bill squinted at the spreadsheet taped to the windshield. “But the loot bag has Pokémon cards. Mint condition. And the birthday girl’s mom is a dentist—there’s going to be those little floss picks shaped like sharks. We could fence those for, like, three bucks a pop on Etsy.” “We’re not here to finance your retirement fund with novelty dental hygiene. The pinata is full of chocolate coins wrapped in actual gold foil. We melt those down, we’re talking a solid two ounces.” “Gold foil isn’t gold, Molly. It’s aluminum with a yellow-ish attitude.” “Well, your attitude is yellow-ish. Straight up chicken. Look, the plan is beautiful. The party starts at two. I go in as ‘Aunt Gertrude from Florida’—I’ve got the sundress, the fanny pack, the picture of a manatee I claim is my husband.” “You don’t have a husband.” “I have a laminated manatee. Close enough. I blend in, eat some cake, compliment the balloons. While I’m causing a distraction, you slip in through the back gate, grab the pinata, and we make a clean exit through the hedge maze the dad built.” Bill rubbed his neck. “Why do I have to go through the hedge maze? I’m not a Labradoodle. I’ll get lost in there with the whispering leaves and the dead ends—that’s a metaphor for my life, Molly.” “Bill, you’re the one who insisted on the monogrammed crowbar. You want to use it, you have to navigate some topiary.” “It’s not a crowbar, it’s a ‘gift opening tool.’ I had it engraved. ‘For Uncle Bill’s special presents.’” “That makes it worse. That makes it look like you’re a weird uncle who steals presents.” “I am a weird uncle who steals presents! That’s the whole gig!” Molly stopped pacing and jabbed a finger at the party invitation. “Listen. The pinata is scheduled for three-fifteen. We strike at three-twelve. The parents will be arguing over who has to clean up the glitter. 
The kids will be sugared up and spiraling into a group tantrum about who gets the good lollipop. It’s the perfect window.” “And what’s my exit strategy after I grab the pinata? Through the bounce house?” “You take the service alley behind the grill. The dad is a charcoal purist—he’ll be too busy fanning a fire that doesn’t exist. You break into a run, I meet you at the ice cream truck. We swap vehicles. I’ll be the one in the shark costume.” “Shark costume? Since when do we have a shark costume?” “Since I bought it off a man who used to be a mascot for a failed water park. It smells like tears and chlorine. Perfect for blending in at a kids’ party. No one questions a shark at a birthday.” Bill sighed. “Fine. But if I end up stuck in that hedge maze with a pinata and a crying toddler asking me where her mommy is, I’m naming you as my accomplice on the police report.” “You can’t name me. I’ll be an unnamed shark fleeing the scene. That’s the beauty of it.”
created_at: "2026-04-25T14:51:36.723Z" (4/25/2026, 2:51:36 PM)
model: "deepseek/deepseek-v4-flash-20260423"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 526
moderation_latency: (null)
generation_time: 12208
tokens_prompt: 77
tokens_completion: 899
native_tokens_prompt: 67
native_tokens_completion: 816
native_tokens_completion_images: (null)
native_tokens_reasoning: 60
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.00023786
router: (null)
provider_responses:
  0:
    endpoint_id: "722e0746-2034-4d6e-a08f-e6271849ee9e"
    id: "c0760b88-a76f-4b36-acef-4a0be747f964"
    is_byok: false
    latency: 271
    model_permaslug: "deepseek/deepseek-v4-flash-20260423"
    provider_name: "DeepSeek"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777128696-CMWrYFukEAr5E0j7GIuw"
session_id: (null)
api_type: "completions"
id: "gen-1777128696-8xbvMIoh9rD9L87NLESg"
upstream_id: "c0760b88-a76f-4b36-acef-4a0be747f964"
total_cost: 0.00023786
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "DeepSeek"
response_cache_source_id: (null)
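The metadata above is enough to derive a rough output throughput for this run. The following is a minimal sketch, assuming that generation_time is reported in milliseconds and that native_tokens_completion counts every token the provider generated; neither unit is stated on this page, and the function name is hypothetical.

```python
def tokens_per_second(completion_tokens: int, generation_time_ms: int) -> float:
    """Rough output throughput; assumes generation_time is in milliseconds."""
    if generation_time_ms <= 0:
        raise ValueError("generation_time_ms must be positive")
    return completion_tokens / (generation_time_ms / 1000.0)


# Values copied from this run's metadata:
# native_tokens_completion = 816, generation_time = 12208
throughput = tokens_per_second(816, 12208)
print(f"{throughput:.2f} tokens/s")  # prints "66.84 tokens/s"
```

Under those assumptions the run produced roughly 67 native tokens per second.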
Evaluation details
Matches word count
Result: 0.00%
Details: n/a
neededClean: false
words: 536
Dialogue to Total Word Ratio
Result: 0.00%
Details: Ratio: 93.49%, Deviation: 23.49%
neededClean: false
wordsTotal: 538
wordsDialogue: 503
Overall: 0.0000%
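The reported ratio and deviation follow directly from the word counts: 503 dialogue words out of 538 total is 93.49%, which is 23.49 points away from the 70% target in the scenario. The sketch below reproduces that arithmetic. The word-counting helper is an assumption for illustration only (whitespace-separated words, dialogue taken as text inside curly double quotes); NC Bench's actual counting rules and its mapping from deviation to score are not shown on this page.

```python
import re

# Assumption: dialogue is whatever sits between curly double quotes.
QUOTED = re.compile(r"“([^”]*)”")


def count_words(text: str) -> int:
    """Count whitespace-separated words (hypothetical tokenization)."""
    return len(text.split())


def ratio_stats(words_dialogue: int, words_total: int, target_pct: float = 70.0) -> dict:
    """Dialogue share in percent and its absolute deviation from the target."""
    ratio = 100.0 * words_dialogue / words_total
    return {
        "ratio": round(ratio, 2),
        "deviation": round(abs(ratio - target_pct), 2),
    }


def dialogue_stats(text: str, target_pct: float = 70.0) -> dict:
    """Apply ratio_stats to raw text using the assumed counting rules."""
    total = count_words(text)
    dialogue = sum(count_words(span) for span in QUOTED.findall(text))
    return ratio_stats(dialogue, total, target_pct)


# Counts copied from the evaluation details: wordsDialogue = 503, wordsTotal = 538
print(ratio_stats(503, 538))  # {'ratio': 93.49, 'deviation': 23.49}
```

A call such as `dialogue_stats("“Hi there,” Molly said.")` counts 2 dialogue words out of 4 and returns a 50.0% ratio; counts from this helper may differ slightly from the evaluator's, since the page itself reports 536 words for one evaluator and 538 for the other.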