NC Bench
Run: 2026-04-25-01KQ2DHHVFW8ECY5PN09G6FS6X
Run Details
Model: DeepSeek V4 Flash
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 10.7s
Cost: $0.000222
Provider: DeepSeek
Evaluation summary
Matches word count: 100.00%
Dialogue to Total Word Ratio: 0.00%
Overall: 49.9995%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
“This is insane,” Bill said, his voice a low rumble that barely stirred the dust motes in the garage. “We’re grown adults. We specialize in Fabergé eggs and high-end bonds. Now you want to knock over a bouncy castle?” “Not the castle, Bill. The presents.” Molly held up a glitter-frosted invitation. “Look at this. ‘Ethan’s Fifth Birthday: A Pirate’s Treasure Hunt.’ You see the subtext?” “The subtext is that I’ll need a tetanus shot from the sprinkles.” “The subtext is a tactical goldmine. There’s a piñata—bait. A cake big enough to hide in. And the loot table is unsupervised for a solid seven minutes during the magic show. We’re not stealing a tiara from a duchess. We’re stealing a Lego Death Star from a toddler.” Bill rubbed his temples. “And how do you propose we blend in? I don’t have a kid. Or a tattoo of a cartoon whale.” “We don’t need kids,” Molly said, pulling a duffel bag from the shelf. “We need character. You’ll be ‘Uncle Carl, the Balloon Artist Who Smells Like Gin.’ I’ll be his assistant, ‘Auntie Sparkles.’” “Auntie Sparkles.” “I’m wearing a wig with rainbow streaks. I’ve already cultivated a helium voice. Watch.” She took a breath and squeaked, “Who wants a giraffe sword? It’s a magic sword! It smells like disappointment!” “That’s terrifying,” Bill said. “Why disappointment?” “Because I’ll twist the balloon wrong on purpose. The kids will cry. Parents will panic. And while everyone is trying to calm a weeping six-year-old with a broken balloon sword, I drain the gift table.” Bill stared at the invitation. “And the parents? They’ll just let two strange adults into a children’s party?” “It’s a suburban birthday, Bill. They’re desperate for entertainment. I already sent the mother a confirmation email from a fake party company called ‘Pirate Pete’s Premium Parties.’ I even attached a press release about our insurance. 
She’s thrilled.” “You wrote a press release for a fake children’s entertainer.” “It had bullet points about safety harnesses for the treasure chest. I’m very thorough. You’re wearing a patch and saying ‘Arrr’ every other word.” “I will not say ‘Arrr.’” “You’ll say it when you see the gift table. There’s a Nintendo Switch under there. I’ve got a satchel lined with bubble wrap and a decoy piñata filled with old socks.” She patted a lumpy sack on the couch. Bill sighed. “What’s the exit strategy?” “The magician has a smoke machine. I’ve sewn a remote trigger into this tutu.” She lifted a poof of neon tulle. “When he does his big finale, I hit the button. We disappear in a cloud of ‘Candy Scent.’” “Candy Scent?” “The machine is labeled ‘Floral’ but I swapped the fluid. It smells like bubblegum and failure.” “I hate this,” Bill said. “You’ll love it when we’re splitting a SEGA Genesis at the fence line.” Bill picked up the bag of old socks. He put it down. “Fine. But I’m not doing a face painting booth.” “Too late. You’re a mermaid now.”
created_at: "2026-04-25T13:33:54.419Z" (4/25/2026, 1:33:54 PM)
model: "deepseek/deepseek-v4-flash-20260423"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 595
moderation_latency: (null)
generation_time: 10534
tokens_prompt: 77
tokens_completion: 853
native_tokens_prompt: 67
native_tokens_completion: 760
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.00022218
router: (null)
provider_responses:
  0:
    endpoint_id: "722e0746-2034-4d6e-a08f-e6271849ee9e"
    id: "e9ceed82-4fcd-4f30-bb89-6ca94e0be6a7"
    is_byok: false
    latency: 277
    model_permaslug: "deepseek/deepseek-v4-flash-20260423"
    provider_name: "DeepSeek"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777124034-uXaVrh9jTuyYzMyylMQz"
session_id: (null)
api_type: "completions"
id: "gen-1777124034-tz7oS6xaaQY7P7zp3MMW"
upstream_id: "e9ceed82-4fcd-4f30-bb89-6ca94e0be6a7"
total_cost: 0.00022218
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "DeepSeek"
response_cache_source_id: (null)
Evaluation details
Matches word count
  Result: 100.00%
  Details: n/a
  Meta Data: neededClean: false, words: 501
Dialogue to Total Word Ratio
  Result: 0.00%
  Details: Ratio: 84.16%, Deviation: 34.16%
  Meta Data: neededClean: false, wordsTotal: 505, wordsDialogue: 425
Overall: 49.9995%
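The ratio and deviation reported for the Dialogue to Total Word Ratio evaluator can be checked by hand from the recorded counts: 425 dialogue words out of 505 total is about 84.16%, which is 34.16 points away from the 50% target. The sketch below illustrates one plausible way such a measurement could be made. It is not the benchmark's actual evaluator: the function names, the whitespace-based word splitting, and the rule that dialogue is any text inside curly double quotes are all assumptions, so its counts for a full response may differ from the reported ones.

```python
import re

# Assumption: dialogue is any text enclosed in curly double quotes.
# The real evaluator's rule is not documented in this report.
DIALOGUE_RE = re.compile(r"“([^”]*)”")


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (an assumed definition of 'word')."""
    return len(text.split())


def dialogue_ratio(text: str) -> float:
    """Return the fraction of words that fall inside quoted dialogue."""
    total = count_words(text)
    if total == 0:
        return 0.0
    spoken = sum(count_words(span) for span in DIALOGUE_RE.findall(text))
    return spoken / total


def deviation(ratio: float, target: float = 0.5) -> float:
    """Absolute distance between the measured ratio and the target ratio."""
    return abs(ratio - target)


# Check the arithmetic using the counts recorded in the evaluation details.
reported_ratio = 425 / 505
print(f"Ratio: {reported_ratio:.2%}, Deviation: {deviation(reported_ratio):.2%}")

# Small usage example on the first two quoted lines of the response.
sample = "“This is insane,” Bill said. “We’re grown adults.”"
print(f"Sample ratio: {dialogue_ratio(sample):.2%}")
```

Under these assumptions, the recorded counts reproduce the reported "Ratio: 84.16%, Deviation: 34.16%". How the evaluator maps that deviation to the 0.00% score is not stated in the report, so the sketch does not attempt it.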