NC Bench
Run: 2026-04-25-01KQ2DHHFGZPAYBQBZS4JDCJ1H
Run Details
Model: DeepSeek V4 Flash
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 9.8s
Cost: $0.000200
Provider: DeepSeek
Evaluation summary
Score | Evaluator | Details Link
0.00% | Matches word count | (details)
0.00% | Dialogue to Total Word Ratio | (details)
0.0000% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly pressed her eye to the gap in the fence, watching a bouncy castle inflate with a series of flatulent wheezes. “It’s a fortress of giggles, Bill. A sugar-fueled Alcatraz.” Bill, wedged behind a dumpster, adjusted his fake mustache. It was already peeling at the corner. “I’ve got the plans. The gift table is by the cake station, under a banner that says ‘Happy Birthday, Chloe.’ Who names their kid Chloe anymore?” “Rich people,” Molly whispered. “The ones who buy pileable gift boxes from that store that smells like sandalwood and regret. The loot is premium. We’re talking $40 board games and finger puppets made from alpaca wool.” Bill snorted. “Alpaca wool? We’re stealing from toddlers for alpaca wool.” “Don’t be classist. A heist is a heist.” Molly slid a laminated map from her jacket. “Here’s the entry. We go over the fence at 2:07 PM, when the magician starts his act. He does the dove thing, parents clap like seals, kids get distracted. That’s our window.” “And if the magician sees us? What then? I’m not fighting a man in a cape.” “You won’t. The cape is velvet. It’s a trip hazard. We’re professionals, Bill. We wear sensible shoes.” Molly pointed at her scuffed trainers. “Sparks’ self-tie loafers. Had them since ‘06.” Bill rubbed his temples. “Okay, okay. We’re in. We grab the gifts. How do we get them out without looking like we just knocked over a Build-a-Bear?” “The dog.” Molly grinned. “The dog?” Bill’s eyes went wide. “The real, slobbering golden retriever they hired for emotional support? That dog?” “Exactly. I’ve rigged a harness under his belly. We attach a canvas sack. The dog walks out, wags his tail, and we follow like we’re just super affectionate owners. No one questions a man and a woman cooing over a dog.” Bill stared at her. “You’re insane.” “And you’re in.” Molly checked her watch. “We’ve got 45 minutes. I need you to go buy a party horn. The long, curly kind. 
Red.” “Why a party horn?” “Because when the magician brings out the rabbit, I’m going to blow it loudly and yell, ‘Surprise!’ Everyone will look. You slide the sack under the table. It’s misdirection with a double dip of celebration.” Bill opened his mouth, closed it, then nodded slowly. “Fine. I’ll get the horn. But if I end up arrested for stealing a bag of kinetic sand and a plush narwhal, I’m telling them the whole thing was your idea. And that you cried during the bouncy castle setup.” Molly patted his cheek. “That’s the spirit. Now go. And Bill? Get the good horn. The one that actually makes a sound that sounds like a dying goose. It’s more confusing.”
created_at
"2026-04-25T13:33:54.036Z"
4/25/2026, 1:33:54 PM
model
"deepseek/deepseek-v4-flash-20260423"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
575
moderation_latency
(null)
generation_time
9706
tokens_prompt
77
tokens_completion
729
native_tokens_prompt
67
native_tokens_completion
682
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.00020034
router
(null)
provider_responses
0
endpoint_id
"722e0746-2034-4d6e-a08f-e6271849ee9e"
id
"55ef9601-094e-48fc-a736-d060d9df060d"
is_byok
false
latency
274
model_permaslug
"deepseek/deepseek-v4-flash-20260423"
provider_name
"DeepSeek"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1777124034-sawmWJFVshQ2XNcrwGlu"
session_id
(null)
api_type
"completions"
id
"gen-1777124034-WUB2fnpRBpoAp2DmmVFv"
upstream_id
"55ef9601-094e-48fc-a736-d060d9df060d"
total_cost
0.00020034
cache_discount
(null)
upstream_inference_cost
0
provider_name
"DeepSeek"
response_cache_source_id
(null)
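The timing and cost fields above can be cross-checked with a little arithmetic. The following is a minimal sketch, assuming generation_time is reported in milliseconds (the page does not state the unit, although 9706 is close to the 9.8s Duration shown under Run Details). The helper name tokens_per_second is hypothetical and not part of NC Bench or the provider API.

```python
def tokens_per_second(completion_tokens: int, generation_time_ms: int) -> float:
    """Completion throughput, assuming generation_time is in milliseconds."""
    return completion_tokens / (generation_time_ms / 1000)


# Values copied from the Meta fields of this run.
meta = {
    "native_tokens_completion": 682,
    "generation_time": 9706,  # assumed to be milliseconds
    "usage": 0.00020034,
}

rate = tokens_per_second(meta["native_tokens_completion"], meta["generation_time"])
print(f"{rate:.1f} tokens/s")

# Rounding usage to six decimals reproduces the Cost shown under Run Details.
print(f"${meta['usage']:.6f}")  # prints "$0.000200"
```

If the unit assumption is wrong, the throughput figure is off by the corresponding factor. The cost check does not depend on it.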
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 453
0.00% | Dialogue to Total Word Ratio | Ratio: 81.18%, Deviation: 51.18% | neededClean: false, wordsTotal: 457, wordsDialogue: 371
0.0000%
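The Ratio and Deviation figures follow directly from the reported counts: 371 dialogue words out of 457 total, measured against the 30% target in the scenario. The sketch below reproduces that arithmetic and shows one plausible way to obtain such counts from text. Both function names are hypothetical, and the counting rules (whitespace-separated words, dialogue delimited by curly double quotes) are assumptions. The evaluator's actual tokenization and scoring rules are not shown on this page.

```python
import re


def dialogue_ratio(text: str, target: float = 0.30):
    """Return (total_words, dialogue_words, ratio, deviation) for text."""
    # Assumption: a word is any whitespace-separated token.
    total = len(text.split())
    # Assumption: dialogue is the text between curly double quotes.
    spoken = re.findall(r"“([^”]*)”", text)
    dialogue = sum(len(s.split()) for s in spoken)
    ratio = dialogue / total if total else 0.0
    return total, dialogue, ratio, abs(ratio - target)


def ratio_from_counts(words_dialogue: int, words_total: int, target: float = 0.30):
    """Recompute ratio and deviation from already-counted words."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


# Counts taken from the Meta Data column of this run.
ratio, deviation = ratio_from_counts(371, 457)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 81.18%, Deviation: 51.18%"
```

A different counting rule would give slightly different totals, which may be why the two evaluators report 453 and 457 words for the same response.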