NC Bench
Run: 2026-02-16-01KHM5KSN3VS60SMWXG29CR1ZG
Run Details
Model:
DeepSeek V3 (2025-03-24)
Test:
Dialogue tags
Scenario:
Write 500 words with 50% dialogue
Duration:
12.7s
Cost:
$0.000668
Provider:
SiliconFlow
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The dim glow of a flickering streetlight barely illuminated the alley where Molly and Bill hunched over a child’s crayon-drawn map of the birthday party venue. Molly, a wiry woman with a perpetually raised eyebrow, tapped the paper with a gloved finger. "Alright, genius," she said. "You’re telling me the crown jewel of our career is robbing a six-year-old’s birthday party?" Bill adjusted his ski mask like it was a top hat at the opera. "Not just any party, Molly. This is little Timmy Thompson’s *Extreme Dino-Fighter Bash*. You know what that means?" "No, Bill. Enlighten me." "Rich kids. Overcompensating parents. Piles of untouched presents *begging* to be liberated." Bill grinned. "Plus, have you seen the security? A pinata and a clown named Bubbles." Molly rubbed her temples. "And the escape route?" Bill pointed at the map. "Straight through the bouncy castle. No one expects a daring escape via inflatable dinosaur." Molly stared at him. "You’re serious." "Deadly. Look, we hit it during the cake—chaos, frosting, sugar-crazed kids. We grab the loot, bounce—literally—and vanish before the parents realize little Timmy got socks instead of a hoverboard." Molly sighed. "Fine. But if we get taken down by an angry mob of toddlers, I’m haunting you." Bill clapped her shoulder. "That’s the spirit! Now, disguises—I’m thinking full clown regalia." "Absolutely not." "C’mon, Mol. We blend right in! You’ll be Bubbles’ long-lost cousin, *Sniffles*." Molly’s eye twitched. "I will stab you." Bill ignored her. "Alright, revised plan. You distract Bubbles by ‘accidentally’ popping his balloon animals. I’ll sweep in, load the gifts into the Goody Bags of Holding—" "You sewed loot bags into actual loot bags?" "Classic misdirection!" Bill beamed. "Then, cake smash, chaos ensues, we hit the bouncy exit. Home free." Molly muttered something dark under her breath before nodding. "Fine. But if this goes south, I’m blaming the clown." Bill saluted. "Aye-aye, Captain Sniffles." 
A loud squeak interrupted them. They turned slowly to see a small boy in a dinosaur onesie staring, a half-eaten cupcake in his hand. "Are you bad guys?" he asked. Molly and Bill exchanged glances. Bill crouched conspiratorially. "Nah, kid. We’re… *present inspectors*. Making sure all the gifts are extra fun." The boy gasped. "Cool! Can I help?" Molly groaned as Bill high-fived him. This was going to be a disaster.
created_at
"2026-02-16T21:27:41.628045+00:00"
2/16/2026, 9:27:41 PM
model
"deepseek/deepseek-chat-v3-0324"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1957
moderation_latency
(null)
generation_time
12618
tokens_prompt
77
tokens_completion
672
native_tokens_prompt
66
native_tokens_completion
651
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.0006675
router
(null)
provider_responses
0
id
"019c6859e84736e1953c45562310fc3a"
status
200
is_byok
false
latency
1957
endpoint_id
"d562908f-c5eb-4a32-9a91-24d53b176fd8"
provider_name
"SiliconFlow"
model_permaslug
"deepseek/deepseek-chat-v3-0324"
api_type
"completions"
id
"gen-1771277248-Bc1VSoZG2x5X8t5h69F8"
upstream_id
"019c6859e84736e1953c45562310fc3a"
total_cost
0.0006675
cache_discount
(null)
upstream_inference_cost
0
provider_name
"SiliconFlow"
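The timing and token fields above allow a rough throughput estimate. The sketch below assumes generation_time is reported in milliseconds; the page does not state the unit, although 12618 lines up with the 12.7s duration shown under Run Details. The function name tokens_per_second is hypothetical and not part of NC Bench or the provider API.

```python
def tokens_per_second(completion_tokens: int, generation_time_ms: float) -> float:
    """Rough decode throughput; assumes generation_time is in milliseconds."""
    if generation_time_ms <= 0:
        raise ValueError("generation_time_ms must be positive")
    return completion_tokens / (generation_time_ms / 1000.0)


# Values copied from the run metadata: native_tokens_completion and generation_time.
throughput = tokens_per_second(651, 12618)
print(f"{throughput:.1f} tokens/s")
```

The estimate uses native_tokens_completion rather than tokens_completion because the native count is the one reported for the provider's own tokenizer; swapping in 672 would give a slightly higher figure.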
Evaluation details
Result | Evaluator | Details | Meta Data
0.0000% | Matches word count | n/a | neededClean: false; words: 397
0.0779% | Dialogue to Total Word Ratio | Ratio: 61.96%, Deviation: 11.96% | neededClean: false; wordsTotal: 397; wordsDialogue: 246
0.0389%
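The ratio and deviation reported for the dialogue evaluator can be reproduced from the word counts in its metadata. The sketch below is a guess at the measurement, not NC Bench's actual evaluator: it assumes words are whitespace-separated tokens, that dialogue is any text between straight double quotes, and that the target ratio is the 50% named in the scenario. All function names are hypothetical. How the deviation is mapped to the percentage in the Result column is not shown on the page and is not modeled here.

```python
import re


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (assumed word definition)."""
    return len(text.split())


def dialogue_stats(text: str, target_ratio: float = 0.5) -> dict:
    """Measure how much of the text sits inside straight double quotes."""
    quoted_spans = re.findall(r'"([^"]*)"', text)
    total = count_words(text)
    dialogue = sum(count_words(span) for span in quoted_spans)
    ratio = dialogue / total if total else 0.0
    return {
        "wordsTotal": total,
        "wordsDialogue": dialogue,
        "ratio": ratio,
        "deviation": abs(ratio - target_ratio),
    }


def ratio_from_counts(words_dialogue: int, words_total: int,
                      target_ratio: float = 0.5) -> tuple:
    """Recompute ratio and deviation from already-counted words."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target_ratio)


# Small illustrative sample (not the model response itself).
sample = 'Molly sighed. "Fine. But I am haunting you." Bill grinned. "Deal."'
stats = dialogue_stats(sample)

# Counts taken from the evaluator metadata: wordsDialogue and wordsTotal.
ratio, deviation = ratio_from_counts(246, 397)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
```

With the reported counts, 246 dialogue words out of 397 total gives the 61.96% ratio and 11.96% deviation listed in the Details column. The same total of 397 words is also well short of the 500 requested in the prompt, which is consistent with the 0.0000% result for the word-count evaluator.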