NC Bench
Run: 2026-04-25-01KQ2F4RV88MZQW4T67VB3KGRE
Run Details
Model: DeepSeek V4 Flash
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 21.3s
Cost: $0.000219
Provider: DeepSeek
Evaluation summary
Score      Evaluator
43.38%     Matches word count
0.00%      Dialogue to Total Word Ratio
21.6892%   Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly crinkled her nose at the glittery invite. “It says here, ‘No parents allowed. Only the fun ones.’”

Bill rubbed the back of his neck. “That’s a trap. It’s a trap for people like us.”

“We’re not parents, Bill. We’re thieves.”

“Yeah, but we’re also fun. That’s even worse.”

Molly tossed the invite onto the car’s dashboard. “The target is the Gift Pyramid. It’s a three-tiered structure. Eight-by-ten feet. Mostly craft kits and those singing plush llamas. The big prize is a fire-engine red remote-control monster truck at the apex.”

Bill squinted at the backyard layout she’d drawn on a napkin. “You want me to go in through the bouncy castle?”

“It’s the only way. The moms are set up on the patio. They have wine and a line of sight to the front gate. But nobody watches the castle.”

“I’ll be bouncing up and down for ninety seconds. I’ll vomit.”

“You’ll be fine. I’ll create a diversion with the magician.”

“We don’t have a magician.”

“I’ll be the magician. I’ve got the rabbit. And a false bottom for the gifts.”

Bill tapped the napkin. “And me? What am I? Hufflepuff? No, wait—I’m the guy in the castle.”

“You’re the grab-and-go. While I’m making a balloon poodle that’s secretly a packable sack, you bounce up, snag the truck, and then fall out of the castle into the shrubbery.”

“What if a kid’s in there with me?”

“Then you have a hostage who will cry for a juice box.”

Bill ran a hand through his hair. “I hate children. They’re sticky and they smell like fruit snacks.”

“Good. Your disgust will make you faster.”

“And the llamas? The ones that sing?”

“We dump them in the pool for a decoy. Let the parents think it’s a prank by a bored twelve-year-old.”

Bill leaned back. “Why do we even want this truck? It’s for a kid.”

“Because it’s worth eight hundred dollars on the collector’s market. And the seven-year-old birthday boy is named Chadwick. He already has one.”

“We’re terrible people.”

“We’re comedic heist people. There’s a difference.”

“Is there?”

“Yes. We don’t hurt them. We just confuse them. And we leave a note.”

Bill perked up. “What’s it say?”

Molly pulled out a laminated card from her jacket pocket. It read: “Sorry for the mix-up. This was for my nephew. – The Bouncy Castle Guy.”

Bill laughed despite himself. “That’s not even a name.”

“It’s a concept.” She checked her watch. “The magician starts at three. You ready to bounce?”

“I’m ready to regret.” He picked up the napkin. “One more thing. What’s my code name?”

“Bill.”

“That’s not a code name.”

“It is when you’re wearing a red wig and a fake mustache. No one will think it’s you.”

“I have the mustache now.”

“Then it’s already working.” She opened the car door. “Now get out. I need to practice my rabbit levitation.”
created_at: "2026-04-25T14:01:52.747Z" (4/25/2026, 2:01:52 PM)
model: "deepseek/deepseek-v4-flash-20260423"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 619
moderation_latency: (null)
generation_time: 11170
tokens_prompt: 77
tokens_completion: 827
native_tokens_prompt: 67
native_tokens_completion: 750
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.00021938
router: (null)
provider_responses:
  0:
    endpoint_id: "934a69f9-bd54-474b-beca-24560f721e12"
    is_byok: false
    latency: 36
    model_permaslug: "deepseek/deepseek-v4-flash-20260423"
    provider_name: "DeepInfra"
    status: (null)
  1:
    endpoint_id: "722e0746-2034-4d6e-a08f-e6271849ee9e"
    id: "29f1dd4e-cfd7-445f-a206-8f7543b97689"
    is_byok: false
    latency: 274
    model_permaslug: "deepseek/deepseek-v4-flash-20260423"
    provider_name: "DeepSeek"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777125712-fx2LHgPNQieoA52cAGdd"
session_id: (null)
api_type: "completions"
id: "gen-1777125712-tj2Ir4NcugpzxeBGpIrP"
upstream_id: "29f1dd4e-cfd7-445f-a206-8f7543b97689"
total_cost: 0.00021938
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "DeepSeek"
response_cache_source_id: (null)
Evaluation details
Result     Evaluator                      Details                            Meta Data
43.38%     Matches word count             n/a                                neededClean: false; words: 483
0.00%      Dialogue to Total Word Ratio   Ratio: 83.64%, Deviation: 33.64%   neededClean: false; wordsTotal: 495; wordsDialogue: 414
21.6892%   Overall
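
The reported ratio and deviation can be reproduced from the metadata above: 414 dialogue words out of 495 total is about 83.64%, which is 33.64 percentage points away from the 50% target. The sketch below is only an illustration of that arithmetic, not the benchmark's actual evaluator. The function names `dialogue_ratio` and `deviation_from_target`, the whitespace-based word split, and the assumption that dialogue is whatever sits between curly double quotes are all hypothetical. The last lines also check that an unweighted mean of the two evaluator scores is consistent with the overall figure, which is an inference from the numbers rather than something the page states.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for curly-quoted text.

    Assumption: dialogue is any span between “ and ”, and words are
    whitespace-separated tokens. The real evaluator may count differently.
    """
    total = len(text.split())
    quoted_spans = re.findall(r"“([^”]*)”", text)
    dialogue = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue / total if total else 0.0
    return dialogue, total, ratio


def deviation_from_target(ratio: float, target: float = 0.5) -> float:
    """Absolute distance between the observed ratio and the target ratio."""
    return abs(ratio - target)


# Figures reported for this run (wordsDialogue and wordsTotal).
reported_ratio = 414 / 495
reported_deviation = deviation_from_target(reported_ratio)
print(f"Ratio: {reported_ratio:.2%}, Deviation: {reported_deviation:.2%}")

# Assumption: the overall score is the unweighted mean of the evaluator scores.
overall = (43.38 + 0.00) / 2
print(f"Overall (from rounded scores): {overall:.2f}%")
```

Run on the figures from this page, the first print gives a ratio of 83.64% and a deviation of 33.64%, matching the Details column. The mean of the rounded scores is 21.69%, which agrees with the displayed 21.6892% to two decimals.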