NC Bench
Run: 2026-04-25-01KQ2ECD186QD8FWM2HBNN6JE9
Run Details
Model: DeepSeek V4 Pro
Test: Dialogue tags
Scenario: Write 200 words with 90% dialogue
Duration: 13.4s
Cost: $0.000472
Provider: DeepSeek
Evaluation summary
Score | Evaluator
0.00% | Matches word count
36.79% | Dialogue to Total Word Ratio
18.3940% | Overall
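The Overall figure appears to be the plain average of the two evaluator scores. The page does not state the aggregation rule, so the unweighted mean below is an assumption, and `overall_score` is a hypothetical helper name. A minimal sketch:

```python
# Per-evaluator scores as displayed in the summary, expressed as fractions.
scores = {
    "Matches word count": 0.0,
    "Dialogue to Total Word Ratio": 0.3679,
}


def overall_score(evaluator_scores):
    """Unweighted mean of the evaluator scores (assumed aggregation rule)."""
    return sum(evaluator_scores.values()) / len(evaluator_scores)


overall = overall_score(scores)
print(f"{overall:.4%}")
```

The result agrees with the reported 18.3940% to within the rounding of the displayed 36.79%.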
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
“Bill, the piñata. That’s where we start.” “Molly, it’s a paper mule. You want to whack it?” “No, you glorious idiot, I want to *be* it. We hollow out ‘Dazzle the Donkey,’ I climb inside, you haul me in as party entertainment.” “You, a forty-two-year-old woman, stuffed inside a donkey, at my niece’s seventh birthday?” “It’s the perfect Trojan horse, Bill. I’m surrounded by the target presents. I grab, I stuff them inside the hollow donkey with me, you swing me out.” “And the twenty tiny guests with sticky, watchful fingers?” “Blind them with science. Or, failing that, a cheap bubble machine on a fifteen-minute timer. It’s ‘Pop Palooza.’ Chaos. Zero visibility.” “They’ll still hear you. My sister’s kid, Lulu, she can hear a cookie packet rustle from three gardens away.” “Sound countermeasure: I’ll moo.” “Donkeys bray, Molly. They don’t moo. It’s a foundational difference.” “Even better. The sheer, unsettling wrongness of it will paralyse them in existential confusion. You’ll hoist the rope, the donkey ascends, stuffed with unicorn slime kits. We’re legends.” “You want me to hoist a live weight of one-fifty pounds of awkwardly packaged thief, plus plunder, on a single rope tied to a swing set?” “It’s the children’s own optimism holding us up, Bill. Don’t overthink the physics.” “The pin will tear. The donkey splits. You plummet onto the cake.” “Aim me for the bouncy castle.” “There is no bouncy castle. It’s a petting zoo.” “A petting… why didn’t you lead with the goats, Bill? This changes the entire acoustic landscape. Do goats alert the authorities?” “They just… judge. Silently. With those freaky, slotted pupils.” “Fine. We adjust. I’ll wear a tiny goat costume under the donkey costume. Classic shell game. They unzip a donkey, find a goat, minds melt, we waddle away with the Lego. Final question: can you get me a glue gun and three pounds of sparkly felt in the next twenty minutes?”
created_at: "2026-04-25T13:48:34.221Z" (4/25/2026, 1:48:34 PM)
model: "deepseek/deepseek-v4-pro-20260423"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 828
moderation_latency: (null)
generation_time: 13354
tokens_prompt: 77
tokens_completion: 540
native_tokens_prompt: 67
native_tokens_completion: 509
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.000471975
router: (null)
provider_responses:
  0:
    endpoint_id: "d084170b-dd99-4e53-9ae1-b63881f3cfbf"
    id: "0d3c19c1-22cd-432f-ad21-d797100b9135"
    is_byok: false
    latency: 275
    model_permaslug: "deepseek/deepseek-v4-pro-20260423"
    provider_name: "DeepSeek"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777124914-sE0N7IRfknJwRhL7tRig"
session_id: (null)
api_type: "completions"
id: "gen-1777124914-w09IkZ0kSPUjOExcI9h5"
upstream_id: "0d3c19c1-22cd-432f-ad21-d797100b9135"
total_cost: 0.000471975
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "DeepSeek"
response_cache_source_id: (null)
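The metadata fields are enough to derive a rough generation throughput and a blended cost per token. The sketch below assumes `generation_time` is in milliseconds, which is consistent with the 13.4s Duration in Run Details. The helper names are hypothetical, and the blended rate is only an average over prompt and completion tokens, not the provider's price list.

```python
# Values copied from the metadata fields of this run.
generation_time_ms = 13354        # assumed to be milliseconds
native_tokens_prompt = 67
native_tokens_completion = 509
total_cost_usd = 0.000471975


def tokens_per_second(tokens, elapsed_ms):
    """Tokens generated per second of wall-clock generation time."""
    return tokens / (elapsed_ms / 1000.0)


def cost_per_million_tokens(cost_usd, tokens):
    """Average cost in USD per one million tokens."""
    return cost_usd / tokens * 1_000_000


throughput = tokens_per_second(native_tokens_completion, generation_time_ms)
blended = cost_per_million_tokens(
    total_cost_usd, native_tokens_prompt + native_tokens_completion
)

print(f"throughput: {throughput:.1f} tokens/s")
print(f"blended cost: ${blended:.2f} per million tokens")
```

With these inputs the throughput comes out at roughly 38 completion tokens per second.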
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 320
36.79% | Dialogue to Total Word Ratio | Ratio: 100.00%, Deviation: 10.00% | neededClean: false, wordsTotal: 325, wordsDialogue: 325
18.3940% | Overall
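The reported ratio and deviation can be reproduced approximately by counting the words that fall inside curly double quotes and comparing the result with the 90% target from the scenario. This is a sketch under assumptions: the benchmark's actual tokenizer and quote handling are not shown on this page, `dialogue_ratio` is a hypothetical helper, and the narration sentence in `sample` is made up for illustration.

```python
import re

TARGET_RATIO = 0.90  # from the scenario: "90% dialogue"


def dialogue_ratio(text):
    """Fraction of whitespace-separated words inside curly double quotes."""
    total = len(text.split())
    if total == 0:
        return 0.0
    quoted_spans = re.findall(r"“[^”]*”", text)
    dialogue = sum(len(span.split()) for span in quoted_spans)
    return dialogue / total


# Two lines from the response plus an invented narration sentence.
sample = (
    "“Bill, the piñata. That’s where we start.” "
    "Molly grinned. "
    "“Sound countermeasure: I’ll moo.”"
)

ratio = dialogue_ratio(sample)
deviation = abs(ratio - TARGET_RATIO)
print(f"ratio={ratio:.2%} deviation={deviation:.2%}")
```

For a response that is entirely quoted dialogue, as in this run, the ratio is 100% and the deviation from the target is 10%, matching the Details column. How that deviation is mapped to the 36.79% score is not stated on this page, so the sketch stops at the deviation.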