NC Bench
Run: 2026-04-25-01KQ2EH93EZ0TWFFWWP3HX3549
Run Details
Model: DeepSeek V4 Pro (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 6m 25s
Cost: $0.015319
Provider: DeepSeek
Evaluation summary
Score | Evaluator
86.38% | Matches word count
95.98% | Dialogue to Total Word Ratio
91.1821% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The laminated menu at Edna’s Diner was peeling at the corner, much like Molly’s patience. She spread a napkin across the syrup stains, a napkin covered in crayon diagrams of a bouncy castle and stick figures with tiny frowns. Outside, a minivan blared a birthday playlist. The diner smelled of regret, hash browns. A waitress with a name tag reading "Sheila" eyed them suspiciously from behind the pie case. "You see the bouncy castle positioned by the fire exit?" she whispered, tapping a waxy circle. Bill fished a green crayon from his jacket, the one he always carried for heist schematics, and shaded in a corner turret. "That’s where the clown makes balloon swords. We need to avoid him. Clowns unsettle my soul." Molly snatched the crayon, its tip snapping softly. "Clowns are the least of our worries. We spring into action during the piñata blindfold chaos." She jabbed a french fry at his chest, leaving a ketchup comma. "You’ll be the dinosaur." Bill deflated like a punctured whoopee cushion. He slumped against the vinyl booth, which sighed. "Why always the dinosaur? The costumes chafe my existential dread." He folded a paper napkin into a tiny, miserable swan. "Because kids trust a friendly dinosaur. You waddle in, hand out high-fives, and sidle up to the present mountain." Bill squinted at the napkin, tracing a path with a coffee-ring compass. "The present mountain is guarded by a six-year-old in a princess dress who bite-kicks strangers." "That’s why I’ll unleash the confetti-farting robot. She’ll be too mesmerized to enforce bite-kick policies." Molly smiled like a shark who’d just passed the bar exam. "And the cake?" Bill asked, already dreading the answer. "The cake is my domain. I’ll pose as a cake safety inspector with a clipboard and a lab coat stolen from a veterinary clinic." She paused to take a dramatic slurp of her milkshake, the straw rattling with chocolatey menace. 
"There’s a T-shirt cannon wielding dad on patrol," Bill muttered, spinning the salt shaker. A few grains scattered onto the table. "I know. I’ll engage him in a lengthy debate about piñata structural integrity until he’s weeping with bureaucratic nostalgia. Meanwhile, you swap the real gift-card piñata with our decoy filled with expired coupons." Bill considered this, his finger tracing a hasty unicorn in the condensation on his water glass. The diner's ice machine rumbled on cue. "What if the pony escapes early?" "That’s Phase Two, a beautiful symphony of distraction involving a rogue popcorn machine and a squirrel I’ve been training for weeks. Trust the squirrel." Bill sighed, a long, hollow sound that fluttered a receipt on the table. "I want hazard pay for the squirrel." "Hazard pay is an extra slice of cake." Bill stared at the swan napkin. "I’m in. But if I sneeze glitter for a week, I’m deducting sinus medication from my cut." "Deal. Now practice your adorable yet efficient larceny waddle." "Fine." Molly grinned.
created_at: "2026-04-25T13:51:19.546Z" (4/25/2026, 1:51:19 PM)
model: "deepseek/deepseek-v4-pro-20260423"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1072
moderation_latency: (null)
generation_time: 379617
tokens_prompt: 77
tokens_completion: 16230
native_tokens_prompt: 67
native_tokens_completion: 17575
native_tokens_completion_images: (null)
native_tokens_reasoning: 16869
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.015319395
router: (null)
provider_responses:
  0:
    endpoint_id: "d084170b-dd99-4e53-9ae1-b63881f3cfbf"
    id: "13125720-8991-49d6-96ae-1c680957f783"
    is_byok: false
    latency: 355
    model_permaslug: "deepseek/deepseek-v4-pro-20260423"
    provider_name: "DeepSeek"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777125079-btxrucM1ENrx9VvRTF4c"
session_id: (null)
api_type: "completions"
id: "gen-1777125079-8j8NIAuKwFCCgUq4goQz"
upstream_id: "13125720-8991-49d6-96ae-1c680957f783"
total_cost: 0.015319395
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "DeepSeek"
response_cache_source_id: (null)
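The token and timing fields above can be combined into a few derived figures. The following is a rough sketch only. It assumes that native_tokens_completion includes the reasoning tokens and that generation_time is in milliseconds (379617 ms is about 6m 20s, close to the 6m 25s Duration shown under Run Details). The page states neither, and the variable names are illustrative.

```python
# Values copied from the metadata above.
native_tokens_completion = 17575
native_tokens_reasoning = 16869
generation_time_ms = 379617   # assumed to be milliseconds
total_cost_usd = 0.015319395

# Assumption: reasoning tokens are counted inside native_tokens_completion,
# so the remainder is what ended up in the visible response.
visible_tokens = native_tokens_completion - native_tokens_reasoning

# Fraction of the completion budget spent on reasoning.
reasoning_share = native_tokens_reasoning / native_tokens_completion

# Average generation throughput over the whole completion.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

# Cost per thousand native completion tokens (prompt tokens ignored).
cost_per_1k_completion = total_cost_usd / native_tokens_completion * 1000

print(f"visible tokens: {visible_tokens}")
print(f"reasoning share: {reasoning_share:.1%}")
print(f"throughput: {tokens_per_second:.1f} tokens/s")
print(f"cost per 1k completion tokens: ${cost_per_1k_completion:.6f}")
```

Under these assumptions, only a small fraction of the generated tokens belongs to the roughly 500-word scene itself; the rest is reasoning.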
Evaluation details
Result | Evaluator | Details | Meta Data
86.38% | Matches word count | n/a | neededClean: false; words: 489
95.98% | Dialogue to Total Word Ratio | Ratio: 50.91%, Deviation: 0.91% | neededClean: false; wordsTotal: 497; wordsDialogue: 253
91.1821% | Overall
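The ratio and deviation reported by the Dialogue to Total Word Ratio evaluator can be reproduced from its two word counts. The sketch below is illustrative only. It assumes the deviation is the absolute distance from the 50% target named in the prompt, and that the Overall figure is the plain mean of the two evaluator scores. The page does not state either formula, and the function names are hypothetical. How the 95.98% and 86.38% scores themselves are derived is not shown on the page and is not modelled here.

```python
def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Share of dialogue words in the text, as a percentage."""
    return 100.0 * words_dialogue / words_total


def deviation_from_target(ratio_pct: float, target_pct: float = 50.0) -> float:
    """Absolute distance from the target ratio (assumed formula)."""
    return abs(ratio_pct - target_pct)


# Counts copied from the evaluator's Meta Data.
ratio = dialogue_ratio(words_dialogue=253, words_total=497)
deviation = deviation_from_target(ratio)

# Assumption: Overall is the unweighted mean of the two evaluator scores.
# The rounded inputs give 91.18, consistent with the reported 91.1821%.
overall = (86.38 + 95.98) / 2

print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
print(f"Overall (mean of rounded scores): {overall:.2f}%")
```

Note that the two evaluators report different totals for the same response (489 words versus 497), so they presumably tokenize words differently; the sketch uses the ratio evaluator's own counts.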