NC Bench
Run: 2026-04-25-01KQ2HPRSVFTJ8BGD4RZB1XD09
Run Details
Model:
DeepSeek V4 Pro
Test:
Dialogue tags
Scenario:
Write 500 words with 30% dialogue
Duration:
32.5s
Cost:
$0.000872
Provider:
DeepSeek
Evaluation summary
Score     Evaluator
0.00%     Matches word count
0.00%     Dialogue to Total Word Ratio
0.0000%   Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly traced a route across the napkin with a crayon she’d stolen from a Denny’s, her tongue peeking from the corner of her mouth in concentration. The schematic, if you could call the wobbly circle and several angry Xs a schematic, depicted a bouncy castle as the primary obstacle. “Alright,” she said, tapping the paper. “Primary ingress. You’ll be disguised as Sprinkles the Party Zebra. The costume has a strategically reinforced pouch in the crotch area for maximum present absorption. I’m talking Barbie Dreamhouses.” Bill, who was built like a lamppost that had lost a fight with a stronger lamppost, squinted. “The crotch-pouch again? My last gig as a furry, I was Tenderheart the Care Bear and got a Rubik’s cube stuck in that very pouch. The birthday boy, a tiny sociopath named Kevin, tried to solve it in situ. I chafed for a week. Why can’t I be a dinosaur? Dinosaurs have capacious tails.” “Because this isn’t a paleontology seminar with storage solutions, Bill. This is the sixth birthday of Penelope Ashworth-Ming, whose mother thinks zebras are ‘the thinking child’s horse.’ The zebra suit is our Trojan horse. We waltz in with the other biological party entertainers. You’ll make balloon animals to build rapport.” “I can only make a snake. A snake that looks aggressively like a tapeworm.” “Perfection. It’s a grim economic reality. Kids respect a parasite. Now, my role.” Molly tapped a second X on the napkin, one situated precariously close to a drawing of a cake with legs. “I am the catering captain from ‘Le Petit Gateau.’ I will be pushing a cart with the cake. The cake is a seven-tiered masterpiece of spun sugar and lies. The presents, which I’ve intel suggest are piled in a golden corral shaped like a unicorn pen, will be loaded into the bottom, hollow tier of the cake during the ‘Pin the Tail on the Donkey’ blindfold chaos.” Bill slumped. “Chaos? 
At Penelope Ashworth-Ming’s party, the blindfold chaos is scheduled for 2:17 PM, directly after the guided meditation and organic apple juice toast. We’re dealing with a professional mother. We need a diversion.” “My thoughts exactly.” Molly leaned in, the scent of her bubblegum lip gloss filling the tiny diner booth. “The petting zoo. They’ve got a miniature goat named Gandalf the Grey. I saw it on the booking form when I hacked the entertainer’s portal.” “You want to weaponize a goat?” “I want to weaponize a goat’s natural, diabolical curiosity. You, as Sprinkles, will accidentally-on-purpose leave the gate to the miniature unicorn present-pen unlatched. Then you will feed Gandalf the Grey a single, specific party favor from the gift bags.” Bill’s eyes widened with the slow dawn of criminal insight. “The edible glitter finger paint?” “The edible glitter finger paint,” Molly confirmed, sliding a small, eco-friendly-looking pouch across the table. “It’s ninety-eight percent sugar and two percent iridescent chaos. He’ll erupt from his pen like a tiny, bearded comet. Kids will scream. Parents will dive for their phones to capture the spectacle. Gandalf will begin head-butting the piñata in a hallucinatory frenzy. In that precise moment of goat-fuelled pandemonium, the lights go out.” “Why do the lights go out?” Molly smiled, a slow, brilliant, crooked thing. “Because I’ve pre-paid the face-painter, a woman who owes me a significant debt from a botched museum job involving a mummy and a rare papyrus, to paint a full-body cheetah pattern on the one child with anarchist tendencies. His final touch will be an ‘invisible cloak’ button, which I will tell him is a real invisibility switch but is actually the master circuit breaker to the garden tent. Darkness falls. You sweep the corral. I layer profiteroles over the loot. We rendezvous by the compost heap, disguised as an insensitive couple doing cultural appropriation themed yoga. We’re ghosts, Bill. 
Rich, sugar-high ghosts.” Bill stared at her for a long moment, a slow smile spreading across his gaunt face. “The chafing still concerns me.” “It’s the price of greatness, Sprinkles.”
created_at
"2026-04-25T14:46:41.781Z"
4/25/2026, 2:46:41 PM
model
"deepseek/deepseek-v4-pro-20260423"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1215
moderation_latency
(null)
generation_time
30276
tokens_prompt
77
tokens_completion
1089
native_tokens_prompt
67
native_tokens_completion
969
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.000872175
router
(null)
provider_responses
0
endpoint_id
"13bda129-f8a6-4a60-b1db-98082e5f7ccd"
is_byok
false
latency
102
model_permaslug
"deepseek/deepseek-v4-pro-20260423"
provider_name
"Together"
status
429
1
endpoint_id
"d084170b-dd99-4e53-9ae1-b63881f3cfbf"
id
"a3339a6a-8b7b-45cf-b387-ce5e88642ea0"
is_byok
false
latency
272
model_permaslug
"deepseek/deepseek-v4-pro-20260423"
provider_name
"DeepSeek"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1777128401-iARgNmxH3oUlqRARSn9R"
session_id
(null)
api_type
"completions"
id
"gen-1777128401-j0zwVpVowK7VEkTFb50x"
upstream_id
"a3339a6a-8b7b-45cf-b387-ce5e88642ea0"
total_cost
0.000872175
cache_discount
(null)
upstream_inference_cost
0
provider_name
"DeepSeek"
response_cache_source_id
(null)
Evaluation details
Result: 0.00%
Evaluator: Matches word count
Details: n/a
Meta Data: neededClean = false, words = 667

Result: 0.00%
Evaluator: Dialogue to Total Word Ratio
Details: Ratio: 77.57%, Deviation: 47.57%
Meta Data: neededClean = false, wordsTotal = 682, wordsDialogue = 529

Overall: 0.0000%
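The reported ratio and deviation can be reproduced from the metadata: 529 dialogue words out of 682 total is about 77.57%, which is 47.57 percentage points above the 30% target in the scenario. The sketch below shows that arithmetic and one plausible way to count dialogue words. The function names, the whitespace-based word splitting, and the rule that only text inside double quotes counts as dialogue are all assumptions for illustration; the benchmark's actual evaluator is not shown on this page, and the two evaluators report different totals (667 and 682 words), so its counting rules evidently differ from any single simple rule.

```python
import re

# Hypothetical helper: matches text between straight or curly double quotes.
# Assumption: this is how "dialogue" is delimited; the real evaluator may differ.
_DIALOGUE = re.compile(r'[“"]([^”"]+)[”"]')


def ratio_and_deviation(words_dialogue: int, words_total: int, target_pct: float = 30.0):
    """Return (ratio %, deviation from target in percentage points), rounded to 2 places."""
    ratio_pct = 100.0 * words_dialogue / words_total
    return round(ratio_pct, 2), round(abs(ratio_pct - target_pct), 2)


def count_words(text: str):
    """Return (dialogue words, total words) using whitespace splitting (an assumption)."""
    total = len(text.split())
    dialogue = sum(len(span.split()) for span in _DIALOGUE.findall(text))
    return dialogue, total


if __name__ == "__main__":
    # Figures taken from the evaluation metadata on this page.
    print(ratio_and_deviation(529, 682))

    # Tiny made-up sample, not taken from the model response.
    sample = '“Hello there,” she said. "Go now."'
    print(count_words(sample))
```

Running the first call with the page's own figures gives a quick sanity check that the "Ratio" and "Deviation" fields are consistent with `wordsDialogue` and `wordsTotal`.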