NC Bench
Run: 2026-04-25-01KQ2G9WEXHKHXG36WCXY4FD9Y
Run Details
Model: DeepSeek V4 Pro
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 31.0s
Cost: $0.000894
Provider: DeepSeek
Evaluation summary
Score | Evaluator
0.00% | Matches word count
0.00% | Dialogue to Total Word Ratio
0.0000% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly ran the tip of her tongue across the serrated edge of a candy cane, narrowing her eyes at the oversized gingerbread house squatting on the country club’s east lawn. It was a diabetic’s fever dream, slathered in so much frosting and gumdrop masonry it practically vibrated with the screams of a dozen sugar-crazed six-year-olds. “I’m just saying, the thermal lance might be overkill for a bouncy castle,” she said, passing the candy cane to Bill. Bill, who was sweating profusely through a rented elf costume, accepted the sweet and crunched into it savagely. The little bell on the tip of his curled shoe tinkled with a pathetic lack of festive cheer. “It’s not for the bouncy castle, Moll. It’s for the piñata. I’ve done the reconnaissance. That’s not papier-mâché and wishful thinking. That’s a reinforced carbon-fiber shell stuffed with the season’s hottest Hatchimals and, I’m reliably informed, at least fourteen envelopes containing U.S. Savings Bonds. It’s a piñata of substantial net worth.” “A sugar-frosted Fort Knox dangling from a string,” Molly mused, flicking a stray sprinkle from her tactical turtleneck. “So we melt through the shell, snatch the hatchlings, and laminate ourselves out on a slip-n-slide of tears. What’s the entry vector?” Bill unfolded a napkin covered in diagrams drawn in what looked like violently purple frosting. “Okay. You see the clown by the inflatable velociraptor lagoon? His name is Giggles. He’s making balloon poodles. I’ve secured a perfect duplicate of his costume. It’s all about the oversized shoes, Moll. The shoes are a cargo dream. I can fit a portable vacuum sealer and seventeen whoopee cushions in each toe without affecting the line.” He wiggled his own elf shoe, which squeaked. Molly took the napkin and rotated it three times. “You’re going to infiltrate as Giggles, create a diversionary tragedy with a popped balloon poodle, and then we go for the piñata? The timing is suicidal. 
The birthday boy, Archibald something-or-other, has a cake-cutting ceremony in eleven minutes. The parents will be a solid wall of iPhone-wielding paparazzi.” “That’s the genius,” Bill said, his eyes gleaming. “We don’t go for the piñata *during* the cake. The cake *is* the diversion. I’ve paid off a mole.” Molly squinted. “You paid off a child?” “A disgruntled party magician’s assistant. Ten years old, goes by ‘The Great Raymondo.’ He’s got access to the pyrotechnics. For fifty bucks and a convincing fake vomit patch, he’s agreed to accidentally-on-purpose fire a glitter cannon directly into the control panel of the chocolate fountain. The resulting fondue-spray will cause a panicked stampede. In the chaos, you, dressed as the distraught mother of a lost child, sprint straight for the piñata, screaming, ‘My son’s inside! He’s allergic to milk solids!’ It’s emotionally manipulative and just medically plausible enough to make them pause.” A slow, appreciative smile spread across Molly’s face. “And while they’re forming a search party for a lactose-intolerant phantom, I’m on a ladder with the thermal lance. I like the misdirection, Bill. It’s cruel, it’s specific, and it leans into the natural panic of a children’s party. What about the weight? We can’t simply waddle off a country club lawn with a sack of savings bonds and robotic owls.” “Gluten,” Bill said, with the gravity of a man disclosing state secrets. “‘Gluten’?” “I’ve parked a replica of a gluten-free, allergen-safe snack van behind the hedge maze. We’re talking official decals, a laminated nut-free certification, the works. We load the haul into boxes labeled ‘Vegan Confetti Bark.’ No one questions a snack van at a children’s function. It’s an invisible ghost ship of dietary accommodation. We drive away at a sensible five miles per hour, blending into the very fabric of parental anxiety.” He adjusted his pointy ears and handed her a compact mirror and a hypoallergenic wipe. 
“Now, please rub some of this cake smudge onto your cheek. You need to look more bereaved and a little less like a cat burglar who just spotted a security flaw in the face-painting queue.”
created_at: "2026-04-25T14:22:08.865Z" (4/25/2026, 2:22:08 PM)
model: "deepseek/deepseek-v4-pro-20260423"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1224
moderation_latency: (null)
generation_time: 30769
tokens_prompt: 77
tokens_completion: 1108
native_tokens_prompt: 67
native_tokens_completion: 994
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.000893925
router: (null)
provider_responses:
  0:
    endpoint_id: "13bda129-f8a6-4a60-b1db-98082e5f7ccd"
    is_byok: false
    latency: 103
    model_permaslug: "deepseek/deepseek-v4-pro-20260423"
    provider_name: "Together"
    status: 429
  1:
    endpoint_id: "d084170b-dd99-4e53-9ae1-b63881f3cfbf"
    id: "ce3e9474-e5d0-4c66-a562-0631c5bbad51"
    is_byok: false
    latency: 370
    model_permaslug: "deepseek/deepseek-v4-pro-20260423"
    provider_name: "DeepSeek"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777126928-kOs3dQrEP2fAWiyYUtFy"
session_id: (null)
api_type: "completions"
id: "gen-1777126928-r7bTbGLWGFq0uiN73m6s"
upstream_id: "ce3e9474-e5d0-4c66-a562-0631c5bbad51"
total_cost: 0.000893925
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "DeepSeek"
response_cache_source_id: (null)
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false; words: 668
0.00% | Dialogue to Total Word Ratio | Ratio: 73.11%, Deviation: 43.11% | neededClean: false; wordsTotal: 688; wordsDialogue: 503
0.0000%
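The ratio and deviation reported for the Dialogue to Total Word Ratio evaluator can be reproduced from the wordsDialogue and wordsTotal counts and the 30% target in the scenario. The following is a minimal sketch, not the evaluator's actual code: the function names are hypothetical, and the quote-matching and whitespace tokenization are assumptions, since this report does not show how NC Bench counts words or detects dialogue.

```python
import re

# Target share of dialogue words, taken from the scenario ("30% dialogue").
TARGET_RATIO = 0.30


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (assumed tokenization)."""
    return len(text.split())


def count_dialogue_words(text: str) -> int:
    """Count words inside curly double quotes (assumed dialogue rule)."""
    spans = re.findall(r"“([^”]*)”", text)
    return sum(len(span.split()) for span in spans)


def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Fraction of all words that are dialogue."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return words_dialogue / words_total


def deviation(ratio: float, target: float = TARGET_RATIO) -> float:
    """Absolute distance between the measured ratio and the target."""
    return abs(ratio - target)


# Counts reported in the evaluation details above.
ratio = dialogue_ratio(503, 688)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation(ratio):.2%}")
# prints "Ratio: 73.11%, Deviation: 43.11%"
```

With the reported counts, 503 / 688 gives 73.11%, which is 43.11 percentage points above the 30% target. How that deviation maps to the 0.00% score is not stated in this report, so the sketch stops at the deviation.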