NC Bench
Run: 2026-04-25-01KQ2DZKMD8NPRGRHSZBCGCGDW
Run Details
Model:
DeepSeek V4 Pro (Reasoning)
Test:
Dialogue tags
Scenario:
Write 500 words with 30% dialogue
Duration:
24.6s
Cost:
$0.000911
Provider:
DeepSeek
Evaluation summary
Score | Evaluator | Details Link
6.09% | Matches word count | (details)
0.00% | Dialogue to Total Word Ratio | (details)
3.0453% | Overall |
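The page does not say how the Overall score is derived from the two evaluator scores. The displayed values are consistent with an unweighted mean, which the short check below illustrates. This is an inference from the numbers shown, not a documented NC Bench formula, and the variable names are hypothetical.

```python
# Evaluator scores as displayed in the summary (percentages, rounded to 2 decimals).
scores = {
    "Matches word count": 6.09,
    "Dialogue to Total Word Ratio": 0.00,
}
overall_displayed = 3.0453  # "Overall" value as displayed

# Assumption: Overall is an unweighted mean of the evaluator scores.
mean_score = sum(scores.values()) / len(scores)

# A displayed score rounded to 2 decimals can be off by up to 0.005,
# so the mean of two such scores can be off by up to 0.0025.
tolerance = 0.005 / 2 + 1e-9
consistent = abs(mean_score - overall_displayed) <= tolerance

print(f"mean of displayed scores: {mean_score:.4f}")
print(f"consistent with displayed Overall: {consistent}")
```

The small gap between the mean of the displayed scores and 3.0453% would be explained by the word-count score being rounded for display.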
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the crinkled blueprint across the sticky tabletop of the diner, nudging a syrup caddy aside with her elbow. The paper was actually a placemat from a pizza arcade, but on it she’d sketched the floor plan of the Pumpernickel Palace party room in ballpoint pen, complete with bouncy castle perimeter and a crayon X marking the gift table. Bill squinted at the drawing, then at his partner. “Remind me why we need a platypus costume.” “Because all the party entertainers are woodland creatures. Squirrel, raccoon, very judgmental badger. You show up as a duck-billed oddity, parents think it’s ironic. Children are confused but accepting.” Molly tapped the X. “You’ll be the platypus. I’m the fairy who grants mediocre wishes.” Bill dragged a french fry through a puddle of ketchup in slow, contemplative circles. “What’s my motivation? As the platypus.” “Your motivation is silent, existential bewilderment. Works every time. You waddle in, hand out biodegradable glitter, and edge toward the present mound while I distract with a bubble show.” She produced a tiny plastic wand from her jacket pocket and blew a single, sad bubble that popped immediately. “Enchanting, right?” “They’re going to ask questions. What does the platypus do? What sound does it make?” “You make no sound. You’re a monotreme. You convey profound inner life through subtle ear twitches.” She demonstrated, twitching nothing but her eyebrows. “The loot’s in gift bags, Bill. Gift bags with tissue paper. Tissue paper is silent. It’s the gold standard of birthday heists.” Bill chewed solemnly. “And the bouncy castle?” “Gravity-based exit strategy. I’ve calculated that if you sprint across it at a forty-five-degree angle while cradling an armload of plush unicorns, the rebound will launch you through that window.” She pointed to a jagged oval she’d drawn near the fire exit. “The window is decal, not code-compliant. 
You’ll land in the ball pit outside.” “The ball pit is part of the same franchise?” “Adjacent. ‘Sir Bounce-a-Lot’s Kingdom.’ You’ll blend immediately among the screaming toddlers. I’ll rendezvous at the mini-golf windmill with the goods.” Molly slid a napkin across the table. On it, she’d written a schedule in purple marker: *Platypus entrance: 2:45. Bubble crisis: 3:00. Extraction: 3:10.* “Cake service starts at three. The servers will be singing. Nobody hears tissue paper during the birthday dirge.” Bill lifted the napkin. “You’ve misspelled ‘heist’ as ‘hurrah.’” “Mood is everything.” Molly leaned in, her fairy wings crinkling under her coat. “You know what the real prize is, Bill? It’s not the plastic junk inside those bags. It’s the cards. The envelopes full of cash and gift certificates that nobody reads aloud because little Braxlynn is already shredding wrapping paper. By the time the parents realize the cards vanished, we’ll be sipping milkshakes.” “Will I still have the bill?” “The bill stays on. It’s your signature. Now, are you in, or do I have to wish upon my own mediocre star?” She extended a hand dusted with cheap glitter. Bill sighed, ran a finger under the collar of his imagined costume, and shook it. Outside the diner window, a minivan plastered with balloon decals rumbled past, utterly unsuspecting.
created_at
"2026-04-25T13:41:34.993Z"
4/25/2026, 1:41:34 PM
model
"deepseek/deepseek-v4-pro-20260423"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1366
moderation_latency
(null)
generation_time
24208
tokens_prompt
77
tokens_completion
1118
native_tokens_prompt
67
native_tokens_completion
1014
native_tokens_completion_images
(null)
native_tokens_reasoning
212
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.000911325
router
(null)
provider_responses
0
endpoint_id
"d084170b-dd99-4e53-9ae1-b63881f3cfbf"
id
"bb1c7c96-c6e4-4d0e-8adc-344ff7db31c7"
is_byok
false
latency
279
model_permaslug
"deepseek/deepseek-v4-pro-20260423"
provider_name
"DeepSeek"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1777124494-3KRxLZrYXeVskgIxeVaS"
session_id
(null)
api_type
"completions"
id
"gen-1777124494-pRt05XHQvesDdOVp2y6K"
upstream_id
"bb1c7c96-c6e4-4d0e-8adc-344ff7db31c7"
total_cost
0.000911325
cache_discount
(null)
upstream_inference_cost
0
provider_name
"DeepSeek"
response_cache_source_id
(null)
Evaluation details
Result | Evaluator | Details | Meta Data
6.09% | Matches word count | n/a | neededClean: false; words: 523
0.00% | Dialogue to Total Word Ratio | Ratio: 61.70%, Deviation: 31.70% | neededClean: false; wordsTotal: 530; wordsDialogue: 327
3.0453% | Overall | |
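The reported ratio and deviation can be reproduced from the metadata: 327 dialogue words out of 530 total, against the 30% target in the scenario. The sketch below shows that arithmetic, plus one plausible way to count dialogue words by taking the text inside curly double quotes. The counting rule, the function names, and the use of an absolute deviation are assumptions; the evaluator's actual tokenization is not shown on this page (the two evaluators report 523 and 530 total words for the same response, so their counting evidently differs).

```python
import re


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (assumed tokenization)."""
    return len(text.split())


def dialogue_words(text: str) -> int:
    """Count words inside curly double quotes (assumed definition of dialogue)."""
    spans = re.findall(r"“([^”]*)”", text)
    return sum(count_words(span) for span in spans)


def ratio_and_deviation(words_dialogue: int, words_total: int, target_pct: float):
    """Return the dialogue ratio in percent and its absolute distance from the target."""
    ratio = 100.0 * words_dialogue / words_total
    return ratio, abs(ratio - target_pct)


# Values taken from the evaluation metadata; 30% is the target from the scenario.
ratio, deviation = ratio_and_deviation(327, 530, 30.0)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
# prints "Ratio: 61.70%, Deviation: 31.70%"

# Small usage example on a fragment of the response.
sample = "Bill chewed solemnly. “And the bouncy castle?”"
print(dialogue_words(sample), "of", count_words(sample), "words are dialogue")
```

With roughly twice the requested share of dialogue, a 0.00% score on this evaluator is unsurprising, although the mapping from deviation to score is not given here.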