NC Bench
Run: 2026-04-25-01KQ2EQYD72MF1MHFS926ZT93V
Run Details
Model: DeepSeek V4 Pro
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 30.0s
Cost: $0.000896
Provider: DeepSeek
Evaluation summary
Score | Evaluator | Details Link
0.00% | Matches word count | (details)
0.00% | Dialogue to Total Word Ratio | (details)
0.0000% | Overall |
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
They had reconvened, as all great criminal minds do, in the sticky red booth of a Waffle Hut that smelled faintly of burnt vanilla and floor cleaner. Between them, a single napkin sat bleeding with a diagram drawn in stolen purple crayon. Bill was squinting at it with the intensity of a man trying to translate ancient Sanskrit, his finger tracing a wobbly rectangle. Molly stabbed a blueberry with the tines of her fork. “Okay. Run it past me again, but slower, and with the emotional gravity it deserves. That’s the bouncy castle.” “Designated high-ground, yes,” Bill murmured, not looking up. “If things go sideways, I’ll commandeer the slide.” “They are four-year-olds, Bill. The only thing going sideways will be a juice box if someone laughs too hard. We are not storming Normandy.” She leaned forward, her voice a conspiratorial whisper. “We are ghosts. We drift in, we accumulate presents, we drift out. The birthday boy, a sticky-fingered tyrant named Leo, won’t even know his pile of loot has shrunk.” Bill finally met her eyes, his face a mask of profound concern. “But what about the clown? The itinerary you boosted from his mum’s email inbox clearly states there will be a professional clown. I don’t do jobs with clowns, Moll. It’s a variable I can’t control. Their feet are too big; their moral compass is a tiny, squeaky horn.” “The clown is our cover,” Molly hissed, pushing the napkin towards him. “That’s the genius of it. You’ll be the clown. You are a paranoid mess, but you can make a balloon poodle in under seven seconds, which is a deeply suspicious but currently valuable skill. I’ll be the harried party assistant, clipboard and all. I’ll be the one judging the ribbon on the best-wrapped present competition, discreetly redirecting the cream of the crop into our discreetly positioned ‘clean-up’ bin.” Bill’s eyes widened with a glimmer of terrified respect. “The long con.” “It’s a two-hour sugar rush. 
The attention span of our marks is measurable in seconds,” she said. “While you’re twisting balloon sabres and getting repeatedly punched in the kneecaps by tiny, ungrateful fists, I’ll be doing the real work. ‘Oh, little Timmy, that gift box is far too sharp-cornered for the soft-play area, let me put it somewhere safe.’ It’s the perfect, bureaucratic score.” A waitress slid a platter of disco fries onto the table with silent, greasy authority. Bill didn’t flinch, his mind clearly racing through tactical clown-shoes scenarios. “And the family dog?” he asked, picking up a fry and pointing it at her. “The dossier says it’s a terrier. Notoriously suspicious breed. What’s our canine counter-measure?” Molly sighed, a long, theatrical exhalation that ruffled the paper napkin. “It’s a miniature Schnauzer named Mr. Buttons who is afraid of the doorbell. His counter-measure is a pre-sliced, high-quality sausage secreted in my left sock. I’ve thought of everything. The parents will be chasing a limbo stick. The kids will be anesthetized by cake. The only real threat is you losing your nerve because the face paint makes your nose itch.” He chewed the fry slowly, a smile finally cracking his anxious facade. “So I waddle in, a maudlin pied piper of inflatable animals, and you fleece the birthday boy’s future.” “Precisely.” Molly scooped up the last of the blueberry mess. “It’s the great gift-salvage operation. We’re not just stealing a pile of plastic and plush toys, Bill. We’re liberating capital trapped in a repetition of unwanted trash. We’re economic Robin Hoods, if Robin Hood wore a red nose and had a deep-seated phobia of face-paint-borne pathogens.” Bill nodded, sliding out of the booth and pocketing the purple-crayon battle plan. “Right. For the record, if this clown gig gets me killed by a mob of feral toddlers hopped up on party bags, I’m haunting you. 
Not in a spooky way, just by constantly moving your car keys a foot to the left of where you left them.” “A risk I’m willing to take,” she said, flashing a grin as sharp as a stolen cake knife. “Now, let’s go find you some trousers that can withstand a tactical deployment of glitter.”
created_at: "2026-04-25T13:54:56.893Z"
model: "deepseek/deepseek-v4-pro-20260423"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 783
moderation_latency: (null)
generation_time: 25458
tokens_prompt: 77
tokens_completion: 1103
native_tokens_prompt: 67
native_tokens_completion: 996
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.000895665
router: (null)
provider_responses:
  0:
    endpoint_id: "d084170b-dd99-4e53-9ae1-b63881f3cfbf"
    id: "f58007b9-509c-478b-9334-520657a74428"
    is_byok: false
    latency: 424
    model_permaslug: "deepseek/deepseek-v4-pro-20260423"
    provider_name: "DeepSeek"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777125296-UXM1KQjIckuieqQMBarv"
session_id: (null)
api_type: "completions"
id: "gen-1777125296-y8jnv2MlAknCssB1HQz3"
upstream_id: "f58007b9-509c-478b-9334-520657a74428"
total_cost: 0.000895665
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "DeepSeek"
response_cache_source_id: (null)
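The metadata above can be turned into a rough throughput and cost figure. The sketch below is illustrative only: it assumes `generation_time` is reported in milliseconds and that `total_cost` covers the native prompt and completion tokens together, neither of which the page states. The function names are hypothetical and not part of any NC Bench or provider API.

```python
def tokens_per_second(completion_tokens: int, generation_time_ms: float) -> float:
    """Completion tokens generated per second (assumes the time is in milliseconds)."""
    if generation_time_ms <= 0:
        raise ValueError("generation_time_ms must be positive")
    return completion_tokens / (generation_time_ms / 1000.0)


def cost_per_thousand_tokens(total_cost: float, prompt_tokens: int, completion_tokens: int) -> float:
    """Blended cost per 1,000 tokens (assumes total_cost covers prompt plus completion)."""
    total_tokens = prompt_tokens + completion_tokens
    if total_tokens <= 0:
        raise ValueError("token counts must sum to a positive number")
    return total_cost / total_tokens * 1000.0


# Values copied from the run metadata above.
throughput = tokens_per_second(completion_tokens=996, generation_time_ms=25458)
blended_cost = cost_per_thousand_tokens(0.000895665, prompt_tokens=67, completion_tokens=996)

print(f"throughput: {throughput:.1f} tokens/s")
print(f"blended cost: ${blended_cost:.6f} per 1k tokens")
```

If the unit assumption holds, this run produced roughly 39 native completion tokens per second.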
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 691
0.00% | Dialogue to Total Word Ratio | Ratio: 69.92%, Deviation: 39.92% | neededClean: false, wordsTotal: 708, wordsDialogue: 495
0.0000%
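The reported ratio and deviation follow directly from the word counts in the metadata: 495 dialogue words out of 708 total, measured against the 30% target from the scenario. The sketch below reproduces that arithmetic and adds a naive dialogue counter. It is an assumption-laden illustration, not the evaluator's actual code: the function names are hypothetical, the counter treats only text inside curly double quotes as dialogue, and how a deviation maps to the 0.00% score is not shown on the page, so it is not modelled here.

```python
import re


def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Share of dialogue words in the text, as a percentage."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return 100.0 * words_dialogue / words_total


def deviation_from_target(ratio_percent: float, target_percent: float) -> float:
    """Absolute distance, in percentage points, between measured ratio and target."""
    return abs(ratio_percent - target_percent)


def count_dialogue_words(text: str) -> int:
    """Naive counter: whitespace-separated words inside curly double quotes.

    Assumption: dialogue is exactly the text between an opening and a closing
    curly double quote. The real evaluator may tokenize differently.
    """
    quoted_spans = re.findall(r"“([^”]*)”", text)
    return sum(len(span.split()) for span in quoted_spans)


# Values copied from the evaluation metadata above; 30% and 500 come from the scenario.
ratio = dialogue_ratio(words_dialogue=495, words_total=708)
deviation = deviation_from_target(ratio, target_percent=30.0)
over_length = 100.0 * (691 - 500) / 500  # word-count overshoot versus the 500-word target

print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")  # Ratio: 69.92%, Deviation: 39.92%
print(f"Word count overshoot: {over_length:.1f}%")
```

The two evaluators report different totals (691 and 708 words), so the naive counter here should not be expected to match either figure exactly.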