NC Bench
Run: 2026-04-25-01KQ2GT2RBHE99G8F41JVQHF8Y
Run Details
Model: DeepSeek V4 Pro
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 27.1s
Cost: $0.000926
Provider: DeepSeek
Evaluation summary
Matches word count: 0.00%
Dialogue to Total Word Ratio: 0.00%
Overall: 0.0000%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly squinted at the rough sketch on the napkin, which looked less like a blueprint for a criminal enterprise and more like a toddler’s rendering of a haunted pineapple. The primary target, a gargantuan inflatable unicorn named Glitterhoof, was represented by a spiky circle with a crayon smudge for a horn. “Walk me through the ingress again,” she said, tapping the drawing. “Specifically the part where we’re invisible.”
Bill adjusted the pair of X-ray specs he’d mail-ordered from the back of a 1970s comic book, pushing them up his nose where they left a greasy smudge. “The key is saturation. They’re not expecting two extra grown-ups at the ‘My Little Equine Extravaganza’ party. It’s a volume business. A dozen sugar-crazed five-year-olds, a minimum-wage fairy princess, and a herd of exhausted parents. We are ghosts. Phantoms in the sticky, glitter-strewn machine.”
“We’re a man with a fake mustache and a woman wearing a hastily repurposed bed sheet toga, Bill. We don’t look like ghostly specters. We look like we’re fleeing a low-budget toga party that got raided for health code violations.” Molly picked up a sugar cookie shaped like a horseshoe from the reconnaissance pile and bit into it cautiously. It was stale. “And this is our cover? ‘Assistant Pixie Dust Distributors’?”
“It’s foolproof. I have the clipboard.” He held up a clipboard with a crisp, official-looking checklist. The top item read: 1. Distribute Pixie Dust. The rest of the page was a grocery list for nacho supplies. “Parents see a clipboard and a concerned expression, their brain just fills in the blanks. You’re my silent but industrious partner, Glinda. Your vibe is ‘don’t ask me, I’m union.’ My vibe is ‘logistical nightmare, please direct your complaints to head office.’ Between us, we’re a bureaucracy nobody wants to penetrate.”
Molly finished the stale cookie, brushing crumbs from her tactical turtleneck. “Fine. We ghost in. We fill out a fake liability waiver for the bouncy castle. Then what? We just… hoist a seventy-pound, glitter-encrusted loot bag and waltz out?”
“Timing is the art of the steal,” Bill said, leaning in conspiratorially. “Phase two is a masterstroke of psychological manipulation. I’ve greased the wheels with the birthday girl’s mum. Told her the party needs a ‘pinnacle event.’ A grand competition.”
“A competition.” Molly’s voice was flat.
“A screaming contest. I told her it’s a new Montessori thing for articulating emotional boundaries. We get them all in a line, count to three, and measure the decibel level of their pure, unbridled joy. Did you know the average happy child-scream hits about ninety decibels? A dozen of them, operating in concert, will scramble an adult’s inner ear like an egg.” He mimed an explosion with his hands, fingers wiggling. “Sensory chaos. A beautiful, golden window of total confusion.”
Molly allowed herself a small, grim smile. “And while they’re all inhaling to rupture their own tonsils, we’re stuffing the plush mountain into the van disguised as a hyper-allergenic therapeutic pony party service?”
“Precisely. You grab the unicorn. I’ll get the heavy stuff, the LEGO Friends Heartlake City Hospital. That set has over eight hundred pieces and a fifty-dollar resale value, unopened. I’ve already cried twice today just thinking about its intricate interior design.” He adjusted his false mustache, which was now listing to one side like a furry caterpillar having a stroke. “We’re not thieves, Molly. We’re unappreciated joy redistribution specialists. These kids have a short-sighted, hoarding mentality. We’re broadening their emotional horizons through minor, temporary adversity.”
“That’s very noble,” Molly said, stuffing the napkin-blueprint into her pocket. “And when the fairy princess, who is almost certainly an off-duty dental hygienist with good reflexes, tackles you into the cake, what’s the contingency?”
Bill didn’t miss a beat. “The cake is a sponge fortress. I’m willing to take a buttercream hit for the team. It’s part of the plan’s beautiful texture.” He checked his watch, a chunky digital thing with a built-in stopwatch for timed heists. “We deploy in twenty. Synchronise our spirit animals. Mine’s a badger. Yours?”
Molly stood up, cracking her neck. “A very pragmatic vulture with a mortgage. Let’s go rob some toddlers.”
created_at: "2026-04-25T14:30:59.599Z" (4/25/2026, 2:30:59 PM)
model: "deepseek/deepseek-v4-pro-20260423"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1375
moderation_latency: (null)
generation_time: 27052
tokens_prompt: 77
tokens_completion: 1164
native_tokens_prompt: 67
native_tokens_completion: 1031
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.000926115
router: (null)
provider_responses:
  0:
    endpoint_id: "d084170b-dd99-4e53-9ae1-b63881f3cfbf"
    id: "4ac1c99d-b468-4045-b2c1-5b6529364687"
    is_byok: false
    latency: 280
    model_permaslug: "deepseek/deepseek-v4-pro-20260423"
    provider_name: "DeepSeek"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777127459-2LJvb4X6C9mKmDYoFHtA"
session_id: (null)
api_type: "completions"
id: "gen-1777127459-nWdJXObqbF7M7H9QV6wS"
upstream_id: "4ac1c99d-b468-4045-b2c1-5b6529364687"
total_cost: 0.000926115
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "DeepSeek"
response_cache_source_id: (null)
Evaluation details
Matches word count
Result: 0.00%
Details: n/a
Meta Data: neededClean = false, words = 689
Dialogue to Total Word Ratio
Result: 0.00%
Details: Ratio: 67.61%, Deviation: 37.61%
Meta Data: neededClean = false, wordsTotal = 707, wordsDialogue = 478
Overall
Result: 0.0000%
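The reported ratio and deviation follow from the word counts in the evaluator metadata (478 dialogue words out of 707 total, against the 30% target from the scenario). The Python sketch below reproduces that arithmetic. The function names are hypothetical, and the quote-based counter is only an assumption about how dialogue words might be identified; the page does not describe the evaluator's actual tokenization or how the deviation is mapped to the 0.00% score.

```python
import re

# Target taken from the scenario text: "30% dialogue".
TARGET_RATIO = 0.30


def ratio_and_deviation(words_dialogue, words_total, target=TARGET_RATIO):
    """Return (dialogue ratio, absolute deviation from the target ratio)."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


def count_dialogue_words(text):
    """Hypothetical counter: treats text inside curly double quotes as dialogue.

    Assumption only; the real evaluator may split words or detect
    dialogue differently.
    """
    spans = re.findall(r"“([^”]*)”", text)
    dialogue = sum(len(span.split()) for span in spans)
    total = len(text.split())
    return dialogue, total


# Counts reported in the evaluator metadata for this run.
ratio, deviation = ratio_and_deviation(478, 707)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 67.61%, Deviation: 37.61%"
```

The two evaluators report different totals (689 words versus 707), which suggests they count words with different rules; either way the response is well over the 500 words requested.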