NC Bench
Run: 2026-04-25-01KQ2J8Q8HK1TQA443C6SGQB80
Run Details
Model: DeepSeek V4 Pro (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 18m 9s
Cost: $0.131224
Provider: Together
Evaluation summary
Score | Evaluator
99.74% | Matches word count
99.02% | Dialogue to Total Word Ratio
99.3827% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
In the dimly lit van, Molly peered through binoculars at the house, its windows festooned with cartoon banners. Bill sat beside her, nervously twisting a party hat into a spiral. "Timing is everything," Molly said. "We hit during the magician's act. Kids are hypnotized by cheap tricks, parents zoned out from the punch." Bill frowned. "Punch? I thought it was just cake and screaming. Last time, a kid called me a 'weird man,' and I had to swap my fake mustache." Molly gestured with the binoculars. "That's why we wear costumes. Clowns are perfect. We fill our shoes with presents and waltz out." "No clowns," Bill said, shaking his head. "The squeaking shoes will give us away. And I hate sticky face paint." "Fine," Molly said. "What about balloon twisters? We can distract them with animals." "I untwist, not twist," Bill replied. "And face paint gives me hives. I'll sneeze all over the kids—stealthy, that is not." Molly tapped the steering wheel. "So we go as relatives. Every family has a weird aunt and uncle. We bring a big empty box labeled 'Surprise' and swap it." Bill groaned. "Our plan is a box that says 'Surprise'? Kids will rip it open and bust us." "Not if it's full of rubber ducks," she said. "They'll be confused long enough to work." "Where do we get a thousand rubber ducks?" Bill asked. "And I can't carry presents; my back is shot." "Relax," Molly said. "We'll use a dolly if we have to. But you're wearing the inflatable muscle suit for effect." Bill shuddered. "The muscle suit? That thing makes me look like a deformed superhero. I'll sweat through it in seconds." "We park by the back gate and use the laundry chute," Molly explained. "You go in, I lower down the goods." Bill's eyes widened. "Me? Inside? With children? They'll corner me with sticky hands and questions. They're like tiny detectives." "You'll be Uncle Bumpy, the terrible magician," Molly said. "You entertain while I work." "Me? Magic?" Bill sputtered. 
"Last card trick, I lost the ace in my underwear for weeks." Molly sighed. "So tell bad jokes instead. Kids love groaning." Bill perked up. "Bad jokes I can do. But escape? Kids are fast and bitey." "We set off the smoke alarm during cake," she said. "Chaos, and we slip out." "Acting responsible?" Bill said, picking at his hat. "That's the hard part." "Just remember, Uncle Bumpy, the code word is 'sugar crash,'" Molly said. "When you hear it, make for the van." "Sugar crash," Bill repeated. "Got it. And if the kids chase us?" Molly started the van, the engine humming as Bill held the deflated muscle suit. Outside, a clown car honked. She pictured Uncle Bumpy, terrible magician, sweating through bad jokes. Bill grimaced, feeling the suit's plastic weight and the thousand rubber ducks in the back. The van joined the traffic, another vehicle heading to the party, but with a trunk full of mischief.
created_at: "2026-04-25T14:56:27.925Z" (4/25/2026, 2:56:27 PM)
model: "deepseek/deepseek-v4-pro-20260423"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 340
moderation_latency: (null)
generation_time: 1088650
tokens_prompt: 77
tokens_completion: 24477
native_tokens_prompt: 146
native_tokens_completion: 29754
native_tokens_completion_images: (null)
native_tokens_reasoning: 23721
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.1312242
router: (null)
provider_responses: 0
endpoint_id: "13bda129-f8a6-4a60-b1db-98082e5f7ccd"
id: "ogZoau1-6z2FuN-9f1e31cec02ec52c"
is_byok: false
latency: 340
model_permaslug: "deepseek/deepseek-v4-pro-20260423"
provider_name: "Together"
status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777128987-xVxGsKMmB3mjP5iBH82R"
session_id: (null)
api_type: "completions"
id: "gen-1777128987-iGAr43ByRGhUvf7UVq2g"
upstream_id: "ogZoau1-6z2FuN-9f1e31cec02ec52c"
total_cost: 0.1312242
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Together"
response_cache_source_id: (null)
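The metadata above can be sanity-checked against the run details. The sketch below assumes that generation_time is reported in milliseconds and that native_tokens_reasoning is counted as part of native_tokens_completion. Neither assumption is stated in this report, and format_duration is a hypothetical helper, not part of NC Bench or the provider's API.

```python
def format_duration(ms: int) -> str:
    """Format a millisecond count as 'Xm Ys', rounded to the nearest second."""
    total_seconds = round(ms / 1000)
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes}m {seconds}s"


# Values copied from the metadata in this report.
generation_time_ms = 1088650      # assumed unit: milliseconds
native_tokens_completion = 29754
native_tokens_reasoning = 23721   # assumed to be a subset of the completion tokens

duration = format_duration(generation_time_ms)
reasoning_share = native_tokens_reasoning / native_tokens_completion

print(duration)                   # 18m 9s
print(f"{reasoning_share:.1%}")   # 79.7%
```

Under these assumptions the generation time is consistent with the reported duration of 18m 9s, and roughly four fifths of the native completion tokens were spent on reasoning rather than on the 496-word scene itself.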
Evaluation details
Result | Evaluator | Details | Meta Data
99.74% | Matches word count | n/a | neededClean: false, words: 496
99.02% | Dialogue to Total Word Ratio | Ratio: 69.56%, Deviation: 0.44% | neededClean: false, wordsTotal: 496, wordsDialogue: 345
99.3827% | Overall
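The ratio and deviation reported by the dialogue evaluator can be reproduced from the word counts. The sketch below is a minimal check, assuming the deviation is the absolute gap in percentage points between the measured ratio and the 70% target from the prompt. The function name dialogue_ratio_report is hypothetical. The mapping from deviation to the 99.02% score is not shown in this report, so it is not reproduced here.

```python
def dialogue_ratio_report(words_dialogue: int, words_total: int,
                          target_pct: float = 70.0) -> tuple[float, float]:
    """Return (ratio_pct, deviation_pct), each rounded to two decimals.

    Assumption: deviation is the absolute difference, in percentage
    points, between the measured dialogue ratio and the target.
    """
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    ratio_pct = 100.0 * words_dialogue / words_total
    deviation_pct = abs(target_pct - ratio_pct)
    return round(ratio_pct, 2), round(deviation_pct, 2)


# wordsDialogue and wordsTotal as reported in the evaluation details.
ratio, deviation = dialogue_ratio_report(345, 496)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
# Ratio: 69.56%, Deviation: 0.44%
```

The overall figure is also consistent with a plain average of the two evaluator scores: (99.74 + 99.02) / 2 = 99.38, which matches 99.3827% to two decimals. The exact aggregation and the unrounded inputs are not given in the report.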