NC Bench
Run: 2026-04-25-01KQ2EX9ECE56VXBTCJH9108G2
Run Details
Model: DeepSeek V4 Pro (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 13m 45s
Cost: $0.107829
Provider: Together
Evaluation summary
Score      Evaluator
86.38%     Matches word count
89.94%     Dialogue to Total Word Ratio
88.1595%   Overall
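The page does not say how the Overall figure is derived. One reading consistent with the numbers is an unweighted mean of the two evaluator scores; the sketch below checks that assumption. The function name `overall_score` is hypothetical, and the small gap between 88.16 and the reported 88.1595 would be explained if the site averages unrounded per-evaluator scores.

```python
from statistics import fmean


def overall_score(scores: list[float]) -> float:
    """Unweighted mean of evaluator scores (an assumed aggregation rule)."""
    return fmean(scores)


# Rounded per-evaluator scores as displayed in the summary.
displayed_scores = [86.38, 89.94]
overall = overall_score(displayed_scores)

print(f"mean of displayed scores: {overall:.2f}%")
print(f"difference from reported 88.1595%: {abs(overall - 88.1595):.4f}")
```

If the evaluators were weighted differently, the mean of the displayed scores would not land this close to the reported value, but two data points cannot rule out other schemes.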
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The basement of the church was lit by flickering fluorescent lights that buzzed like a swarm of angry bees. The air smelled of stale juice and vanilla icing. Molly and Bill crouched behind a stack of folding chairs, peering into the party room through a crack in the door. Streamers hung limply from the ceiling, and a table overflowed with presents wrapped in princess paper and superhero themes. The sound of children singing "Happy Birthday" echoed off the concrete walls, their voices slightly off-key. Molly's eyes gleamed with greed as she scanned the room. Molly whispered, "You see that pile of gifts? It's glistening with potential. Each box is at least twenty bucks on eBay." Bill adjusted his fake mustache. "Twenty bucks? I'm risking my parole for twenty bucks? There better be an iPad in there or I'm out." "There's a tablet in the largest box. I checked the registry. But we need to wait for the cake. That's when the kids will be distracted, their mouths full of frosting." "Frosting is the enemy, Molly. It gets everywhere. Remember the donut job? I'm still finding sprinkles in my coat pockets." "Sprinkles are a hazard of our profession. Stay focused on the prize." "Focused? I'm focusing on keeping my hands clean. Stickiness slows down the heist." "This is different. The cake is a three-tier castle with jam filling. It'll keep them busy for ten minutes. That's our window." Bill sighed, rubbing his temples. "Ten minutes? In and out? With my luck, the sugar will kick in and they'll become a swarm of hyperactive candy zombies." "That's why I have the smoke bomb. It's a non-toxic fog that creates a calming atmosphere. They'll be too mellow to notice us." "Non-toxic? You said that about the 'sleeping gas' at the pet shop. We woke up three days later with a room full of angry puppies." Molly smirked. "That was a miscalculation. This smoke is just water vapor and a dash of lavender. It's basically aromatherapy." "Fine. But I'm wearing a gas mask under my clown wig. 
I'm not breathing that stuff." "Deal. Now, on my signal, we dash to the present pile. You grab the tablet, I'll take the rest." "What if the kids start opening presents early? Some kids are impatient." "Then we adapt. You create a diversion with a balloon animal, and I'll snatch the goods." "I can't make balloon animals. I can't even blow one up without getting dizzy." "Then just juggle. Anything to keep their eyes off the presents." Bill shook his head. "I'm starting to think this is a bad idea. Stealing from kids is a new low, even for us." Molly grabbed his arm. "It's not stealing, it's redistribution of joy. These kids have too many toys anyway. Now, let's go." They crept toward the party, the children's laughter growing louder. Bill's bear suit squeaked with each step, a sound he desperately tried to muffle. Molly clutched the smoke bomb in her hand, ready to unleash calmness. The party was just feet away, the present pile within reach.
created_at: "2026-04-25T13:57:47.599Z" (4/25/2026, 1:57:47 PM)
model: "deepseek/deepseek-v4-pro-20260423"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1866
moderation_latency: (null)
generation_time: 823895
tokens_prompt: 77
tokens_completion: 21914
native_tokens_prompt: 146
native_tokens_completion: 24437
native_tokens_completion_images: (null)
native_tokens_reasoning: 21142
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.1078294
router: (null)
provider_responses:
  0:
    endpoint_id: "d084170b-dd99-4e53-9ae1-b63881f3cfbf"
    is_byok: false
    latency: 265
    model_permaslug: "deepseek/deepseek-v4-pro-20260423"
    provider_name: "DeepSeek"
    status: 429
  1:
    endpoint_id: "13bda129-f8a6-4a60-b1db-98082e5f7ccd"
    id: "ogZVYVn-6z2FuN-9f1ddbde41ce8775"
    is_byok: false
    latency: 1866
    model_permaslug: "deepseek/deepseek-v4-pro-20260423"
    provider_name: "Together"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777125467-LpbikVkmKeBIFRlElONG"
session_id: (null)
api_type: "completions"
id: "gen-1777125467-k46RPLrdFUFN2LKWSATb"
upstream_id: "ogZVYVn-6z2FuN-9f1ddbde41ce8775"
total_cost: 0.1078294
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Together"
response_cache_source_id: (null)
Evaluation details
Result     Evaluator                      Details                           Meta Data
86.38%     Matches word count             n/a                               neededClean: false, words: 511
89.94%     Dialogue to Total Word Ratio   Ratio: 68.54%, Deviation: 1.46%   neededClean: false, wordsTotal: 515, wordsDialogue: 353
88.1595%   Overall
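The ratio and deviation reported by the Dialogue to Total Word Ratio evaluator can be reproduced from its own counts. The sketch below is not the evaluator's implementation: it assumes dialogue means text between straight double quotes, that words are whitespace-separated, and that the target is the 70% named in the scenario. The names `dialogue_stats` and `ratio_from_counts` are hypothetical.

```python
import re


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (an assumed definition of 'word')."""
    return len(text.split())


def dialogue_stats(text: str) -> tuple[int, int, float]:
    """Return (total words, dialogue words, dialogue ratio) for a passage."""
    spoken = re.findall(r'"([^"]*)"', text)  # spans between straight double quotes
    words_dialogue = sum(count_words(s) for s in spoken)
    words_total = count_words(text)
    ratio = words_dialogue / words_total if words_total else 0.0
    return words_total, words_dialogue, ratio


def ratio_from_counts(words_dialogue: int, words_total: int,
                      target: float = 0.70) -> tuple[float, float]:
    """Return (ratio %, absolute deviation from target %), rounded to 2 places."""
    ratio = words_dialogue / words_total
    return round(ratio * 100, 2), round(abs(target - ratio) * 100, 2)


# Counts taken from the evaluator's Meta Data column.
print(ratio_from_counts(353, 515))

# Tiny usage example on a one-line passage.
print(dialogue_stats('Bill sighed. "Ten minutes?" he said.'))
```

A quote-based counter like this would also treat quoted non-speech, such as the song title "Happy Birthday" in the response, as dialogue, so its counts on the full text may not match the evaluator's 353 exactly.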