NC Bench
Run: 2026-04-25-01KQ2ENFWVKGFZ55FZKCSQ7RKQ
Run Details
Model: DeepSeek V4 Pro (Reasoning)
Test: Dialogue tags
Scenario: Write 200 words with 10% dialogue
Duration: 15.3s
Cost: $0.000435
Provider: DeepSeek
Evaluation summary
Score | Evaluator | Details Link
99.74% | Matches word count | (details)
99.68% | Dialogue to Total Word Ratio | (details)
99.7130% | Overall |
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 10% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly unfolded a crinkly napkin blueprint of the bouncy castle layout onto the park bench, using a stolen juice box to weigh down one corner. Bill squinted at the crayon-drawn X where the gift mountain would sit, his fake municipal clown shoes squeaking with every nervous shuffle. She tapped a frosting-smeared finger on the designated “Punch Bowl Blind Spot,” her expression deadly serious beneath a rainbow wig. “We strike during the piñata's third rotation.” Bill winced, adjusting his rainbow suspenders. “What if a kid cries?” Molly hushed him, scanning for toddlers. “That's the beauty. Maximum chaos.” She then produced a pair of industrial-strength elastic bands from her oversized daisy pocket, miming how they’d catapult the plush unicorn decoys over the fence to distract the birthday girl’s overprotective Labrador. Bill practiced his solemn face-paint-smearing scream, a diversion he called “The Giraffe of Despair.” Their getaway vehicle, a heavily modified stroller with a false-bottomed diaper bag, idled behind the shrubbery. Every wrapped box would vanish into a kaleidoscopic tumble of tissue paper and grabby hands, theirs being the greediest. The plan hinged on blending in, all giggles and honks, before making their squeaky escape down the gravel path.
created_at: "2026-04-25T13:53:32.063Z" (4/25/2026, 1:53:32 PM)
model: "deepseek/deepseek-v4-pro-20260423"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1395
moderation_latency: (null)
generation_time: 15253
tokens_prompt: 77
tokens_completion: 532
native_tokens_prompt: 67
native_tokens_completion: 467
native_tokens_completion_images: (null)
native_tokens_reasoning: 183
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.000435435
router: (null)
provider_responses:
  0:
    endpoint_id: "d084170b-dd99-4e53-9ae1-b63881f3cfbf"
    id: "d959d45b-def4-465f-8ffb-e3c7e0764e3e"
    is_byok: false
    latency: 283
    model_permaslug: "deepseek/deepseek-v4-pro-20260423"
    provider_name: "DeepSeek"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777125212-lJiVGNW2TGHdtCP9YzO7"
session_id: (null)
api_type: "completions"
id: "gen-1777125212-KVMq6zj0d6y6HOQQNgVV"
upstream_id: "d959d45b-def4-465f-8ffb-e3c7e0764e3e"
total_cost: 0.000435435
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "DeepSeek"
response_cache_source_id: (null)
Evaluation details
Result | Evaluator | Details | Meta Data
99.74% | Matches word count | n/a | neededClean: false, words: 196
99.68% | Dialogue to Total Word Ratio | Ratio: 12.38%, Deviation: 2.38% | neededClean: false, wordsTotal: 202, wordsDialogue: 25
99.7130% | Overall | |
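The ratio and deviation reported for the dialogue evaluator follow directly from its metadata: 25 dialogue words out of 202 total is about 12.38%, which is 2.38 percentage points above the 10% target. The Python sketch below reproduces that arithmetic. It also includes one possible way to count dialogue words, under the assumption that "dialogue" means any text inside double quotes. The evaluator's actual counting rule and its scoring formula are not shown in this report, so the function names and the regular expression are illustrative only.

```python
import re

# Assumption: "dialogue" is any text between double quotes (curly or straight).
# The benchmark's real evaluator is not shown in this report.
QUOTED_SPAN = re.compile(r'[“"]([^”"]*)[”"]')


def count_words(text: str) -> int:
    """Count whitespace-separated tokens."""
    return len(text.split())


def count_quoted_words(text: str) -> int:
    """Count the words that fall inside double-quoted spans."""
    return sum(count_words(span) for span in QUOTED_SPAN.findall(text))


def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Return dialogue words as a percentage of all words."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return 100.0 * words_dialogue / words_total


def deviation(ratio_pct: float, target_pct: float) -> float:
    """Return the absolute distance from the target, in percentage points."""
    return abs(ratio_pct - target_pct)


# Values taken from the evaluation metadata; the 10% target is from the scenario.
ratio = dialogue_ratio(words_dialogue=25, words_total=202)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation(ratio, 10.0):.2f}%")
# prints "Ratio: 12.38%, Deviation: 2.38%"
```

Counting every double-quoted span in the response above, including the two quoted labels ("Punch Bowl Blind Spot" and "The Giraffe of Despair"), gives 25 words, which is consistent with the reported wordsDialogue value, though that does not confirm how the evaluator actually counts.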