NC Bench
Run: 2026-04-25-01KQ2DSJMPRRE5F7M1RTM56V01
Run Details
Model: DeepSeek V4 Pro (Reasoning)
Test: Dialogue tags
Scenario: Write 200 words with 90% dialogue
Duration: 13.6s
Cost: $0.000522
Provider: DeepSeek
Evaluation summary
Score      Evaluator
0.00%      Matches word count
36.79%     Dialogue to Total Word Ratio
18.3940%   Overall
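The Overall figure is consistent with an unweighted mean of the two evaluator scores. The page does not state the aggregation rule, so the sketch below is an inference from the displayed numbers, and the function name `overall_score` is hypothetical.

```python
# Evaluator scores as displayed in the summary, expressed as fractions.
scores = {
    "Matches word count": 0.0,
    "Dialogue to Total Word Ratio": 0.3679,
}


def overall_score(evaluator_scores):
    """Unweighted mean of the evaluator scores (assumed aggregation rule)."""
    return sum(evaluator_scores.values()) / len(evaluator_scores)


overall = overall_score(scores)
print(f"Overall: {overall:.4%}")
```

Because the summary shows the dialogue score rounded to 36.79%, the mean computed from it differs from the reported 18.3940% in the last displayed digit. The reported value implies an unrounded dialogue score of about 36.788%.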
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
“Bill, we’re not just waltzing into a six-year-old’s unicorn party with a sack.” “Why not? I’ve got a glitter beard, they’ll think I’m a disgraced wizard. You’re the long-lost aunt with carpal tunnel from wrapping too many gifts. We hug, we grab, we waddle out.” “Their security is a bouncy castle. You’ll get motion sickness and confess everything.” “I’ll take a Dramamine. Now, the presents: pile them in the 'Magic Recycling Van'?” “It’s a 2003 Corolla with a muffler that screams. No magic. We need a distraction.” “I hired a pony.” “You hired a pony.” “A devious one. He’s going to trot through the cake, they’ll all chase him, and we’ll yell 'present inspectors' and stuff the loot into piñatas.” “They’re filled with candy, Bill.” “That’s why we bring our own empty piñatas. Look, I’ve drawn a schematic on this napkin. That’s you, that’s me, these are the screaming toddlers. The X is the mountain of toys.” “You’ve illustrated my nervous breakdown with a purple crayon.” “It’s lavender. It denotes stealth. Now, the grab-and-go: I’ll snatch the trampoline-wrapped tablet, you handle the deluxe chemistry set. In and out in five minutes, before the clown starts twisting balloon animals.” “Barry the Balloon Guy? We can’t cross Barry. He knows my mother.” “Fine. We’ll cut him in for a helium tank. He’ll make us poodle alibis. Are you in?” “Only if I get the top bunk in the getaway minivan.”
created_at: "2026-04-25T13:38:17.395Z" (4/25/2026, 1:38:17 PM)
model: "deepseek/deepseek-v4-pro-20260423"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 956
moderation_latency: (null)
generation_time: 13532
tokens_prompt: 77
tokens_completion: 596
native_tokens_prompt: 67
native_tokens_completion: 566
native_tokens_completion_images: (null)
native_tokens_reasoning: 172
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.000521565
router: (null)
provider_responses:
  0:
    endpoint_id: "d084170b-dd99-4e53-9ae1-b63881f3cfbf"
    id: "5ee371ae-d18d-4495-99c8-b9c3a5e8805a"
    is_byok: false
    latency: 277
    model_permaslug: "deepseek/deepseek-v4-pro-20260423"
    provider_name: "DeepSeek"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777124297-ilB5CAsdqFD1U8aTtkCt"
session_id: (null)
api_type: "completions"
id: "gen-1777124297-8K7QraAittiJpQ0Af8Cu"
upstream_id: "5ee371ae-d18d-4495-99c8-b9c3a5e8805a"
total_cost: 0.000521565
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "DeepSeek"
response_cache_source_id: (null)
Evaluation details
Result: 0.00%
Evaluator: Matches word count
Details: n/a
Meta Data: neededClean: false, words: 238

Result: 36.79%
Evaluator: Dialogue to Total Word Ratio
Details: Ratio: 100.00%, Deviation: 10.00%
Meta Data: neededClean: false, wordsTotal: 244, wordsDialogue: 244

Overall: 18.3940%
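The Ratio and Deviation values follow from the word counts: 244 dialogue words out of 244 total gives 100%, which is 10 percentage points away from the 90% requested in the scenario. A minimal sketch of that arithmetic is below. The page does not show how the evaluator identifies dialogue or splits words, so the curly-quote matching, the whitespace tokenization, and the names `dialogue_word_ratio` and `ratio_deviation` are all assumptions made for illustration.

```python
import re

TARGET_RATIO = 0.90  # from the scenario: "90% dialogue"


def dialogue_word_ratio(text):
    """Return (dialogue_words, total_words, ratio) for a piece of prose.

    Assumption: dialogue is any span enclosed in curly quotes, and words
    are whitespace-separated tokens. The real evaluator may differ.
    """
    total_words = len(text.split())
    quoted_spans = re.findall(r"“[^”]*”", text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


def ratio_deviation(ratio, target=TARGET_RATIO):
    """Absolute distance between the measured ratio and the target."""
    return abs(ratio - target)


# Small illustrative sample (not the benchmark response).
sample = "“We need a distraction.” Bill shrugged. “I hired a pony.”"
dialogue, total, ratio = dialogue_word_ratio(sample)
print(dialogue, total, f"{ratio:.2%}")

# Values reported for this run: wordsDialogue = 244, wordsTotal = 244.
run_ratio = 244 / 244
print(f"Ratio: {run_ratio:.2%}, Deviation: {ratio_deviation(run_ratio):.2%}")
```

How the 10% deviation maps to the 36.79% score is not stated on the page, so the sketch stops at the deviation.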