NC Bench
Run: 2026-04-25-01KQ2FCAJ9D2X9M8AXRVDX8ZZ0
Run Details
Model: DeepSeek V4 Flash (Reasoning)
Test: Dialogue tags
Scenario: Write 200 words with 90% dialogue
Duration: 2m 52s
Cost: $0.000129
Provider: DeepInfra
Evaluation summary
Matches word count: 0.00%
Dialogue to Total Word Ratio: 99.48%
Overall: 49.7380%
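The overall figure is consistent with an unweighted mean of the two evaluator scores. The page does not state the aggregation rule, so the following is only a sketch under that assumption; the `scores` dictionary is a hypothetical container for the values displayed above.

```python
from statistics import mean

# Scores as displayed in the summary, in percent. The page rounds them,
# so the recomputed mean can differ from 49.7380 in the last digits.
scores = {
    "Matches word count": 0.00,
    "Dialogue to Total Word Ratio": 99.48,
}

# Assumption: the overall score is the plain average of all evaluator scores.
overall = mean(scores.values())
print(f"{overall:.2f}%")
```

The small gap between this result and the reported 49.7380% would be explained if the dialogue score were slightly below 99.48% before rounding, though the page does not confirm that.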
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
“Bill, you’re telling me we’re robbing a seven-year-old’s birthday party?” Molly adjusted her earpiece, testing the static. “Not robbing, Molly—strategically redistributing wrapped assets.” Bill unfolded a crudely drawn map of the community center. “The parents do drop-off at 2:15. We slip in via the kitchen, swap the gift bag with this identical decoy bag, and exit before the clown arrives.” “The clown. You mentioned a clown, Bill. Is the clown armed?” “Worse. He does balloon animals. Threat is minimal. I’ll be wearing my UPS uniform. You’re the catering helper.” “I look nothing like a catering helper. I look like someone who just stole a catering helper’s jacket.” “So act like you stole it. Improvise. The real target is the disco-themed bounce house. It’s a distraction.” “A distraction from what? The presents or the fact that we’re stealing from toddlers?” “Both. Timing is key. When the kids are doing the chicken dance, we empty the present table.” “And if one of those kids looks at me with those big, trusting eyes?” “You freeze and whisper ‘This is for the greater good’ and grab the Lego sets.” “Greater good. We’re stealing a My Little Pony playset.” “That playset scalps for eighty bucks online, Molly. Think of the principle.” “I’m thinking of the therapy bills.” She sighed. “Fine. But if I get face-painted, you owe me dinner.” “Deal. Now put on this wig. You’re ‘Carla from Snack Solutions.’” “I hate Carla from Snack Solutions. She has no pension.” “She has access to the gluten-free cupcakes. Go.”
created_at
"2026-04-25T14:06:00.269Z"
4/25/2026, 2:06:00 PM
model
"deepseek/deepseek-v4-flash-20260423"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
4667
moderation_latency
(null)
generation_time
172150
tokens_prompt
77
tokens_completion
497
native_tokens_prompt
67
native_tokens_completion
427
native_tokens_completion_images
(null)
native_tokens_reasoning
50
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.00012894
router
(null)
provider_responses
0
endpoint_id
"934a69f9-bd54-474b-beca-24560f721e12"
id
"e19648c7f3b347589ceafbe0829a4feb"
is_byok
false
latency
43
model_permaslug
"deepseek/deepseek-v4-flash-20260423"
provider_name
"DeepInfra"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1777125960-hlDlVKRr2wq3T3PE9AR0"
session_id
(null)
api_type
"completions"
id
"gen-1777125960-mNLxbevURA4Vz9YSNUhv"
upstream_id
"e19648c7f3b347589ceafbe0829a4feb"
total_cost
0.00012894
cache_discount
(null)
upstream_inference_cost
0
provider_name
"DeepInfra"
response_cache_source_id
(null)
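The `generation_time` value of 172150 appears to be in milliseconds, which would line up with the 2m 52s duration in the run details. A minimal sketch of that conversion, assuming milliseconds and truncation to whole seconds (the helper name `format_duration` is hypothetical):

```python
def format_duration(ms: int) -> str:
    """Render a millisecond count as 'Xm Ys', dropping fractional seconds."""
    total_seconds = ms // 1000          # assumption: the input is milliseconds
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes}m {seconds}s"

print(format_duration(172150))  # 2m 52s
```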
Evaluation details
Matches word count
Result: 0.00%
Details: n/a
Meta Data: neededClean: false, words: 254

Dialogue to Total Word Ratio
Result: 99.48%
Details: Ratio: 92.69%, Deviation: 2.69%
Meta Data: neededClean: false, wordsTotal: 260, wordsDialogue: 241

Overall
Result: 49.7380%
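The reported ratio and deviation follow from the metadata counts: 241 dialogue words out of 260 total gives 92.69%, which is 2.69 points away from the 90% target in the scenario. The sketch below reproduces that arithmetic. It also includes a hypothetical `count_dialogue_words` helper that treats text inside curly quotes as dialogue and splits on whitespace; the benchmark's actual tokenizer is not shown on this page, so that helper is an assumption and may not reproduce the 241 and 260 counts exactly. How the deviation maps to the 99.48% score is not stated either and is not modeled here.

```python
import re

TARGET_PERCENT = 90.0  # from the scenario: "90% dialogue"

def count_dialogue_words(text: str) -> tuple[int, int]:
    """Return (dialogue_words, total_words).

    Assumption: dialogue is any text between curly quotes, and a word is
    any whitespace-separated token.
    """
    quoted = re.findall(r"“([^”]*)”", text)
    dialogue = sum(len(span.split()) for span in quoted)
    total = len(text.split())
    return dialogue, total

def ratio_and_deviation(words_dialogue: int, words_total: int,
                        target: float = TARGET_PERCENT) -> tuple[float, float]:
    """Return the dialogue ratio and its absolute distance from the target, in percent."""
    ratio = 100.0 * words_dialogue / words_total
    return round(ratio, 2), round(abs(ratio - target), 2)

# Counts taken from the evaluator metadata above.
print(ratio_and_deviation(241, 260))  # (92.69, 2.69)
```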