NC Bench
Run: 2026-04-25-01KQ2FTTN4GBQHWZE8CAP2JB51
Run Details
Model: DeepSeek V4 Flash
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 11.4s
Cost: $0.000211
Provider: DeepSeek
Evaluation summary
Score | Evaluator
0.00% | Matches word count
0.00% | Dialogue to Total Word Ratio
0.0004% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
“You’re sure about this, Bill? Because my last ‘sure thing’ involved a ferret, a trampoline, and a very angry nun.” “First of all, the ferret was your idea. Second, this is a seven-year-old’s birthday party. The only security is a man dressed as a pirate who’s allergic to his own beard.” “I saw the pirate. He’s got a hook. And a peg leg.” “It’s a prop, Molly. He’s an accountant from Luton named Nigel. He gets winded from blowing up balloons.” “And the presents? You’ve confirmed the haul?” “A brand-new Xbox, a full LEGO Star Destroyer, a gift card to Build-a-Bear worth forty quid, and the crown jewel: a limited-edition, hand-painted My Little Pony that does a pooping sound when you squeeze its stomach. Retail value? Two hundred. Emotional value? Priceless to a small, screaming human.” “So we’re stealing from children. In front of other children. During a magic show.” “The magician is also a mime. He’s the worst of both worlds. He’ll be too busy weeping silently in the corner to notice. We go in during ‘Pin the Tail on the Donkey.’ Nigel is blindfolded. The parents are on their third glass of warm prosecco.” “And our exit strategy? You said you had a van.” “I have a Ford Fiesta with a clown horn I can’t disable. But it’s a fast clown horn. And I’ve rigged a diversion.” “What kind of diversion? Please don’t say a dozen rabid badgers.” “Better. I’ve trained a seagull to steal the birthday cake. It’s a very specific seagull. I found it at a chippy.” “You trained a seagull. With money. From the last job.” “It’s an investment. The cake is on a table next to the presents. The seagull grabs the cake, chaos erupts, we grab the bags, and we’re out before the birthday girl finishes her first meltdown.” “And what’s my part in this seagull-based symphony?” “You’re the look-out. And the moral support. And you’re wearing the inflatable Unicorn costume.” “Excuse me?” “It’s a kids’ party. You’re a roadworker. No, you’re a unicorn. We have to blend in. 
I’ll be the balloon animal guy.” “You can’t make balloon animals.” “I can make a balloon sword. And a very flaccid giraffe. It adds to my character. Look, this is the score of a lifetime. We’re not taking the pony for the money. We’re taking it for the *principle*.” “The principle of what?” “Taking things that defecate to sound effects. It’s art. Are you in or out?” A long pause. A sigh. “Fine. But if that seagull poops on my unicorn horn, I’m using the My Little Pony to send it to the big pie shop in the sky.” “That’s the spirit. Now, help me inflate the hind legs. The pump is in the glovebox, next to the bag of herring.”
created_at: "2026-04-25T14:13:55.497Z" (4/25/2026, 2:13:55 PM)
model: "deepseek/deepseek-v4-flash-20260423"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 793
moderation_latency: (null)
generation_time: 11295
tokens_prompt: 77
tokens_completion: 755
native_tokens_prompt: 67
native_tokens_completion: 719
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0002107
router: (null)
provider_responses [0]:
  endpoint_id: "722e0746-2034-4d6e-a08f-e6271849ee9e"
  id: "bb75b9fb-d01d-4d8b-a2a5-1c089d881065"
  is_byok: false
  latency: 355
  model_permaslug: "deepseek/deepseek-v4-flash-20260423"
  provider_name: "DeepSeek"
  status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777126435-DMWXa1rYls8H1IvSOFY7"
session_id: (null)
api_type: "completions"
id: "gen-1777126435-ZpbjFPU0nvOkyujXsPGY"
upstream_id: "bb75b9fb-d01d-4d8b-a2a5-1c089d881065"
total_cost: 0.0002107
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "DeepSeek"
response_cache_source_id: (null)
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false; words: 467
0.00% | Dialogue to Total Word Ratio | Ratio: 98.95%, Deviation: 28.95% | neededClean: false; wordsTotal: 476; wordsDialogue: 471
0.0004% | Overall
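
The reported ratio and deviation can be reproduced from the evaluator metadata. The sketch below is an illustration, not NC Bench's actual evaluator: the function names are hypothetical, and it assumes that dialogue means text inside curly double quotes, that words are split on whitespace, and that deviation is the absolute distance from the 70% target. Under those assumptions the 5-word gap between wordsTotal (476) and wordsDialogue (471) is consistent with the response's only narration, "A long pause. A sigh."

```python
import re


def count_total_words(text: str) -> int:
    # Assumption: a word is any whitespace-separated token.
    return len(text.split())


def count_dialogue_words(text: str) -> int:
    # Assumption: dialogue is whatever sits between curly double quotes.
    spans = re.findall(r"“([^”]*)”", text)
    return sum(len(span.split()) for span in spans)


def ratio_and_deviation(words_dialogue: int, words_total: int, target: float = 0.70):
    # Ratio of dialogue words to all words, and its distance from the target.
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


# Small excerpt of the response: 6 dialogue words out of 11 total.
sample = "“Are you in or out?” A long pause. A sigh. “Fine.”"
print(count_dialogue_words(sample), count_total_words(sample))  # 6 11

# Figures taken from the evaluator metadata above.
ratio, deviation = ratio_and_deviation(471, 476)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# Ratio: 98.95%, Deviation: 28.95%
```

How the evaluator turns a 28.95% deviation into a 0.00% score is not shown on this page, so the sketch stops at the deviation.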