NC Bench
Run: 2026-04-25-01KQ2FVC2P050XATN9WX4HESP0
Run Details
Model: DeepSeek V4 Flash
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 11.0s
Cost: $0.000223
Provider: DeepSeek
Evaluation summary
Score | Evaluator
20.19% | Matches word count
0.00% | Dialogue to Total Word Ratio
10.0948% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
You sure about this, Molly? Bill shifted his weight on the curb, peering at the glitter-festooned house across the street. It looked less like a fortress and more like a unicorn had sneezed. Molly, checking the lockpicks in her jacket, rolled her eyes. Bill, a child’s birthday party. High stakes? No. High security? Negative. High sugar content? Absolutely. The presents will be sitting in a pile by the bounce house, unguarded. Unattended, maybe. But we’re dealing with a demographic that can scream loud enough to pop an eardrum. We’ll be in and out before the candle-lighting song. That’s the golden window. Trust me. Trust you? Last time you said we were stealing a Fabergé egg from a silent auction. It was a Fabergé egg-shaped bouncy ball from a pet store grand opening. And we sold it for forty bucks. Profit is profit. Forty bucks and a three-day headache from a screeching cockatoo. What’s the plan for the children? Tranquilizer darts? Don’t be dramatic. We wear these vests. She held up a pair of bright, puffy vests with cartoon frogs on the front. Entertainer vests. The parents will think we’re part of the clown act. We walk in, we parade around, we gather the loot for the ‘grand prize game.’ And if the real clown shows up? Then the clown has a very bad day. We’re on the guest list—sort of. I heard the mother, Mrs. Patterson, talk about her cousin’s juggling friend. She’s expecting a tall, goofy man and a short, grumpy woman. That’s us. Grumpy? I’m not grumpy. I’m professionally skeptical. What if a kid tries to hug us? Embrace it. Rustle their hair. Call them a little champion. Then, when they toddle off for a juice box, we stuff the gift bags. It’s an art, Bill. It’s a felony with sprinkles. And the presents. We’re just stealing from children. They’re mostly duplicates. How many plastic dinosaurs does one kid need? We’re culling the herd. We’re a necessary evil. A necessary evil in a vest with a frog wearing a monocle. Fine. What’s the exit strategy? 
We’ll be the first people to leave the party. We smile, wave, grab the big red wagon full of loot, and push it straight to the van. We say ‘Thank you for having us, we had a ball!’ We had a ball. Very punny. And the getaway? The van is behind the hedge. I’ve already disabled the neighbor’s Ring doorbell with a focused jammer. A focused jammer. For a children’s party. For the principle of the thing, Bill. Never leave a loose end. Now look alive. I see a minivan pulling up. That’s the first wave of sugar fiends. Bill sighed, pulling the vest over his thrift-store jacket. Alright. But if I have to make balloon animals, I’m blaming you. And I only know how to make a wiener dog. Perfect. That’s a comedy classic. Now, let’s go make some memories. And by memories, I mean merchandise. Molly strolled across the street, and Bill followed, muttering, Memories. I’ll give them memories. Of me, sweating in a frog vest, stealing their Easy-Bake Oven.
created_at: "2026-04-25T14:14:13.338Z" (4/25/2026, 2:14:13 PM)
model: "deepseek/deepseek-v4-flash-20260423"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 534
moderation_latency: (null)
generation_time: 10863
tokens_prompt: 77
tokens_completion: 799
native_tokens_prompt: 67
native_tokens_completion: 763
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.00022302
router: (null)
provider_responses
0
endpoint_id: "722e0746-2034-4d6e-a08f-e6271849ee9e"
id: "677afa53-4555-4e23-894c-9e31480780dc"
is_byok: false
latency: 270
model_permaslug: "deepseek/deepseek-v4-flash-20260423"
provider_name: "DeepSeek"
status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777126453-qIyOeg6Z288tBx8aGSmq"
session_id: (null)
api_type: "completions"
id: "gen-1777126453-lSiImeWNX88jaiFc2LQj"
upstream_id: "677afa53-4555-4e23-894c-9e31480780dc"
total_cost: 0.00022302
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "DeepSeek"
response_cache_source_id: (null)
Evaluation details
Result | Evaluator | Details | Meta Data
20.19% | Matches word count | n/a | neededClean: false, words: 520
0.00% | Dialogue to Total Word Ratio | Ratio: 2.28%, Deviation: 67.72% | neededClean: false, wordsTotal: 526, wordsDialogue: 12
10.0948% | Overall
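The reported figures can be cross-checked with a few lines of arithmetic. The sketch below is not the benchmark's evaluator. It assumes that the ratio is wordsDialogue divided by wordsTotal, that the deviation is the absolute distance from the 70% target in the scenario, and that Overall is the plain mean of the two evaluator scores. The helper `count_quoted_words` is hypothetical: it assumes dialogue is detected only as text inside paired curly single quotes, which is one possible explanation for a count of 12 on a response that omits dialogue quotation marks.

```python
import re

TARGET_RATIO = 0.70  # from the scenario: "Write 500 words with 70% dialogue"


def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Fraction of all words that were counted as dialogue."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return words_dialogue / words_total


def deviation(ratio: float, target: float = TARGET_RATIO) -> float:
    """Absolute distance between the measured ratio and the target (assumed formula)."""
    return abs(target - ratio)


def count_quoted_words(text: str) -> int:
    """Hypothetical detector: count words inside paired curly single quotes only."""
    spans = re.findall(r"‘([^’]*)’", text)
    return sum(len(span.split()) for span in spans)


# Values taken from the evaluation details above.
ratio = dialogue_ratio(12, 526)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation(ratio):.2%}")

# Assumption: Overall is the unweighted mean of the two evaluator scores.
overall = (20.19 + 0.00) / 2
print(f"Overall (from rounded scores): {overall:.3f}%")

# The only two quoted spans found in the model response.
sample = (
    "we gather the loot for the ‘grand prize game.’ "
    "We say ‘Thank you for having us, we had a ball!’"
)
print("Quoted words in sample:", count_quoted_words(sample))
```

Under these assumptions the ratio and deviation reproduce the values in the Details column, and the mean of the rounded scores agrees with the displayed Overall to within rounding. How the word count score of 20.19% is derived from 520 words is not stated on the page, so it is not reproduced here.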