NC Bench
Run: 2026-04-25-01KQ2EXCN7DT32EP80FXCZRF53
Run Details
Model: DeepSeek V4 Flash
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 20.0s
Cost: $0.000198
Provider: DeepSeek
Evaluation summary
Score | Evaluator
0.01% | Matches word count
0.02% | Dialogue to Total Word Ratio
0.0144% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
“I still don’t understand why we can’t just hit a bank,” Bill said, adjusting the pink party hat on his head. It had a tiny, bobbing unicorn horn.
“Because banks have alarms, Bill. Armed guards. Glass that screams when you look at it wrong,” Molly replied, holding a glittery invitation up to the dim light of their van. “What does a six-year-old’s birthday party have? A piñata full of cheap candy and an exhausted mother who has been drinking wine since noon.”
“But presents, Mol. From other six-year-olds. We’re talking maybe a box of LEGOs and some half-colored coloring books.”
“Targeted haul of the century. I’ve done recon. The birthday boy is named Brayden. His parents are influencer-level rich. The gift table is basically a small boutique for toddlers. One kid named Miles is bringing a gold-plated rattle. A rattle, Bill. For a baby who will never be born with a platinum spoon large enough.”
Bill squinted. “How do you know what Miles is bringing?”
“I have a fake mom account on the neighborhood app. I’ve been complimenting his mom’s ‘fruit tray design’ for three weeks. She thinks I’m kindred spirit named Brenda. I know the gift is wrapped in blue paper with yellow ducks.” Molly tapped the invitation. “We walk in as Uncle Porter and his friend from out of town. We do the limbo, we eat a single cupcake for alibi, and then we execute the ‘Bathroom Pass Gambit.’”
“The what?”
“I fake a call from a frantic husband. You fake a stomach ache. We duck into the hallway. I grab the presents. You block the door with your body. We’re out the back gate before Brayden’s mom realizes she’s out of gluten-free juice boxes.”
Bill sighed, looking at his reflection in the van’s side mirror. His bald head gleamed under the party hat’s elastic strap. “I look like a sad clown who lost his car keys.”
“You look like an uncle,” Molly said. “Uncles are sad. It’s part of the bit. Now, repeat the names.”
“The birthday boy is not Bradley. It is Brayden.”
“Good. Mother’s name?”
“Keighleigh.”
“Say it with a three-syllable groan.”
“Kee-ley,” Bill groaned, holding his stomach. “It sounds like a sneeze.”
“Perfect. The magician arrives at three. We have an eight-minute window while he sets up his sad rabbit. You’ll ask the dad where the spare toilet paper is. Dads love to give you toilet paper. It’s their only domestic win.”
Bill rubbed his temples. “And if we get caught?”
Molly smiled, pulling a pair of cheerful, oversized sunglasses from her pocket. “Then we’re just two very enthusiastic guests who got confused. We’ll cry. It’s a children’s party. Tears are part of the decor.”
“I hate you.”
“That’s the spirit. Now, let’s go steal a Furby that costs more than my rent.”
created_at: "2026-04-25T13:57:50.891Z" (4/25/2026, 1:57:50 PM)
model: "deepseek/deepseek-v4-flash-20260423"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 596
moderation_latency: (null)
generation_time: 9831
tokens_prompt: 77
tokens_completion: 760
native_tokens_prompt: 67
native_tokens_completion: 673
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.00019782
router: (null)
provider_responses:
  0:
    endpoint_id: "934a69f9-bd54-474b-beca-24560f721e12"
    is_byok: false
    latency: 55
    model_permaslug: "deepseek/deepseek-v4-flash-20260423"
    provider_name: "DeepInfra"
    status: (null)
  1:
    endpoint_id: "722e0746-2034-4d6e-a08f-e6271849ee9e"
    id: "8fbece9b-8d40-4938-a27a-2d7609f82b5d"
    is_byok: false
    latency: 272
    model_permaslug: "deepseek/deepseek-v4-flash-20260423"
    provider_name: "DeepSeek"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777125470-rzf3yyInffCpOIUNnec4"
session_id: (null)
api_type: "completions"
id: "gen-1777125470-fIQy3bcQT6qhBv2l2LF8"
upstream_id: "8fbece9b-8d40-4938-a27a-2d7609f82b5d"
total_cost: 0.00019782
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "DeepSeek"
response_cache_source_id: (null)
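The metadata above reports a generation time and a completion token count, which together suggest a rough output throughput. The sketch below works through that arithmetic. It assumes that generation_time is in milliseconds and that native_tokens_completion is the provider's own token count; the page states neither, and the variable names are illustrative only.

```python
# Values copied from the run metadata; the units are assumptions (see lead-in).
generation_time_ms = 9831        # generation_time, assumed to be milliseconds
native_tokens_completion = 673   # native_tokens_completion

# Rough output throughput in tokens per second.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(f"{tokens_per_second:.1f} tokens/s")
```

If generation_time is in a different unit, the figure scales accordingly, so read it as an estimate rather than a measured benchmark.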
Evaluation details
Result | Evaluator | Details | Meta Data
0.01% | Matches word count | n/a | neededClean: false, words: 469
0.02% | Dialogue to Total Word Ratio | Ratio: 83.09%, Deviation: 13.09% | neededClean: false, wordsTotal: 479, wordsDialogue: 398
0.0144% | Overall
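The reported ratio and deviation can be reproduced from the wordsDialogue and wordsTotal values in the evaluation details, together with the 70% target from the scenario. The sketch below does that arithmetic. It also includes a hypothetical helper, count_dialogue_words, that counts words inside curly quotes. That helper is only an assumption about how such an evaluator might tokenize dialogue, not NC Bench's actual implementation, and it does not cover how the percentage scores are derived.

```python
import re

TARGET_RATIO = 0.70  # from the scenario: "70% dialogue"


def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Fraction of all words that fall inside dialogue."""
    return words_dialogue / words_total


def count_dialogue_words(text: str) -> tuple[int, int]:
    """Hypothetical counter: return (dialogue_words, total_words).

    Treats text between curly double quotes as dialogue and splits on
    whitespace. The real evaluator may tokenize differently.
    """
    total = len(text.split())
    quoted = re.findall(r"“([^”]*)”", text)
    dialogue = sum(len(q.split()) for q in quoted)
    return dialogue, total


# Values reported in the evaluation details.
ratio = dialogue_ratio(398, 479)
deviation = ratio - TARGET_RATIO
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 83.09%, Deviation: 13.09%"

# Tiny usage example for the hypothetical counter.
sample = "“Hi there,” Bill said. “Go.”"
print(count_dialogue_words(sample))  # prints "(3, 5)"
```

On these numbers the response overshoots the dialogue target by about 13 percentage points. The deviation is taken here as the simple difference from the 70% target, which matches the reported figure.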