NC Bench
Run: 2026-04-25-01KQ2E3GK2QVVP3PVZ2DQ97F4A
Run Details
Model: DeepSeek V4 Flash (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 18.7s
Cost: $0.000249
Provider: DeepSeek
Evaluation summary
Score | Evaluator
14.30% | Matches word count
0.00% | Dialogue to Total Word Ratio
7.1507% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly thumbed the edge of the party invitation, a glossy card with a cartoon unicorn shooting rainbows from its horn. “I’m telling you, Bill, this is the score of a lifetime.” Bill squinted at the address. “It’s a six-year-old’s birthday party. The only thing of value in there is a bouncy castle and a piñata filled with regret.” “Regret doesn’t buy new ski masks. Look at the registry.” Molly tapped the back of the card. “Gift bags. Each one has a gold-plated token, a stuffed animal, and a gift card to The Magic Bean. Those kids are loaded.” “Loaded with sugar and delusions. I’m not crashing a toddler’s tea party just to swipe a plush llama with googly eyes.” “It’s a plush llama with diamond-studded googly eyes, Bill. The Glitz & Giggles party planners charge four hundred a head. These parents are hedge fund managers. They don’t buy off-brand.” Bill sighed, rubbing his balding scalp. “And how exactly do we pull this off? We’re two middle-aged cat burglars with a shared sciatica. I can’t outrun a kid on a sugar high.” “We don’t outrun them. We outsmart them.” Molly unfolded a hand-drawn map of the Fancy Pants Event Center. “Clown arrives at two. Magic show at two-thirty. Present opening at three. We hit during the balloon animal wrap-up. Every kid is clutching a poodle made of latex. Parents are distracted, taking photos for Instagram. We sweep the gift table, swap the presents with these decoy boxes filled with rocks, and we’re out the back exit before the candles are lit.” Bill stared at the map. “Rocks? You want to replace a pile of presents with rocks?” “It’s called misdirection. They won’t notice until they open them at home. By then, we’re long gone, selling the gift cards on the dark web.” “The dark web doesn’t take Starbucks gift cards, Molly.” “The Magic Bean is artisan espresso. It’s high-end. Now focus. I need you on clown distraction. You’re going to pretend your car is being towed. Loudly. Make a scene.” “I have to pretend to be angry about a tow truck? 
That’s my natural state. Fine. What’s your fake identity?” Molly produced a bright pink wig and a sequined skirt from her duffel bag. “I’m Auntie Sparkles, the birthday party photographer. I’ll have a camera with a wide-angle lens. No one questions a woman with expensive equipment.” Bill squinted. “That’s… alarmingly well thought out. Have you done this before?” “Once. A Bar Mitzvah in Scarsdale. Got away with a set of silver candlesticks and a leather-bound Torah. Don’t look at me like that. It was a reproduction.” “A reproduction Torah. I’m the brains of this operation? We’re going to jail.” “Jail is a temporary setback. A lifetime supply of artisan coffee is forever. Now, do we have a plan or do I go solo?” Bill took a deep breath. “Fine. But when a six-year-old cries because her rainbow unicorn is now a rock, you’re the one carrying the guilt.” “Guilt has no place in a heist, Bill. Only presents. And presents are the children’s love language. We’re just translating it into profit.”
created_at: "2026-04-25T13:43:42.95Z" (4/25/2026, 1:43:42 PM)
model: "deepseek/deepseek-v4-flash-20260423"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 998
moderation_latency: (null)
generation_time: 16889
tokens_prompt: 77
tokens_completion: 935
native_tokens_prompt: 67
native_tokens_completion: 857
native_tokens_completion_images: (null)
native_tokens_reasoning: 92
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.00024934
router: (null)
provider_responses:
  0:
    endpoint_id: "722e0746-2034-4d6e-a08f-e6271849ee9e"
    id: "382f7a92-dd7e-4e57-becd-a2317d094fee"
    is_byok: false
    latency: 420
    model_permaslug: "deepseek/deepseek-v4-flash-20260423"
    provider_name: "DeepSeek"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777124622-mxT1GDdj7NtEpwZVmqQ4"
session_id: (null)
api_type: "completions"
id: "gen-1777124622-RDLzNEMt3sJGJuzcZ3ij"
upstream_id: "382f7a92-dd7e-4e57-becd-a2317d094fee"
total_cost: 0.00024934
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "DeepSeek"
response_cache_source_id: (null)
Evaluation details
Result | Evaluator | Details | Meta Data
14.30% | Matches word count | n/a | neededClean: false; words: 521
0.00% | Dialogue to Total Word Ratio | Ratio: 85.79%, Deviation: 35.79% | neededClean: false; wordsTotal: 535; wordsDialogue: 459
7.1507% | Overall
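The ratio and deviation reported for the Dialogue to Total Word Ratio evaluator can be checked by hand from wordsDialogue and wordsTotal. The Python sketch below reproduces that arithmetic and also shows one plausible way such a count might be obtained from a response. The function names, the curly-quote matching rule, and the whitespace word split are all assumptions made for illustration; the page does not describe how NC Bench actually tokenizes or scores, so this is not the evaluator's implementation.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio).

    Assumption: dialogue is any text between curly quotes, and a word is
    any whitespace-separated token. The real evaluator may differ.
    """
    total_words = len(text.split())
    quoted_spans = re.findall(r"“([^”]*)”", text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


def deviation_from_target(ratio: float, target: float = 0.5) -> float:
    """Absolute distance between the observed ratio and the 50% target."""
    return abs(ratio - target)


# Figures taken from the evaluation details reported above.
WORDS_DIALOGUE = 459
WORDS_TOTAL = 535

reported_ratio = WORDS_DIALOGUE / WORDS_TOTAL
reported_deviation = deviation_from_target(reported_ratio)
print(f"Ratio: {reported_ratio:.2%}, Deviation: {reported_deviation:.2%}")

# Small illustration of the counting helper on a two-sentence sample.
sample = "Bill sighed. “We are going to jail.”"
print(dialogue_ratio(sample)[:2])
```

With the reported counts, 459 / 535 gives a ratio of 85.79% and a deviation of 35.79% from the 50% target, which matches the Details column. The overall score of 7.1507% is consistent with an unweighted mean of the two evaluator scores, although the page does not state how the overall figure is derived.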