NC Bench
Run: 2026-04-25-01KQ2HNC3ADDDWB2F25M562V3W
Run Details
Model: DeepSeek V4 Flash (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 12.8s
Cost: $0.000257
Provider: DeepSeek
Evaluation summary
Score    Evaluator                     Details Link
1.04%    Matches word count            (details)
2.60%    Dialogue to Total Word Ratio  (details)
1.8167%  Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly sat on a pile of flattened cardboard boxes, a party hat tilted rakishly on her head. Bill was pacing in the narrow space behind the dumpster, his arms full of noisemakers. “I’m telling you, Bill, the layout is perfect,” Molly said, tapping a crudely drawn map on a napkin. “The bounce house blocks the view from the kitchen, and the piñata is strictly a diversion.” Bill stopped pacing. “A diversion for what? To distract the five-year-olds? They’re already distracted. They’re on sugar. They have the attention span of a gnat on a sugar high.” “Exactly,” Molly said, grinning. “But the parents are the problem. They see everything. Especially Cheryl, the mom with the gluten-free cupcakes. She’s got eyes in the back of her head.” Bill held up a noisemaker. “So I blow this, and everyone looks at me. Then you swoop in from the hedge.” “From the hedge?” Molly scoffed. “That hedge is full of thorns. I’m not ruining my new jumpsuit for a My Little Pony mega-stable. No, I’ll be coming from the trampoline. I’ll use the cover of a ‘cannonball’ competition.” “You’re going to jump on a trampoline?” Bill’s voice cracked. “Molly, you’re forty-two. Your knees sound like bubble wrap.” “My knees are fine. I’ve been stretching. Look.” Molly stood and attempted a lunge. Her right knee popped loud enough to startle a pigeon. She winced. “Okay, fine. Plan B. You create a distraction at the cupcake table. I’ll belly-crawl under the buffet tent. The gifts are stacked on a plastic table near the clown.” “The clown?” Bill’s eyes went wide. “You didn’t mention a clown. I have a thing about clowns. It’s not funny. It’s a genuine phobia.” “It’s a balloon clown, Bill. Larry the Balloon Guy. He’s harmless. He’s wearing a rainbow wig.” “Rainbow wigs are even worse. They mean he’s happy. Happy clowns are the most unhinged. They smile while they deflate your soul.” Molly sighed. “Fine. I’ll handle Larry. You handle the parents. 
All you have to do is accidentally spill the fruit punch on Cheryl’s white jeans.” “That’s assault,” Bill said, pointing a noisemaker at her like a gun. “That’s premeditated laundry destruction.” “It’s a strategic maneuver,” Molly corrected. “While she’s scrubbing, I grab the gift bags. The big one with the unicorn wrapping paper is our target. Inside: a limited-edition singing Pikachu. Online it’s going for four hundred.” Bill’s eyes lit up. “Four hundred?” “And a gift card to Build-a-Bear. Twenty bucks.” “Okay, I’m in. But I’m not touching the clown. I’ll do the punch spill. That’s my line.” “Deal,” Molly said, rolling up the napkin map. “We go at 2:15. Right after the ‘Happy Birthday’ song. Everyone is crying or clapping. No one’s watching the trampoline.” “And if we get caught?” Molly smiled, adjusting her party hat. “Then we tell them we’re the entertainment. Uncle Molly and Uncle Bill. Show up, do a terrible magic trick, leave with presents. It’s a win-win.” Bill stared at her. “You don’t know any magic tricks.” “I know one. I can make a Pikachu disappear.” “That’s stealing.” “Exactly. Now get your game face on. We’ve got a birthday to crash.”
created_at: "2026-04-25T14:45:53.902Z" (4/25/2026, 2:45:53 PM)
model: "deepseek/deepseek-v4-flash-20260423"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 509
moderation_latency: (null)
generation_time: 12525
tokens_prompt: 77
tokens_completion: 952
native_tokens_prompt: 67
native_tokens_completion: 885
native_tokens_completion_images: (null)
native_tokens_reasoning: 51
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.00025718
router: (null)
provider_responses:
  0:
    endpoint_id: "722e0746-2034-4d6e-a08f-e6271849ee9e"
    id: "9024ef92-0bdf-4a34-8303-5176c8fbfde2"
    is_byok: false
    latency: 278
    model_permaslug: "deepseek/deepseek-v4-flash-20260423"
    provider_name: "DeepSeek"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777128354-qdHhpBIFlt6PiWCtXOTh"
session_id: (null)
api_type: "completions"
id: "gen-1777128354-nPeLfvlIbBgqa3PBPLRu"
upstream_id: "9024ef92-0bdf-4a34-8303-5176c8fbfde2"
total_cost: 0.00025718
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "DeepSeek"
response_cache_source_id: (null)
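The timing and token fields above can be combined into a rough throughput figure. The following is a minimal sketch, not part of the benchmark itself. It assumes that `latency` and `generation_time` are reported in milliseconds, which the page does not state. The helper names `tokens_per_second` and `end_to_end_ms` are hypothetical.

```python
# Values copied from the run metadata above.
LATENCY_MS = 509                  # assumed unit: milliseconds
GENERATION_TIME_MS = 12525        # assumed unit: milliseconds
NATIVE_TOKENS_COMPLETION = 885


def tokens_per_second(tokens: int, generation_time_ms: float) -> float:
    """Completion tokens generated per second of generation time."""
    return tokens / (generation_time_ms / 1000.0)


def end_to_end_ms(latency_ms: float, generation_time_ms: float) -> float:
    """Sum of first-token latency and generation time (a simplifying assumption)."""
    return latency_ms + generation_time_ms


throughput = tokens_per_second(NATIVE_TOKENS_COMPLETION, GENERATION_TIME_MS)
total_ms = end_to_end_ms(LATENCY_MS, GENERATION_TIME_MS)
print(f"Throughput: {throughput:.1f} tokens/s")
print(f"Latency + generation time: {total_ms:.0f} ms")
```

Whether the sum of the two fields corresponds to the Duration shown under Run Details is not documented on the page, so the second figure should be read as an approximation only.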
Evaluation details
Result   Evaluator                     Details                          Meta Data
1.04%    Matches word count            n/a                              neededClean: false, words: 526
2.60%    Dialogue to Total Word Ratio  Ratio: 78.54%, Deviation: 8.54%  neededClean: false, wordsTotal: 536, wordsDialogue: 421
1.8167%  Overall
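The ratio and deviation reported for the Dialogue to Total Word Ratio evaluator can be reproduced from the `wordsDialogue` and `wordsTotal` counts. The sketch below is an illustration only: the 70% target comes from the scenario, but the function names are hypothetical, and the way `dialogue_stats` counts words (whitespace splitting, with dialogue taken as text inside curly double quotes) is an assumption that may differ from the evaluator's actual tokenization.

```python
import re


def ratio_and_deviation(words_dialogue: int, words_total: int,
                        target: float = 0.70) -> tuple[float, float]:
    """Return the dialogue share and its absolute distance from the target."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


def dialogue_stats(text: str) -> tuple[int, int]:
    """Count (dialogue words, total words) in a text.

    Assumption: dialogue is whatever sits between curly double quotes,
    and a word is any whitespace-separated token.
    """
    words_total = len(text.split())
    quoted_spans = re.findall(r"“([^”]*)”", text)
    words_dialogue = sum(len(span.split()) for span in quoted_spans)
    return words_dialogue, words_total


# Counts taken from the evaluation details above.
ratio, deviation = ratio_and_deviation(421, 536)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 78.54%, Deviation: 8.54%"
```

The page does not explain how the deviation is converted into the 2.60% score, so that step is left out here.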