NC Bench
Run: 2026-04-25-01KQ2EEPRT3SERYQZ5904JXSEB
Run Details
Model: DeepSeek V4 Flash (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 12.7s
Cost: $0.000245
Provider: DeepSeek
Evaluation summary
| Score | Evaluator |
|---|---|
| 90.48% | Matches word count |
| 0.00% | Dialogue to Total Word Ratio |
| 45.2419% | Overall |
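The overall score looks like the unweighted mean of the two evaluator scores. The page does not state this; it is an inference from the numbers shown. The sketch below only checks that arithmetic, using the rounded scores from the summary.

```python
# Scores as displayed in the evaluation summary (percent).
scores = {
    "Matches word count": 90.48,
    "Dialogue to Total Word Ratio": 0.00,
}

# Assumption: overall = unweighted mean of the evaluator scores.
overall = sum(scores.values()) / len(scores)
print(f"{overall:.2f}%")
```

The result agrees with the reported 45.2419% to two decimal places. The small difference is consistent with the word-count score being rounded for display.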
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
“I’m just saying, Bill, the logistics are a nightmare.” Molly spread the party invitation across the greasy diner table. It was crumpled and stained with ketchup, but the embossed gold lettering still gleamed. “Clowns. Bouncy castle. Balloon animals. And a five-year-old named Bartholomew.” Bill squinted. “Bartholomew? That’s a heist name if I ever heard one.” “Focus. The presents are in the ‘gift grotto’—a tent in the backyard. Parents drop them off at the door. We need a distraction.” “I can do a distraction.” Bill puffed up his chest. “I’ll pretend to be the birthday clown.” “You’re terrified of children.” “I’m not terrified. I’m… strategically avoidant. They’re sticky. And loud. And they ask questions.” “Like what?” “‘Why is your nose bleeding?’” Molly pinched the bridge of her own nose. “We need a clean in-and-out. I’ll go as the balloon animal lady. You’ll be my assistant.” “What does an assistant do?” “Hold the balloons. Look busy. Don’t make eye contact with any child longer than three seconds.” Bill nodded slowly. “I can do that. But what about the unicorn piñata?” Molly’s eyes narrowed. “What about it?” “It’s full of candy. Real candy. Not the cheap stuff. Miniature chocolate bars.” “We’re here for the presents, Bill. The presents. Wrapped boxes. Possibly containing gift cards or small electronics.” “Or candy,” Bill muttered. “No. The piñata is a side hustle. We stick to the plan. Ready? I’ll distract the parents with a balloon sword, you use the magician’s disappearing cloak—that I borrowed from my cousin—to scoop the gifts into the duffel bag.” “I don’t think the cloak actually makes you invisible. It’s just a black sheet with a picture of a sad wizard on it.” “It’s the principle of the thing. People don’t look at sad wizards. They look away out of pity. By the time they look back, we’re gone.” Bill traced the edge of the invitation with his finger. 
“And if a child sees me?” “Then you say the magic words.” “Which are?” “‘Would you like a balloon sword?’ and then you hand them one. No one suspects the balloon sword guy.” “Balloon sword gal. I’m the assistant.” “Fine. Balloon sword gal. You have a point. But what if Bartholomew recognizes me?” Molly sighed. “You’ve never met Bartholomew.” “He might have seen my face on a wanted poster. I have a certain look.” “You have a certain look of a man who hasn’t slept in three days because he’s been planning a heist on a kindergartener’s birthday party. Now, are you in or out?” Bill stared at the invitation. The ketchup stain had begun to look like a tiny clown face. “In. But I’m keeping the piñata candy.” “Fine. But if you get caught, I’m telling the police you’re the balloon animal lady.” “Deal.” He held out his hand. She shook it. “Now,” Molly said, pulling out a parking lot map, “we case the backyard at 3:47 PM precisely. The parents will be drunk on fruit punch by then.” “I love fruit punch.” “That’s the spirit. Let’s go steal some dolls.”
created_at: "2026-04-25T13:49:49.726Z" (4/25/2026, 1:49:49 PM)
model: "deepseek/deepseek-v4-flash-20260423"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 532
moderation_latency: (null)
generation_time: 12582
tokens_prompt: 77
tokens_completion: 947
native_tokens_prompt: 67
native_tokens_completion: 840
native_tokens_completion_images: (null)
native_tokens_reasoning: 59
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.00024458
router: (null)
provider_responses:
  0:
    endpoint_id: "722e0746-2034-4d6e-a08f-e6271849ee9e"
    id: "33f5926f-a533-45c1-a659-875a894753b8"
    is_byok: false
    latency: 219
    model_permaslug: "deepseek/deepseek-v4-flash-20260423"
    provider_name: "DeepSeek"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777124989-TWfSUlPjNEfXiV3bG7cm"
session_id: (null)
api_type: "completions"
id: "gen-1777124989-wlPw9rwMw8GSLWP1QCPe"
upstream_id: "33f5926f-a533-45c1-a659-875a894753b8"
total_cost: 0.00024458
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "DeepSeek"
response_cache_source_id: (null)
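The metadata gives enough to estimate generation throughput. The sketch below divides native_tokens_completion by generation_time. It assumes generation_time is in milliseconds, which the page does not state but which fits the 12.7s duration reported in the run details.

```python
# Values copied from the generation metadata above.
generation_time_ms = 12582        # assumption: generation_time is in milliseconds
native_tokens_completion = 840

# Throughput over the generation phase only (excludes the initial latency).
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)
print(f"{tokens_per_second:.1f} tokens/s")
```

This comes to roughly 67 native completion tokens per second for this single run, so it is one data point rather than a general figure for the model.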
Evaluation details
| Result | Evaluator | Details | Meta Data |
|---|---|---|---|
| 90.48% | Matches word count | n/a | neededClean: false; words: 510 |
| 0.00% | Dialogue to Total Word Ratio | Ratio: 82.10%, Deviation: 32.10% | neededClean: false; wordsTotal: 514; wordsDialogue: 422 |
| 45.2419% | Overall | | |
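The ratio and deviation reported for the dialogue evaluator can be reproduced from its word counts and the 50% target in the scenario. This is a minimal sketch of that arithmetic only. How the evaluator counts dialogue words, and how it maps a deviation to a score, is not shown on the page and is not modeled here.

```python
# Word counts from the "Dialogue to Total Word Ratio" meta data.
words_total = 514
words_dialogue = 422
target_ratio = 0.50  # from the scenario: "50% dialogue"

ratio = words_dialogue / words_total
deviation = abs(ratio - target_ratio)

print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
```

Both figures match the reported 82.10% and 32.10%. The two evaluators report different totals for the same response (510 and 514 words), which suggests they count words slightly differently.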