NC Bench
Run: 2026-02-16-01KHM5FKWZG7C4WWRQK0XMS4KM
Run Details
Model:
DeepSeek V3 (2025-03-24)
Test:
Dialogue tags
Scenario:
Write 500 words with 50% dialogue
Duration:
11.6s
Cost:
$0.000669
Provider:
SiliconFlow
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The storage unit smelled of dust and old Christmas decorations, which seemed fitting given the plan. Molly adjusted her ski mask—pink, because why not—and leaned over the blueprint of the McAllister estate. Bill, his own mask crooked over one ear, squinted at the paper like it had personally offended him. "This is a kid’s birthday party, not Fort Knox," Molly said, tapping the drawing. "We just need to grab the loot and go." Bill snorted. "Tell that to the three-foot-tall security team armed with juice boxes and sticky fingers. Kids notice *everything*." He jabbed a finger at the layout. "We gotta time it right. Pin the tail on the donkey? Distraction. Piñata? Chaos. Cake? That’s our window." Molly rolled her eyes. "You’re overcomplicating it. We dress as clowns, blend in, and stuff the presents into oversized balloon animals." "Clowns?!" Bill recoiled. "You want us to *voluntarily* wear clown shoes? Those things are death traps. I’d rather wrestle the family dog." "Fine. No shoes. But the clown makeup stays—it’s the perfect disguise." Bill groaned, rubbing his temples. "Remind me why we’re robbing a six-year-old’s party again?" Molly grinned. "Because nobody guards the good stuff at a kid’s party. iPads, designer toys, cold hard cash from rich aunties—all sitting under a sad banner that says ‘Happy Birthday, Timmy.’" "That’s messed up," Bill said. Then, after a pause, "What if Timmy cries?" "You’re getting soft." "I’m not soft. I’m *professionally considerate*." Molly sighed. "Look, we leave behind the stuffed animals and crayons. Fair?" Bill nodded. "That’s acceptable banditry." "Good. Now, here’s the real problem," she said, pointing to a scribbled note. "McAllister’s got a *magician* booked. Saw him at another gig—guy’s a menace with disappearing acts." "So?" "So what if he makes *us* disappear mid-heist?" Bill blinked. "That’s… actually a valid concern." "Thank you." They stared at the blueprint in silence. Then Bill smirked. "We turn it on him. 
Swap his deck with marked cards. When he pulls a ‘random’ volunteer, it’s *you*. Then, during the big trick—" "—I ‘vanish’ with the presents," Molly finished. "Genius. And when the cops ask, we were just part of the show." "See? I’m not just a pretty face in a crooked mask." Molly smirked. "Alright, clown. Let’s go ruin a kid’s birthday." Bill saluted with a whoopee cushion he’d pulled from nowhere. The heist was on.
created_at
"2026-02-16T21:25:23.624074+00:00"
2/16/2026, 9:25:23 PM
model
"deepseek/deepseek-chat-v3-0324"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1416
moderation_latency
(null)
generation_time
11477
tokens_prompt
77
tokens_completion
681
native_tokens_prompt
66
native_tokens_completion
653
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.0006695
router
(null)
provider_responses
0
id
"019c6857d0709fcc474604180685b1a0"
status
200
is_byok
false
latency
1416
endpoint_id
"d562908f-c5eb-4a32-9a91-24d53b176fd8"
provider_name
"SiliconFlow"
model_permaslug
"deepseek/deepseek-chat-v3-0324"
api_type
"completions"
id
"gen-1771277111-A8IeSJTZC9zDJT8IjgB8"
upstream_id
"019c6857d0709fcc474604180685b1a0"
total_cost
0.0006695
cache_discount
(null)
upstream_inference_cost
0
provider_name
"SiliconFlow"
Evaluation details
Result | Evaluator | Details | Meta Data
0.0000% | Matches word count | n/a | neededClean: false; words: 401
0.0000% | Dialogue to Total Word Ratio | Ratio: 68.08%, Deviation: 18.08% | neededClean: false; wordsTotal: 401; wordsDialogue: 273
0.0000%
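The reported ratio and deviation follow directly from the two word counts in the meta data (273 dialogue words out of 401 total, measured against the 50% target in the scenario). The sketch below reproduces that arithmetic. The page does not show how the evaluator actually counts dialogue words, so the `dialogue_ratio` helper is a hypothetical approximation: it assumes dialogue is whatever sits between straight double quotes and that words are split on whitespace. It is not guaranteed to yield 273 for the response above.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio).

    Assumption (not confirmed by the benchmark page): dialogue is the text
    between straight double quotes, and words are whitespace-separated.
    """
    total = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue / total if total else 0.0
    return dialogue, total, ratio


def deviation_from_target(ratio: float, target: float = 0.5) -> float:
    """Absolute distance between the measured ratio and the target ratio."""
    return abs(ratio - target)


# Reproduce the figures reported in the evaluation details,
# using the counts from the meta data: wordsDialogue=273, wordsTotal=401.
reported_ratio = 273 / 401
print(
    f"Ratio: {reported_ratio:.2%}, "
    f"Deviation: {deviation_from_target(reported_ratio):.2%}"
)
```

With the reported counts this prints `Ratio: 68.08%, Deviation: 18.08%`, matching the Details column. The word count check is separate: the response has 401 words against a requested 500.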