NC Bench
Run: 2026-02-16-01KHM5T6N7P513PEKS4BY85MN0
Run Details
Model: DeepSeek V3 (2024-12-26)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 24.1s
Cost: $0.000923
Provider: Novita
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The basement of Molly’s apartment smelled like stale pizza and bad life choices. Bill sat on a folding chair, hunched over a hand-drawn map of the neighborhood, while Molly paced, her boots scuffing against the concrete floor.
“Alright, genius,” Molly said, tossing a crumpled party invitation onto the table. “How do we pull this off without getting tackled by a bunch of sugar-crazed six-year-olds?”
Bill squinted at the map, tapping a finger on a poorly drawn stick-figure house. “Simple. We go in as clowns. No one questions clowns.”
Molly stopped pacing. “Clowns? You’re kidding.”
“Dead serious. Clowns are like ninjas at kids’ parties. Invisible menace.”
“Bill, I once saw you cry because a pigeon stole your sandwich. You’re not clown material.”
He waved her off. “Irrelevant. The point is, we blend in, grab the loot, and bounce before anyone notices.”
Molly sighed and rubbed her temples. “Fine. But if one of those little gremlins asks you to twist a balloon animal, we’re screwed.”
Bill grinned. “Already thought of that. I’ll just hand them a deflated balloon and say, ‘It’s modern art.’”
Molly groaned. “This is why we don’t get invited to parties.”
Bill ignored her, unfolding a second sheet of paper—a crude diagram of the target house. “Okay, so presents will be piled near the cake. We’ll need a distraction.”
“What kind?”
“I was thinking… a piñata emergency.”
Molly stared. “A what?”
“We rig the piñata to explode. Kids lose their minds, parents panic, and while everyone’s screaming, we grab the goods.”
“That’s the dumbest thing I’ve ever heard.”
“It’s genius!”
“It’s arson.”
Bill scoffed. “It’s *controlled* chaos.”
Molly pinched the bridge of her nose. “We are *not* setting a piñata on fire.”
“Fine, fine. Backup plan: we hire a pony.”
“A pony.”
“Yeah. Kids love ponies. We show up with one, cause a stampede, and—”
“Where, exactly, are we getting a pony?”
Bill hesitated. “...Petting zoo?”
Molly threw her hands up. “I’m surrounded by madness.”
“Look, do you have a better idea?”
She paused, then smirked. “Actually, yeah. We go as magicians.”
Bill blinked. “Magicians?”
“Yeah. ‘Pick a present, any present!’ And then—poof—they’re all gone.”
Bill considered it. “That’s… not terrible.”
“I know. And if we get caught, we just say it’s part of the act.”
Bill grinned. “Molly, you devious genius.”
She shrugged. “Just don’t screw up the disappearing part.”
“No promises.”
The two clinked their coffee mugs together in a toast. The heist was on.
created_at: "2026-02-16T21:31:22.970927+00:00" (2/16/2026, 9:31:22 PM)
model: "deepseek/deepseek-chat-v3"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1210
moderation_latency: (null)
generation_time: 24033
tokens_prompt: 77
tokens_completion: 775
native_tokens_prompt: 66
native_tokens_completion: 690
native_tokens_completion_images: 0
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0009234
router: (null)
provider_responses:
  0:
    id: "18390f69f7469e5120cfbd0d946282d1"
    status: 200
    is_byok: false
    latency: 1210
    endpoint_id: "cefb5a1e-9fea-4496-9a64-4e1b11a7cc8c"
    provider_name: "Novita"
    model_permaslug: "deepseek/deepseek-chat-v3"
    api_type: "completions"
id: "gen-1771277458-nd1JX417QtvlvN1c303c"
upstream_id: "18390f69f7469e5120cfbd0d946282d1"
total_cost: 0.0009234
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Novita"
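The Meta fields can be combined into a rough throughput figure. The sketch below assumes that generation_time is in milliseconds (it is consistent with the 24.1s Duration shown in Run Details) and uses native_tokens_completion as the output token count. The function name is hypothetical and not part of any NC Bench or provider API.

```python
def tokens_per_second(completion_tokens: int, generation_time_ms: int) -> float:
    """Approximate output throughput; assumes generation_time is in milliseconds."""
    return completion_tokens / (generation_time_ms / 1000)


# Values copied from the Meta fields of this run.
rate = tokens_per_second(completion_tokens=690, generation_time_ms=24033)
print(f"{rate:.1f} tokens/s")  # prints "28.7 tokens/s"
```

Under those assumptions, this run produced roughly 28.7 native completion tokens per second.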
Evaluation details
Result | Evaluator | Details | Meta Data
0.0000% | Matches word count | n/a | neededClean: false, words: 416
0.0000% | Dialogue to Total Word Ratio | Ratio: 66.11%, Deviation: 36.11% | neededClean: false, wordsTotal: 416, wordsDialogue: 275
0.0000%
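The reported ratio and deviation follow directly from the two word counts: 275 / 416 ≈ 66.11%, which is 36.11 percentage points above the 30% target. The sketch below reproduces that arithmetic and shows one possible way to count dialogue words. It is not NC Bench's evaluator: the function names are hypothetical, and it assumes that dialogue means any text inside straight or curly double quotes and that words are whitespace-separated tokens.

```python
import re

# Assumption: "dialogue" is text inside straight or curly double quotes.
# The benchmark's real evaluator may clean or tokenize text differently.
DIALOGUE_RE = re.compile(r'[“"]([^“”"]*)[”"]')


def count_words(text: str) -> int:
    """Count whitespace-separated tokens."""
    return len(text.split())


def ratio_stats(words_total: int, words_dialogue: int, target: float = 0.30) -> dict:
    """Dialogue ratio and absolute deviation from the target, both in percent."""
    ratio = words_dialogue / words_total if words_total else 0.0
    return {
        "ratio_pct": round(ratio * 100, 2),
        "deviation_pct": round(abs(ratio - target) * 100, 2),
    }


def dialogue_stats(text: str, target: float = 0.30) -> dict:
    """Count total and quoted words in a scene, then derive the ratio stats."""
    words_total = count_words(text)
    words_dialogue = sum(count_words(span) for span in DIALOGUE_RE.findall(text))
    return {
        "wordsTotal": words_total,
        "wordsDialogue": words_dialogue,
        **ratio_stats(words_total, words_dialogue, target),
    }


# Counts copied from the evaluation meta data of this run.
print(ratio_stats(416, 275))  # {'ratio_pct': 66.11, 'deviation_pct': 36.11}
```

Calling `dialogue_stats` on a scene gives counts under this sketch's quoting rule only; they need not match the evaluator's 416 and 275 exactly.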