NC Bench
Run: 2026-02-16-01KHM5PKKRGCJA3HPQ30YK9V7V
Run Details
Model: DeepSeek V3.1
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 55.1s
Cost: $0.000651
Provider: SambaNova
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The warehouse loft stank of stale coffee and failed ambitions. Dust motes danced in the single, slanting beam of light that cut across the large table. In the centre of that table, meticulously arranged with the gravity of a jewel heist, was the target: a large, brightly-coloured, hand-drawn map of a suburban backyard.
Molly tapped a perfectly manicured finger on a square labelled ‘BOUNCE CASTLE’. “This is the primary choke point,” she stated, her voice low and serious. “The sugar-high will have peaked by 14:30 hours. The tiny, chaotic operatives will be converging on this structure. It’s the perfect diversion.”
Bill stared at the map, his brow furrowed in a way that suggested immense physical strain. He was a man built for smashing windows, not navigating past a piñata. He pointed a thick, hesitant finger at a smaller, crudely drawn circle. “The… dog,” he said, the word sounding foreign and dangerous in his mouth.
Molly sighed, a long-suffering sound she reserved exclusively for Bill’s profound lack of finesse. “That’s not a dog, Bill. That’s a garden gnome. The Schnauzer’s operational radius is limited to the sun porch, here.” She pointed to a different area. “Its weakness is those tiny, expensive, organic dog biscuits. A handful of those, and he’s neutralized. A non-factor.”
“Right. The biscuits.” Bill shifted his weight, making the rickety wooden chair groan in protest. “And the… extraction? The loot’s gonna be bulky. We talking a full-size sleigh, or what?”
“The vehicle is a non-standard, low-profile model. A red wagon.” Molly held up a photograph of a Radio Flyer. “High capacity, manoeuvrable, and it blends perfectly with the environment. Our exit is via the side gate, here, during the ‘Happy Birthday’ chorus. The acoustics will provide full audio cover.”
Bill squinted, trying to keep up. “Right. The song. But what about the parents? The tall ones. They’re the real guards, aren’t they?”
“The guardians are predictable,” Molly said, unfolding a second schematic—a timeline of the party’s events. “They operate on a strict schedule of nostalgia and slight regret. Phase One: Arrival chaos. Phase Two: Pizza. They will be immobilised by carbohydrates and small talk. Phase Three: Cake. This is our window. Their focus will be entirely on the cake, the candles, and preventing a seven-year-old named Aiden from putting his entire face in the frosting.”
She leaned forward, her eyes gleaming with tactical brilliance. “While they sing, you create a minor disturbance near the punch bowl. A spilled cup. I will move to the present table, under the pretense of adjusting a bow. I’ll transfer the gifts to the wagon, which you will have prepositioned behind the hedge.”
Bill puffed out his cheeks, absorbing the plan. “So… I’m the punch bowl guy?”
“You are the crucial diversionary element,” Molly corrected, though her tone suggested ‘punch bowl guy’ was also acceptable.
He nodded slowly, a grin spreading across his face. “I can do that. I can spill punch like a pro.” He looked at the colourful map, then back at Molly, a flicker of doubt returning. “You sure this is worth it? I mean, it’s a bunch of action figures and plastic jewellery.”
Molly’s expression turned steely. “It’s the principle, Bill. The sheer, beautiful audacity of it. They’ll check their vaults, their museums, their casinos… but a child’s birthday party? The one place no one expects a heist. We’re not just thieves. We’re artists.” She looked at the drawing of the bounce castle, a predatory smile on her lips. “Now, let’s go over the cake protocol again. The frosting is a wild card we cannot underestimate.”
created_at
"2026-02-16T21:29:56.262527+00:00"
2/16/2026, 9:29:56 PM
model
"deepseek/deepseek-chat-v3.1"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1597
moderation_latency
(null)
generation_time
55064
tokens_prompt
95
tokens_completion
982
native_tokens_prompt
67
native_tokens_completion
855
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.0006513
router
(null)
provider_responses
0
id
"da9633e9-731a-4ce2-86ea-8eb3abcd4a45"
status
200
is_byok
false
latency
1597
endpoint_id
"1dda4bb9-c626-4816-bb54-99665323fef0"
provider_name
"SambaNova"
model_permaslug
"deepseek/deepseek-chat-v3.1"
api_type
"completions"
id
"gen-1771277340-8T7uePoqLP5Ct4G0pWDT"
upstream_id
"da9633e9-731a-4ce2-86ea-8eb3abcd4a45"
total_cost
0.0006513
cache_discount
(null)
upstream_inference_cost
0
provider_name
"SambaNova"
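The metadata fields above can be cross-checked against the run summary with a little arithmetic. The sketch below is illustrative only: it assumes generation_time is reported in milliseconds (which is consistent with the 55.1s duration shown in the run details), and the helper names are hypothetical rather than part of any NC Bench or provider API.

```python
# Values copied from the run metadata above.
GENERATION_TIME_MS = 55064        # assumed to be milliseconds
NATIVE_TOKENS_COMPLETION = 855    # native_tokens_completion
TOTAL_COST_USD = 0.0006513        # total_cost


def duration_seconds(generation_time_ms: int) -> float:
    """Convert a millisecond duration to seconds."""
    return generation_time_ms / 1000.0


def tokens_per_second(tokens: int, generation_time_ms: int) -> float:
    """Average completion throughput over the whole generation."""
    return tokens / duration_seconds(generation_time_ms)


if __name__ == "__main__":
    print(f"duration:   {duration_seconds(GENERATION_TIME_MS):.1f}s")
    print(f"throughput: {tokens_per_second(NATIVE_TOKENS_COMPLETION, GENERATION_TIME_MS):.2f} tokens/s")
    print(f"cost:       ${TOTAL_COST_USD:.6f}")
```

Under these assumptions the figures line up with the summary: 55064 ms rounds to 55.1s, 0.0006513 displays as $0.000651, and the average throughput works out to roughly 15.5 native completion tokens per second.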
Evaluation details
Result | Evaluator | Details | Meta Data
0.0000% | Matches word count | n/a | neededClean: false; words: 608
0.0000% | Dialogue to Total Word Ratio | Ratio: 58.88%, Deviation: 28.88% | neededClean: false; wordsTotal: 608; wordsDialogue: 358
0.0000%
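The reported ratio and deviation follow directly from the word counts in the evaluation details. The sketch below reproduces that arithmetic; the 30% and 500-word targets come from the scenario, while the function names and the quote-matching heuristic in count_dialogue_words are assumptions for illustration, not the evaluator's actual implementation.

```python
import re

TARGET_WORDS = 500          # from the scenario: "Write 500 words"
TARGET_DIALOGUE_PCT = 30.0  # from the scenario: "30% dialogue"


def count_dialogue_words(text: str) -> int:
    """Count words inside double quotes (curly or straight).

    Hypothetical heuristic: the real evaluator may tokenise differently.
    """
    spans = re.findall(r'[“"]([^”"]*)[”"]', text)
    return sum(len(span.split()) for span in spans)


def dialogue_ratio_pct(words_dialogue: int, words_total: int) -> float:
    """Share of dialogue words, as a percentage of all words."""
    return 100.0 * words_dialogue / words_total


def deviation_points(ratio_pct: float, target_pct: float = TARGET_DIALOGUE_PCT) -> float:
    """Absolute gap between the measured ratio and the target, in percentage points."""
    return abs(ratio_pct - target_pct)


def word_count_overshoot_pct(words: int, target: int = TARGET_WORDS) -> float:
    """Signed distance from the target word count, as a percentage of the target."""
    return 100.0 * (words - target) / target


if __name__ == "__main__":
    ratio = dialogue_ratio_pct(358, 608)  # wordsDialogue, wordsTotal
    print(f"Ratio: {ratio:.2f}%, Deviation: {deviation_points(ratio):.2f}%")
    print(f"Word count overshoot: {word_count_overshoot_pct(608):.1f}%")
```

With wordsDialogue = 358 and wordsTotal = 608 this gives a ratio of 58.88% and a deviation of 28.88 points from the 30% target, matching the details column. The response is also 108 words (21.6%) over the 500-word target; how the evaluator maps either deviation to a 0.0000% score is not shown in the run.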