NC Bench
Run: 2026-02-16-01KHM63MN4QHJZ2V18V8E9QD87
Run Details
Model: DeepSeek V3.1
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 23.2s
Cost: $0.000877
Provider: Novita
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The air in the dusty storage unit was thick with the smell of old tires and desperation. Molly, a woman with the sharp eyes of a magpie and a wardrobe exclusively comprised of black nylon, jabbed a finger at a child’s crayon drawing taped to a whiteboard. The drawing depicted a house with a lopsided roof and a stick-figure girl holding a balloon.
“This,” Molly declared, “is the target. 7 Maple Drive. Saturday, 3 PM. A sixth birthday party for one Penelope ‘Penny’ Peterson.”
Bill, a large man stuffed into a tracksuit, squinted at the schematic. “It’s a drawing of a house, Moll. A rubbish one. The chimney’s on the wrong side.”
“It’s a symbolic representation, Bill. I did the recon myself, disguised as a lost dogwalker.” She tapped another sheet, a blurry photo of a suburban home. “The real layout is simpler. Front door, back door, patio doors leading to the garden where the main event will be.”
“Right. So we’re nicking from a kiddie’s party.” Bill scratched his chin. “What’s the score? A couple of action figures and a half-eaten piece of cake?”
“Think bigger, Bill. Penny is the only granddaughter of ‘Diamond’ Dan Peterson.” Molly’s eyes gleamed. “Retired now, but old Dan made his fortune in… dubious wholesale electronics. The family spoils her rotten. We’re not talking cheap plastic. We’re talking the good stuff. The kind of presents that have their own security tags.”
Bill frowned. “So what’s the plan? We rock up with a clown car and a big net?”
“Subtlety, Bill. It’s all about subtlety.” Molly pointed to a crudely drawn stick figure wearing a massive, puffy costume. “You will be ‘Bubbles the Panda,’ the party entertainer I booked using a stolen credit card and a fake email address.”
Bill’s face fell. “You want me to be a panda?”
“It’s a flawless disguise! You’ll be inside the perimeter. You’ll do a few magic tricks, blow up some balloon animals—“
“I can’t make balloon animals, Molly. The poodle always looks like a sick worm.”
“Just twist the things, Bill! The point is, you’ll provide the distraction. While you’re leading a conga line or whatever pandas do, I’ll be the caterer.” She indicated another drawing of a stick figure carrying a tray. “Specifically, the piñata handler. My piñata will be extra-sturdy. Suspiciously sturdy. Because it won’t be full of sweets. It’ll be full of the loot.”
Bill stared at her, his expression a mixture of awe and horror. “You’re going to smash a piñata full of stolen presents?”
“No, you idiot. I’m going to *pretend* to smash it. I’ll create a diversion—a small, contained glitter explosion—and in the ensuing sparkly chaos, you, as Bubbles the Panda, will help me ‘clean up’ by swiftly loading the gift table’s contents into the hollow piñata. We then carry it right out the front door.”
“A glitter explosion?”
“Kids love glitter. It’s chaotic neutral.” Molly crossed her arms. “So? What do you think?”
Bill looked from the crayon drawing to the photo of the bland suburban house, then down at his own substantial belly. “The panda suit… does it come in a larger size?”
“I got you the XXL. The question is, are you in?”
He sighed, a sound like a deflating air mattress. “Fine. But I’m not singing. Pandas don’t sing.”
“Pandas are silent but deadly,” Molly said with a sharp smile. “Now, let’s go over your magic trick. The wand goes *behind* the ear, not up the nose.”
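For reference, one way the dialogue share of a response like this could be estimated is by treating everything inside double quotation marks as dialogue and splitting on whitespace to count words. This is only a minimal sketch under that assumption: the function names are hypothetical, and the benchmark's actual counting rules are not documented on this page.

```python
import re

# Assumption: dialogue is any run of text enclosed in curly or straight
# double quotation marks. Apostrophes (’) are not treated as quote marks.
QUOTED = re.compile(r'[“"]([^”"]*)[”"]')


def count_words(text: str) -> int:
    """Count whitespace-separated tokens."""
    return len(text.split())


def dialogue_word_counts(text: str) -> tuple[int, int]:
    """Return (dialogue_words, total_words) for a passage (hypothetical helper)."""
    dialogue = sum(count_words(span) for span in QUOTED.findall(text))
    return dialogue, count_words(text)


sample = "“Right. So we’re nicking from a kiddie’s party.” Bill scratched his chin."
dialogue_words, total_words = dialogue_word_counts(sample)
print(dialogue_words, total_words)
```

Mismatched or nested quotation marks would need extra handling that this sketch does not attempt.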
created_at: "2026-02-16T21:36:31.327497+00:00" (2/16/2026, 9:36:31 PM)
model: "deepseek/deepseek-chat-v3.1"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1231
moderation_latency: (null)
generation_time: 23059
tokens_prompt: 95
tokens_completion: 945
native_tokens_prompt: 68
native_tokens_completion: 859
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.00087736
router: (null)
provider_responses
0
id: "d4457002-a527-4cf8-8cd6-2d9ab4afca8e"
status: 200
is_byok: false
latency: 1231
endpoint_id: "715d05e1-9fa7-4ceb-834a-833f5e26a55a"
provider_name: "Novita"
model_permaslug: "deepseek/deepseek-chat-v3.1"
api_type: "completions"
id: "gen-1771277767-FEKGNLjDpDOHC687f5b9"
upstream_id: "d4457002-a527-4cf8-8cd6-2d9ab4afca8e"
total_cost: 0.00087736
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Novita"
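The timing, token, and cost fields above can be combined into rough throughput and unit-cost figures. The sketch below assumes that latency and generation_time are in milliseconds (which is consistent with the 23.2s duration reported for the run) and that total_cost is in US dollars. The blended per-token cost is a simple average across prompt and completion tokens, not the provider's actual price schedule.

```python
# Values copied from the run metadata above.
generation_time_ms = 23059        # assumed to be milliseconds
native_tokens_prompt = 68
native_tokens_completion = 859
total_cost_usd = 0.00087736       # assumed to be US dollars

# Completion throughput over the generation window.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

# Blended cost per 1,000 native tokens (prompt + completion together).
native_tokens_total = native_tokens_prompt + native_tokens_completion
cost_per_1k_tokens = 1000 * total_cost_usd / native_tokens_total

print(f"{tokens_per_second:.1f} completion tokens/s")
print(f"${cost_per_1k_tokens:.6f} per 1k native tokens (blended)")
```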
Evaluation details
Evaluator: Matches word count
Result: 0.0000%
Details: n/a
Meta Data: neededClean: false, words: 583
Evaluator: Dialogue to Total Word Ratio
Result: 99.3534%
Details: Ratio: 69.64%, Deviation: 0.36%
Meta Data: neededClean: false, wordsTotal: 583, wordsDialogue: 406
49.6767%
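The reported ratio and deviation can be reproduced from the word counts in the evaluation metadata. The sketch below takes the targets (500 words, 70% dialogue) from the scenario text. How each evaluator turns its measurement into a percentage score is not shown on this page, so the sketch does not try to reproduce 0.0000% or 99.3534%. It only notes that the final 49.6767% equals the mean of those two results, which may or may not be how the site aggregates them.

```python
TARGET_WORDS = 500     # from the scenario: "500 words"
TARGET_RATIO = 70.0    # from the scenario: "70% dialogue"

words_total = 583      # wordsTotal from the evaluation metadata
words_dialogue = 406   # wordsDialogue from the evaluation metadata

# Dialogue share and its absolute distance from the target, in percentage points.
ratio = 100.0 * words_dialogue / words_total
deviation = abs(TARGET_RATIO - ratio)

# How far the response overshoots the requested length, in percent.
overshoot = 100.0 * (words_total - TARGET_WORDS) / TARGET_WORDS

# Assumption: the final figure is the plain mean of the two evaluator results.
overall = (0.0 + 99.3534) / 2

print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
print(f"Word count overshoot: {overshoot:.1f}%")
print(f"Mean of evaluator results: {overall:.4f}%")
```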