NC Bench
Run: 2026-02-16-01KHM5G1E5BQYQC1NJHJJX27ZW
Run Details
Model:
DeepSeek V3 (2025-03-24)
Test:
Dialogue tags
Scenario:
Write 500 words with 30% dialogue
Duration:
3.7s
Cost:
$0.000570
Provider:
BaseTen
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The dim glow of a flickering streetlight cast long shadows across the alley where Molly and Bill huddled over a crumpled napkin map. The scent of stale pizza and wet pavement clung to the air, mingling with the faint sound of distant laughter from the party they were about to ruin.
Molly tapped the napkin with a chipped fingernail. "You’re telling me the best way in is through the *bouncy castle*?"
Bill grinned, his gold tooth catching the light. "Think about it—no one expects a full-grown adult to army-crawl through a tunnel made for sugar-crazed six-year-olds. It’s genius."
Molly sighed, rubbing her temples. "It’s *something*, alright. What’s Plan B? Because I’m not getting stuck in a bouncy castle while some kid screams about a ‘giant rat’ in the fun zone."
Bill folded his arms. "Fine. We go in through the back—past the clown."
Molly’s eyebrow twitched. "The clown."
"The clown," Bill confirmed. "He’s distracted making balloon poodles. We slip by, grab the presents, and vanish before anyone notices."
Molly stared at him. "You’re allergic to latex."
Bill’s grin faltered. "I—what?"
"The balloons. You’ll sneeze. Loudly."
Bill waved a hand. "I’ll hold my breath."
"Last time you held your breath, you passed out in a bank vault and we had to drag you out in a laundry cart."
"That was one time!"
Molly pinched the bridge of her nose. "Okay, new plan. We *don’t* rely on your terrible ideas. We dress as caterers, walk in like we own the place, and load up the presents while everyone’s distracted by cake."
Bill frowned. "What’s wrong with my ideas?"
"Besides the fact they all involve you wheezing or getting stuck in inflatable structures?"
Bill crossed his arms. "I feel attacked."
Molly ignored him, pulling a clipboard from her bag. "We fake an invoice, say we’re the ‘gift relocation team,’ and walk out with everything. Clean. Simple. No clowns, no bouncy castles."
Bill scratched his chin. "What if someone asks why we’re relocating the gifts?"
Molly smirked. "We tell them Santa’s running quality control checks."
Bill blinked. "That’s ridiculous."
"So’s your face, but here we are."
A beat of silence. Then Bill sighed. "Fine. But if this goes south, I’m blaming you *and* jumping into the bouncy castle on my way out."
Molly rolled her eyes. "Deal. Now grab the clipboard and try to look like you’ve ever held a real job."
Bill straightened his shirt, then immediately wrinkled it again. "No promises."
The distant sound of children shrieking with joy carried on the wind as the two thieves strode toward the party, ready to ruin some birthdays.
created_at
"2026-02-16T21:25:29.363242+00:00"
2/16/2026, 9:25:29 PM
model
"deepseek/deepseek-chat-v3-0324"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
215
moderation_latency
(null)
generation_time
3654
tokens_prompt
77
tokens_completion
721
native_tokens_prompt
69
native_tokens_completion
671
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.0005698
router
(null)
provider_responses
0
id
"chatcmpl-cf5c5aa5f21749d6972791a68185d120"
status
200
is_byok
false
latency
215
endpoint_id
"9dd0a439-1f2d-4bf7-ac59-604a0bfa16e3"
provider_name
"BaseTen"
model_permaslug
"deepseek/deepseek-chat-v3-0324"
api_type
"completions"
id
"gen-1771277125-xyIE3hyy30hNmAKXl3tX"
upstream_id
"chatcmpl-cf5c5aa5f21749d6972791a68185d120"
total_cost
0.0005698
cache_discount
(null)
upstream_inference_cost
0
provider_name
"BaseTen"
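The metadata above can be turned into a rough throughput figure. The sketch below assumes that generation_time is reported in milliseconds (3654 lines up with the 3.7s duration shown in Run Details, but the unit is not stated on the page) and that native_tokens_completion is the provider's own token count. The function name tokens_per_second is hypothetical and not part of any NC Bench or provider API.

```python
def tokens_per_second(completion_tokens: int, generation_time_ms: float) -> float:
    """Estimate output throughput from a completion token count and a duration.

    Assumption: generation_time_ms is in milliseconds, which the run page
    does not state explicitly.
    """
    if generation_time_ms <= 0:
        raise ValueError("generation_time_ms must be positive")
    return completion_tokens / (generation_time_ms / 1000.0)


if __name__ == "__main__":
    # Values copied from this run's metadata.
    native_tokens_completion = 671
    generation_time = 3654  # assumed milliseconds
    rate = tokens_per_second(native_tokens_completion, generation_time)
    print(f"{rate:.1f} tokens/s")
```

Under those assumptions this run works out to roughly 180 native tokens per second. The figure covers generation only and excludes the separately reported latency of 215.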
Evaluation details
Result | Evaluator | Details | Meta Data
0.0000% | Matches word count | n/a | neededClean: false; words: 442
0.0000% | Dialogue to Total Word Ratio | Ratio: 62.22%, Deviation: 32.22% | neededClean: false; wordsTotal: 442; wordsDialogue: 275
0.0000%
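The Dialogue to Total Word Ratio figures follow from the reported counts: 275 dialogue words out of 442 total is 62.22%, which is 32.22 percentage points away from the 30% target in the scenario. The sketch below shows one way such a check could be computed. It is not the evaluator's actual implementation: the function names, the whitespace-based word split, and the rule that dialogue is whatever sits between double quotes are all assumptions made for illustration, and the mapping from deviation to the 0.0000% result is not shown on the page, so it is left out.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio_percent) for a passage.

    Assumptions: words are whitespace-separated tokens, and dialogue is
    any text enclosed in double quotes (curly quotes are normalized first).
    """
    normalized = text.replace("\u201c", '"').replace("\u201d", '"')
    total_words = len(normalized.split())
    quoted_spans = re.findall(r'"([^"]*)"', normalized)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    ratio = 100.0 * dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


def deviation_from_target(ratio_percent: float, target_percent: float = 30.0) -> float:
    """Absolute distance, in percentage points, from the requested ratio."""
    return abs(ratio_percent - target_percent)


if __name__ == "__main__":
    # Reproduce the reported figures from the counts in the Meta Data column.
    reported_ratio = 100.0 * 275 / 442
    print(f"Ratio: {reported_ratio:.2f}%")
    print(f"Deviation: {deviation_from_target(reported_ratio):.2f}%")
```

A different tokenizer or quote-matching rule would give slightly different counts, so a re-implementation along these lines should not be expected to match the evaluator's 442 and 275 exactly.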