NC Bench
Run: 2026-02-16-01KHM5M46QE2C7E22K0ACZM6KN
Run Details
Model: DeepSeek V3.2
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 27.4s
Cost: $0.000463
Provider: SiliconFlow
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The back room of the ‘Happy Bean’ coffee shop smelled of stale grounds and desperation. Molly spread a hand-drawn map across the sticky table, anchoring one corner with a half-empty latte. Bill stared, his expression a mixture of disbelief and awe. “A children’s birthday party,” he said, flatly. “You dragged me here for a six-year-old’s birthday party.” “Not for the cake, Bill. For the loot.” Molly tapped the map with a chewed pencil. “The Vanderbilts. Their little angel, Percival, is turning seven. The gifts will be high-end. Educational robots, miniature stock portfolios, solid gold rattles—the works.” Bill rubbed his temples. “So we’re gonna what? Sneak past a bunch of sugared-up kids and a clown?” “Precisely. But with precision.” She pointed to a square labeled ‘PONY.’ “The main event is a pony ride in the backyard at 3:15 PM. Chaos ensues. All adults and children will be distracted. That’s our window.” “Our window to do what? Walk in the front door with a ‘Happy Birthday’ balloon and a sack?” “Basically, yes.” Molly smiled. “We pose as entertainers. You’ll be ‘Billy the Bumbling Balloon Artist.’ I’ll be ‘Molly the Magician’s Assistant,’ with a very large trick box for my ‘props.’ We arrive late, during the pony frenzy, slip inside to the gift table, fill the box, and exit stage left.” Bill blinked slowly. “A balloon artist. What if someone asks for a poodle? I can only make a snake. A sad, limp snake.” “Then make sad, limp snakes! The point is the disguise. The box is lined with sound-proofing foam. No jingle-jangle from the stolen blocks and designer teddy bears.” “What about the actual magician? Won’t he be suspicious of an assistant he didn’t hire?” “I told him I was sent by the agency. He was too busy practicing his dove vanishing act to care. His name is Marvin. He has a nervous tic. He’ll be fine.” Bill leaned back, crossing his arms. “And the extraction? 
We just waltz out with a box full of presents?” “We complain about a faulty smoke machine creating a haze, say we need air. We load the box into the van, which will be parked in the ‘Catering’ spot I’ve already forged a permit for.” “This is the most undignified job we’ve ever taken.” “It’s low security, high reward. No lasers, no guards with attack dogs. Just juice boxes and emotional parents.” Molly’s eyes gleamed. “Think of it, Bill. One hour’s work for a haul of untouched, resellable luxury toys. The fence at the downtown market is already salivating.” Bill sighed, staring at the map’s crude drawing of a bounce house. “Fine. But if I have to twist a balloon sword for some snot-nosed tycoon-in-training, I’m keeping the sword. As a hazard bonus.” “Deal.” Molly rolled up the map. “Now, let’s go over your balloon repertoire. And for heaven’s sake, try to look like you enjoy children.” “I enjoy them from a distance. A long distance. Like, from another city.” “Just remember,” Molly said, standing up. “Smile, be clumsy with the balloons, and no one will suspect you’re a criminal mastermind.” “Mastermind? I’m planning to steal a toy rocket from a toddler. My ancestors are weeping.” “They’re weeping tears of joy, Bill. Tears of joy.”
created_at: "2026-02-16T21:28:06.635558+00:00" (2/16/2026, 9:28:06 PM)
model: "deepseek/deepseek-v3.2-20251201"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 3070
moderation_latency: (null)
generation_time: 27318
tokens_prompt: 77
tokens_completion: 1199
native_tokens_prompt: 67
native_tokens_completion: 1059
native_tokens_completion_images: (null)
native_tokens_reasoning: 269
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.00046287
router: (null)
provider_responses:
  0:
    id: "019c685a12c0cf199c0b00ba2442a8ed"
    status: 200
    is_byok: false
    latency: 3070
    endpoint_id: "b39a588a-b338-43ba-b046-226b9e24afa8"
    provider_name: "SiliconFlow"
    model_permaslug: "deepseek/deepseek-v3.2-20251201"
api_type: "completions"
id: "gen-1771277258-P79lHupvzHMsSbU0w3Ia"
upstream_id: "019c685a12c0cf199c0b00ba2442a8ed"
total_cost: 0.00046287
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "SiliconFlow"
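The generation_time and native_tokens_completion values in the metadata above can be combined into a rough completion throughput figure. The sketch below assumes generation_time is reported in milliseconds (27318 is close to the 27.4s duration shown in the run details); the function name `completion_throughput` is hypothetical and not part of the page or any API.

```python
def completion_throughput(native_tokens_completion: int, generation_time_ms: int) -> float:
    """Return completion tokens per second.

    Assumption: generation_time is expressed in milliseconds.
    """
    if generation_time_ms <= 0:
        raise ValueError("generation_time_ms must be positive")
    return native_tokens_completion / (generation_time_ms / 1000.0)


# Values copied from the metadata fields above.
tokens_per_second = completion_throughput(1059, 27318)
print(f"{tokens_per_second:.1f} tokens/s")
```

Whether native_tokens_completion already includes the 269 reasoning tokens is not stated in the metadata, so the figure is best read as an approximation.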
Evaluation details
Result | Evaluator | Details | Meta Data
0.0000% | Matches word count | n/a | neededClean: false; words: 551
0.9291% | Dialogue to Total Word Ratio | Ratio: 79.67%, Deviation: 9.67% | neededClean: false; wordsTotal: 551; wordsDialogue: 439
0.4646%
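The reported ratio and deviation follow directly from the two word counts: 439 dialogue words out of 551 total, measured against the 70% target in the scenario. The sketch below reproduces that arithmetic and shows one plausible way to count dialogue words, by treating text inside curly double quotes as dialogue. The evaluator's actual counting and scoring rules are not shown on the page, so `dialogue_stats` and `ratio_and_deviation` are hypothetical helpers and may not match its counts exactly.

```python
import re


def ratio_and_deviation(words_dialogue: int, words_total: int, target: float = 0.70):
    """Return (ratio, absolute deviation from target) as fractions."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


def dialogue_stats(text: str) -> dict:
    """Count whitespace-separated words overall and inside curly double quotes.

    Assumption: dialogue is exactly the text between “ and ”.
    """
    spoken_spans = re.findall(r"“([^”]*)”", text)
    return {
        "wordsTotal": len(text.split()),
        "wordsDialogue": sum(len(span.split()) for span in spoken_spans),
    }


# Arithmetic check against the counts reported in the table above.
ratio, deviation = ratio_and_deviation(439, 551)
print(f"Ratio: {ratio * 100:.2f}%, Deviation: {deviation * 100:.2f}%")

# Counting example on a short excerpt of the response.
print(dialogue_stats("“Not for the cake, Bill.” Molly tapped the map."))
```

Counting by quotation marks is a simplification: it ignores dialogue tags such as "he said" and would miscount text that uses straight quotes or nested double quotes.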