NC Bench
Run: 2026-02-16-01KHM632FEJFMR234G3CAAZZT2
Run Details
Model: DeepSeek V3.1
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 22.3s
Cost: $0.001478
Provider: Fireworks
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
AI
The rust-stained RV smelled of stale coffee and desperation. Molly stared at the crayon-smeared invitation taped to the laminated table as if it held the nuclear codes. “It’s a fortress,” she declared, tapping a fingernail on the address. “Seven-year-old’s birthday. Maximum security.” Bill, sharpening a lockpick with a disturbingly large file, didn’t look up. “Kids’ parties are soft targets. Chaos is our ally. It’s all juice-box bribes and distracted adults.” “Distracted adults who are black-belt judo mums and dads hopped up on sugar and the existential dread of planning a seven-year-old’s party. These people are vigilant, Bill. Their entire self-worth is tied to the success of this bouncy-castle fiasco.” “So we go in quiet. Under the radar.” Bill held the pick up to the dim light, inspecting his work. Molly snatched a grubby blueprint from a pile of takeaway containers. “The ‘radar’ is a woman named Brenda who runs the Neighbourhood Watch like a paramilitary operation. She’ll detect a rogue sausage roll from fifty paces. Our intel suggests the loot is stashed in the ‘Gift Corral,’ a heavily fortified area next to the cake station.” “We create a diversion.” Bill’s eyes lit up. “We unleash the piñata early. Nothing causes more anarchy than a sugar-crazed mob of small children attacking a papier-mâché donkey with a stick.” “Too obvious. They’ll lockdown the presents first. No, we need a precision strike.” Molly leaned forward, her voice dropping to a conspiratorial whisper. “We go in as the entertainment.” Bill stopped filing. “We are not clowns, Molls. There are lines. I have standards. They’re low, but they exist.” “Not clowns. Magicians. ‘Marvolo the Magnificent and his assistant, Bill.’ We get them all gathered round, oohing and aahing. You do your card trick—the one that looks like you’re just dropping them all on the floor.” “It’s a work in progress.” “During the grande finale, the Disappearing Rabbit trick, you ‘accidentally’ knock over the punch bowl. 
While Brenda and the other sentries are slip-sliding in a tidal wave of fruit punch, I use the distraction to empty the Gift Corral. We stash the haul in the magic box.” Bill considered this, a slow grin spreading across his face. “The magic box is just a cardboard box spray-painted black.” “It’s a masterpiece of misdirection! They’ll never expect us to put the real loot in the fake prop. We walk out under the mournful cries of children who never saw the rabbit appear. We’re heroes who failed at magic, not thieves who succeeded at larceny.” “What about the actual rabbit?” Bill asked. “For the trick.” Molly pointed a thumb towards the back of the RV, where a large, temperamental rabbit named Brutus was currently gnawing on the electrical wiring. “He’s our getaway driver. The least he can do is earn his keep. This is a good score, Bill. No lasers, no pressure plates. Just a herd of miniature humans and their gatekeepers. It’s a milk run.” Bill slid his lockpick into a small leather roll. “A milk run. With a high probability of glitter-related injuries.” “Glitter is the herpes of the craft world. It’s a calculated risk. So? You in?” Bill sighed, the sound of a man who had long since surrendered to the absurdity of his life. “Alright. But if any of those kids calls me a‘magic man,’ I’m grabbing the rabbit and leaving you to the judo mums.” “Deal,” Molly said, already reaching for the spray paint. “Now, go practice making a handkerchief disappear. And for heaven’s sake, try to look mystifying.”
created_at
"2026-02-16T21:36:11.563777+00:00"
2/16/2026, 9:36:11 PM
model
"deepseek/deepseek-chat-v3.1"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
810
moderation_latency
(null)
generation_time
22247
tokens_prompt
95
tokens_completion
972
native_tokens_prompt
68
native_tokens_completion
857
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.00147784
router
(null)
provider_responses
0
id
"13805438-3718-4e54-8365-0d584252cfa6"
status
200
is_byok
false
latency
810
endpoint_id
"0c00f0d2-8aad-4700-bd36-11cb9a717fde"
provider_name
"Fireworks"
model_permaslug
"deepseek/deepseek-chat-v3.1"
api_type
"completions"
id
"gen-1771277748-wjvhgLVrceDF8iewc51L"
upstream_id
"13805438-3718-4e54-8365-0d584252cfa6"
total_cost
0.00147784
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Fireworks"
Evaluation details
Result | Evaluator | Details | Meta Data
0.0000% | Matches word count | n/a | neededClean: false, words: 600
0.0000% | Dialogue to Total Word Ratio | Ratio: 73.33%, Deviation: 23.33% | neededClean: false, wordsTotal: 600, wordsDialogue: 440
0.0000%
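The ratio and deviation reported above follow from the two word counts in the meta data. The sketch below reproduces that arithmetic in Python. It is an illustration, not the benchmark's evaluator: the function names are hypothetical, the 50% target is taken from the scenario text, and the rule that dialogue means any text between curly double quotes is an assumption that may not match how NC Bench counts words.

```python
import re


def count_words(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())


def dialogue_words(text: str) -> int:
    """Count words inside curly double quotes.

    Assumption: the evaluator's real dialogue-detection rule is not
    shown on the page; this is one plausible reading of it.
    """
    spans = re.findall(r"“([^”]*)”", text)
    return sum(count_words(span) for span in spans)


def ratio_and_deviation(words_dialogue: int, words_total: int,
                        target: float = 0.5) -> tuple[float, float]:
    """Return the dialogue ratio and its absolute deviation from target."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


# Values reported in the evaluation meta data above.
ratio, deviation = ratio_and_deviation(words_dialogue=440, words_total=600)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 73.33%, Deviation: 23.33%"
```

With 440 dialogue words out of 600, the response sits 23.33 percentage points above the requested 50%, and the total of 600 words exceeds the requested 500, which is consistent with both evaluators reporting 0.0000%.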