NC Bench
Run: 2026-02-16-01KHM5NBGXC0EJN946TGTPWBX8
Run Details
Model: DeepSeek V3.1
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 11.1s
Cost: $0.001312
Provider: Fireworks
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The warehouse smelled of damp concrete and failed ambitions. Molly, perched on a crate marked ‘FRAGILE – GLORIOUS DINO-MITE PARTY SUPPLIES’, chewed on the end of a pencil. Bill paced nervously, the beam of his flashlight cutting frantic arcs across the dusty floor. “Okay,” Molly said, tapping a crudely drawn blueprint. “The target is the ‘Giggles & Wiggles Mega-Birthday Blowout’ for one Timmy Henderson, age six. Objective: liberate the entirety of the gift table.” “Liberate?” Bill frowned. “We’re stealing from a kid, Moll. A little kid. With a cake shaped like a cartoon badger.” “We are reallocating resources from an over-indulged bourgeoisie six-year-old to two under-capitalized entrepreneurs. Us.” She pointed the pencil at him. “Now, the party is at a soft-play arena. Chaos is our friend. The primary security is a single, harried mother named Brenda who will be approximately 72% wine by the time we arrive. The secondary security is a pack of sugared-up five-year-olds, which is a more formidable threat, I’ll admit.” Bill stopped pacing and ran a hand over his face. “How do we even get in? We don’t exactly blend. I’ve got a tattoo of a weeping clown on my neck.” “We’re going in as staff. Specifically, the party’s special guests: Captain Sparklebeard and Seaman Squeaky, the pirate mascots.” Bill stared at her. The silence stretched. A single drop of water plinked somewhere in the darkness. “You want me,” he said slowly, “a man with a chronic lower back issue and a deep-seated fear of synthetic fabrics, to wear a giant pirate costume.” “The beard is detachable. Look, Bill, it’s genius. We’re invisible. Who looks at the mascot? Nobody. Parents are terrified of them, kids just scream and try to pull their heads off. We can move freely through the venue. While I’m doing the mandatory pirate jig, you’ll be on ‘crowd control’, which is code for stuffing plush toys and gift cards into our specially modified loot-sacks.” “Loot-sacks?” “I sewed pockets into the costumes. 
Big ones.” Bill sighed, a sound of profound resignation. “The exit strategy?” “Standard distraction. At the designated time, you’ll ‘accidentally’ trigger the bubble machine and the confetti cannon simultaneously. In the ensuing sensory overload, we simply waddle out the emergency exit to the getaway vehicle.” “Which is?” “My cousin’s ice cream truck. It’s perfect cover. We load the goods in the back with the Neapolitan and we’re gone. We’ll be eating premium steak dinners paid for by little Timmy’s new bike before he’s even finished his second slice of badger-cake.” Bill was quiet for a long moment, contemplating the sheer, ridiculous audacity of it all. He looked at the blueprint, at the drawing of a stick-figure pirate next to a giant ‘X’. He thought about the weeping clown tattoo and the impending synthetic fabric nightmare. “Alright,” he groaned. “But if one single child asks me for a hug, I’m aborting the mission. I have boundaries.” “Noted. No hugging the merchandise.” Molly grinned, a flash of white in the dim light. “Now, let’s go over the pirate jig again. It’s crucial you get the hip-swivel right.”
created_at: "2026-02-16T21:28:30.660299+00:00" (2/16/2026, 9:28:30 PM)
model: "deepseek/deepseek-chat-v3.1"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 416
moderation_latency: (null)
generation_time: 11014
tokens_prompt: 95
tokens_completion: 865
native_tokens_prompt: 68
native_tokens_completion: 759
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 3
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.00131236
router: (null)
provider_responses: 0
id: "a5fe431d-58a3-4eba-9b34-8319264d650a"
status: 200
is_byok: false
latency: 416
endpoint_id: "0c00f0d2-8aad-4700-bd36-11cb9a717fde"
provider_name: "Fireworks"
model_permaslug: "deepseek/deepseek-chat-v3.1"
api_type: "completions"
id: "gen-1771277299-3kgGyaNBTvlKvvOZXCT6"
upstream_id: "a5fe431d-58a3-4eba-9b34-8319264d650a"
total_cost: 0.00131236
cache_discount: 8.4e-7
upstream_inference_cost: 0
provider_name: "Fireworks"
Evaluation details
Result | Evaluator | Details | Meta Data
0.2141% | Matches word count | n/a | neededClean: false; words: 528
0.0000% | Dialogue to Total Word Ratio | Ratio: 73.67%, Deviation: 23.67% | neededClean: false; wordsTotal: 528; wordsDialogue: 389
0.1070%
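
The figures in the Dialogue to Total Word Ratio row can be checked with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the rule used to count dialogue words (whitespace-separated words inside curly double quotes) is an assumption, since the page does not show how the evaluator tokenizes the response. Only the final calculation uses values reported above (wordsDialogue 389, wordsTotal 528, and the 50% target from the scenario).

```python
import re


def count_dialogue_words(text):
    """Count words inside curly double quotes and words overall.

    Assumption: a "word" is a whitespace-separated token, and dialogue is
    any text between an opening and a closing curly double quote. The
    benchmark's real counting rule is not shown on the page.
    """
    spans = re.findall(r"“([^”]*)”", text)
    words_dialogue = sum(len(span.split()) for span in spans)
    words_total = len(text.split())
    return words_dialogue, words_total


def dialogue_ratio(words_dialogue, words_total, target=0.5):
    """Return the dialogue share and its absolute distance from the target."""
    ratio = words_dialogue / words_total
    deviation = abs(ratio - target)
    return ratio, deviation


# Values reported in the evaluation meta data for this run.
ratio, deviation = dialogue_ratio(389, 528)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 73.67%, Deviation: 23.67%"
```

With 389 of 528 words in dialogue, the response is well above the requested 50%, which matches the reported deviation of 23.67%. How that deviation maps to the 0.0000% result is not stated on the page, so the sketch does not attempt to reproduce the score.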