NC Bench
Run: 2026-02-16-01KHM5Q0B75EGSA4SVSHA4899K
Run Details
Model: DeepSeek V3.1
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 14.5s
Cost: $0.001301
Provider: Fireworks
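The Duration shown above can be cross-checked against the `latency`, `generation_time`, and `native_tokens_completion` fields reported in this run's metadata. The following is a minimal sketch, assuming those two timing fields are in milliseconds and that the headline duration is roughly their sum; the page does not state either, and the helper names are hypothetical.

```python
# Values copied from this run's metadata; units are ASSUMED to be milliseconds.
LATENCY_MS = 423
GENERATION_TIME_MS = 14053
NATIVE_TOKENS_COMPLETION = 753


def total_seconds(latency_ms: int, generation_ms: int) -> float:
    """Time to first token plus generation time, in seconds (assumed relation)."""
    return (latency_ms + generation_ms) / 1000


def tokens_per_second(tokens: int, generation_ms: int) -> float:
    """Completion throughput over the generation window only."""
    return tokens / (generation_ms / 1000)


if __name__ == "__main__":
    print(f"total: {total_seconds(LATENCY_MS, GENERATION_TIME_MS):.1f}s")
    print(f"throughput: {tokens_per_second(NATIVE_TOKENS_COMPLETION, GENERATION_TIME_MS):.1f} tok/s")
```

Under these assumptions the sum rounds to the 14.5s shown as Duration, which is consistent with the reading above but does not confirm it.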
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The back room of O’Malley’s Pub smelled of stale beer and defeated dreams, a perfect place to plan a felony. Molly, her fiery red hair tied in a messy bun, stabbed a finger at a child’s crayon-drawn map taped to the sticky table. “The target is the Rainbow Room at the ‘Happy Land’ community center,” she stated, her voice a low, serious murmur. “The mark is one ‘Timmy,’ age seven. His parents have more money than sense and a documented weakness for overcompensating with extravagant gifts.”
Bill, a mountain of a man wearing a tracksuit two sizes too small, squinted at the drawing. “It’s a kid’s birthday party, Moll. The security’s gonna be a five-year-old with a water pistol and a strict ‘no seconds on cake’ policy.”
“Precisely,” Molly said, a gleam in her eye. “They’ll never see us coming. Our point of entry is here,” she pointed to a wobbly rectangle labeled ‘KICHEN.’ “We go in disguised as the hired entertainment.”
Bill’s face fell. “What kind of entertainment?”
“I’m ‘Sparklehoof the Unicorn,’ and you’re my handler.” She slid a garish, glitter-covered horn headband across the table.
Bill stared at it as if it were a live scorpion. “Handler? What does a unicorn handler do?”
“You carry the glitter cannon and manage my ‘equine sensitivities.’ It’s all in the legend I fabricated.” She tapped a meticulously forged letter of agreement from ‘Sparklehoof’s Magical Creature Rentals.’
“A glitter cannon?” Bill’s voice trembled. “Molly, last time there was glitter involved, I was picking it out of my… well, everywhere… for a month. The missus thought I’d joined a disco cult.”
“It’s non-toxic and biodegradable, Bill. Focus. The op is simple. We enter during the peak chaos of the sugar rush, around 2:45 PM. You create a diversion with the glitter cannon—”
“A diversion from a unicorn? What’s more diverting than a bleeding unicorn in a community centre?”
“Just point the cannon at the piñata and fire. While the little monsters are blinded by festive particulates, I’ll use my prehensile unicorn tail—” she held up a fuzzy pink tail with a clever clasp on the end “—to hoover the presents from the gift table and into my saddlebags.”
Bill rubbed his temples. “Prehensile tail. Right. And the extraction?”
“We exfiltrate through the same door. We’ll be just another pair of traumatized party vendors fleeing the screeching horde. We’ll meet at the secondary location to divvy up the haul.”
“The secondary location being my mum’s garage?”
“It’s a secure, neutral site, Bill.”
He sighed, a sound of deep, profound resignation. He picked up the unicorn horn, his massive fingers looking absurdly large against the pink sequins. “So let me get this straight. Our entire plan, our big score, hinges on me, a two-hundred-and-fifty-pound man, being a convincing unicorn handler.”
Molly leaned forward, her expression dead serious. “It’s not about being convincing, Bill. It’s about commitment to the role. Now, practice your look of concerned reverence for a mythical beast. And for heaven’s sake, try not to step on any toddlers.”
created_at: "2026-02-16T21:29:28.152769+00:00" (2/16/2026, 9:29:28 PM)
model: "deepseek/deepseek-chat-v3.1"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 423
moderation_latency: (null)
generation_time: 14053
tokens_prompt: 95
tokens_completion: 856
native_tokens_prompt: 68
native_tokens_completion: 753
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 6
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.00130144
router: (null)
provider_responses:
  0:
    id: "239fb9ba-2163-47c7-bd79-ff364aff66e6"
    status: 200
    is_byok: false
    latency: 423
    endpoint_id: "0c00f0d2-8aad-4700-bd36-11cb9a717fde"
    provider_name: "Fireworks"
    model_permaslug: "deepseek/deepseek-chat-v3.1"
api_type: "completions"
id: "gen-1771277353-3im4nRZPNfy1bqA4cEsN"
upstream_id: "239fb9ba-2163-47c7-bd79-ff364aff66e6"
total_cost: 0.00130144
cache_discount: 0.00000168
upstream_inference_cost: 0
provider_name: "Fireworks"
Evaluation details
Result | Evaluator | Details | Meta Data
60.2752% | Matches word count | n/a | neededClean: false; words: 515
0.0003% | Dialogue to Total Word Ratio | Ratio: 66.02%, Deviation: 16.02% | neededClean: false; wordsTotal: 515; wordsDialogue: 340
30.1377%
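The reported Ratio and Deviation follow directly from the wordsDialogue and wordsTotal counts and the 50% target in the scenario. The sketch below reproduces that arithmetic and shows one plausible way to count dialogue words, by treating everything between curly double quotes as dialogue. The function names and the quote-matching rule are assumptions for illustration; the page does not show how the evaluator actually tokenizes the text or how it maps deviation to a Result percentage.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio).

    ASSUMPTION: dialogue is any text between curly double quotes,
    and words are whitespace-separated tokens.
    """
    total = len(text.split())
    quoted_spans = re.findall(r"“(.*?)”", text, flags=re.DOTALL)
    dialogue = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue / total if total else 0.0
    return dialogue, total, ratio


def deviation_from_target(dialogue: int, total: int, target: float = 0.5) -> float:
    """Absolute gap between the dialogue ratio and the target, in percentage points."""
    return abs(dialogue / total - target) * 100


if __name__ == "__main__":
    # Counts taken from the Meta Data column of this run.
    print(f"ratio: {340 / 515 * 100:.2f}%")
    print(f"deviation: {deviation_from_target(340, 515):.2f}%")
    # Observation only: the mean of the two Result values is close to the final figure.
    print(f"mean of results: {(60.2752 + 0.0003) / 2:.4f}%")
```

With the reported counts this gives a ratio of 66.02% and a deviation of 16.02 percentage points, matching the Details column. The mean of the two Result values agrees with the final 30.1377% to within rounding, which suggests, though the page does not say so, that the last figure is their average.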