NC Bench
Run: 2026-02-20-01KHY9QZ7T91TW38VS61MWR2YQ
Run Details
Model: Gemma 3 12B
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 11.9s
Cost: $0.000412
Provider: Cloudflare
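The duration above can be related to the token counts recorded in this run's metadata. The sketch below is illustrative only: it assumes that `generation_time` (11828) is in milliseconds, which is consistent with the 11.9s duration but is not stated on the page, and `tokens_per_second` is a hypothetical helper name, not part of NC Bench.

```python
def tokens_per_second(completion_tokens: int, generation_time_ms: float) -> float:
    """Completion tokens divided by generation time.

    Assumption: the time is given in milliseconds.
    """
    if generation_time_ms <= 0:
        raise ValueError("generation_time_ms must be positive")
    return completion_tokens / (generation_time_ms / 1000.0)


# Values copied from this run's metadata:
# native_tokens_completion = 689, generation_time = 11828
rate = tokens_per_second(689, 11828)
print(f"{rate:.1f} tokens/s")
```

The native token count is used here rather than `tokens_completion` (806); which of the two is the better basis for throughput depends on how the counts are normalized, and the page does not say.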
Evaluation summary
Matches word count: 1.04%
Dialogue to Total Word Ratio: 0.00%
Overall: 0.5180%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The community hall smelled overwhelmingly of sugar and desperation. Molly wrinkled her nose, surveying the scene through the grimy window. Streamers drooped like defeated flags, a bouncy castle deflated in a corner, and a gaggle of sugar-fueled children shrieked with the manic energy of tiny, glitter-covered tornadoes. “Right,” she said, turning to Bill, who was meticulously polishing a pair of novelty Groucho Marx glasses. “Operation Sugar Rush is a go. You still think this is a good idea?” Bill didn't look up. “Molly, darling, we’re stealing presents from children. It’s inherently a *fantastic* idea. Think of the chaos! The headlines! ‘Ruthless Gift-Grabbers Terrorize Tots!’ It’s gold, I tell you, gold!” He finally peered over the glasses, his eyes gleaming with a disturbing enthusiasm. “Besides, the intel was solid. Mrs. Higgins, the party planner, bragged about the ‘unprecedented volume’ of gifts. Apparently, little Timmy Butterfield’s birthday is a *thing*.” Molly sighed. “Timmy Butterfield. Sounds like a menace. And unprecedented volume? That could mean anything from a single pony to a mountain of plastic dinosaurs.” “We’ll deal with the dinosaurs,” Bill declared, snapping the glasses onto his face. “My plan is simple. You create a diversion. Something… memorable. I’ll slip in, gather the loot, and we’re out before anyone notices a single missing My Little Pony.” “Memorable? Bill, I’m not a clown. I’m a professional.” “A professional who’s been unemployed for three weeks, Molly. Let’s not quibble over semantics. Think… a rogue balloon animal? A sudden, inexplicable outbreak of interpretive dance? Something to draw their attention away from the present table.” Molly considered this, tapping a fingernail against her chin. “Interpretive dance? Really? You think a room full of five-year-olds are going to be distracted by my rendition of ‘The Lament of the Lost Sock’?” Bill shrugged. “It’s a starting point. Look, the key is misdirection. 
I’ll need at least five minutes, uninterrupted. Can you manage that?” “Five minutes? In a room full of sugar-fueled chaos? I can probably manage five seconds before someone tries to stick glitter on my face.” She glanced at the bouncy castle again. “Maybe I could… deflate the bouncy castle? That would cause a stir.” “Excellent! A strategic deflation! Brilliant, Molly, truly brilliant! Just… try not to get tangled in the nylon. We don’t need you ending up as a human-sized bouncy ball.” He adjusted his Groucho glasses. “Right, I’m going to scout the perimeter. Locate the present table. Assess the dinosaur situation. Wish me luck.” He slipped out the door, leaving Molly staring at the scene with a mixture of apprehension and reluctant amusement. “Luck?” she muttered to herself. “I’ll need more than luck. I’ll need a hazmat suit and a very strong cup of coffee.” She took a deep breath and began formulating a plan. Interpretive dance was definitely off the table.
created_at: "2026-02-20T19:52:03.588Z" (2/20/2026, 7:52:03 PM)
model: "google/gemma-3-12b-it"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 173
moderation_latency: (null)
generation_time: 11828
tokens_prompt: 93
tokens_completion: 806
native_tokens_prompt: 75
native_tokens_completion: 689
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.00041209
router: (null)
provider_responses:
  0:
    endpoint_id: "08ebad49-bed1-4149-850c-d24e3c1d2084"
    id: "id-1771617123699"
    is_byok: false
    latency: 108
    model_permaslug: "google/gemma-3-12b-it"
    provider_name: "Cloudflare"
    status: 200
api_type: "completions"
id: "gen-1771617123-Q59chFaTJzAaKux6eQBt"
upstream_id: "id-1771617123699"
total_cost: 0.00041209
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Cloudflare"
Evaluation details
Matches word count
Result: 1.04%
Details: n/a
Meta Data: neededClean: false, words: 474

Dialogue to Total Word Ratio
Result: 0.00%
Details: Ratio: 67.30%, Deviation: 37.30%
Meta Data: neededClean: false, wordsTotal: 474, wordsDialogue: 319

Overall: 0.5180%
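The ratio and deviation reported for the dialogue evaluator follow from the recorded counts: 319 dialogue words out of 474 total gives 67.30%, which is 37.30 percentage points above the 30% target. The sketch below reproduces that arithmetic and shows one plausible way to count dialogue words. It is an assumption, not the evaluator's actual implementation: `dialogue_ratio` and `deviation` are hypothetical names, words are split on whitespace, and dialogue is taken to be any text between paired curly or straight double quotes.

```python
import re

# Matches text inside curly quotes or straight double quotes.
QUOTED = re.compile(r'“([^”]*)”|"([^"]*)"')


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (total words, dialogue words, dialogue / total)."""
    words_total = len(text.split())
    words_dialogue = sum(
        len((curly or straight).split())
        for curly, straight in QUOTED.findall(text)
    )
    ratio = words_dialogue / words_total if words_total else 0.0
    return words_total, words_dialogue, ratio


def deviation(ratio: float, target: float) -> float:
    """Absolute difference between the measured and target ratio."""
    return abs(ratio - target)


# Reproduce the reported figures from the recorded counts.
ratio = 319 / 474
print(f"Ratio: {ratio:.2%}, Deviation: {deviation(ratio, 0.30):.2%}")
# prints "Ratio: 67.30%, Deviation: 37.30%"
```

How the deviation is mapped to the 0.00% score is not shown on the page, so the sketch stops at the deviation.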