NC Bench
Run: 2026-02-20-01KHY8QKWZAC7K321RJSYTJ2RB
Run Details
Model: Gemma 3 12B
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 10.1s
Cost: $0.000415
Provider: Cloudflare
Evaluation summary
Score | Evaluator | Details Link
0.00% | Matches word count | (details)
0.00% | Dialogue to Total Word Ratio | (details)
0.0000% | Overall |
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The aroma of stale donuts and desperation hung thick in Bill’s cluttered apartment. Molly perched on a wobbly stool, meticulously examining a miniature plastic pony. Bill paced, a nervous energy radiating off him like heat from a faulty radiator. “A pony, really, Molly? You’re analyzing the equestrian accessories of a five-year-old?” Molly didn’t look up. “It’s a reconnaissance mission, Bill. We need to understand the target demographic. This isn’t your average jewelry store. We’re dealing with glitter, sugar highs, and the potential for projectile cake.” “Right, right. Glitter. The bane of a professional thief’s existence. So, the party. Little Timmy Henderson’s fifth birthday. Held at the community hall. Theme: ‘Pirates and Princesses.’ You’ve confirmed the security?” “Minimal. One bored-looking teenager manning the door, mostly concerned with TikTok. He’ll be easily distracted. My concern is the sheer volume of…stuff.” Molly finally looked up, her expression grim. “I’ve seen photos, Bill. Mountains of presents. We’re talking Lego sets, remote-controlled cars, enough princess dresses to outfit a small nation.” Bill stopped pacing. “Mountains, you say? That’s…good. Very good. Think of the resale value! We could finally ditch this dump and get somewhere with, you know, actual plumbing.” “Don’t get ahead of yourself. We need a plan. A solid plan. We can’t just waltz in and start grabbing presents like we’re at a Black Friday sale.” “Okay, okay. Plan. Right. I was thinking…disguises. We go as clowns! Everyone loves clowns! They’ll be too busy laughing to notice us emptying the gift table.” Molly stared at him. “Clowns, Bill? Seriously? You think a horde of sugar-fueled five-year-olds are going to be charmed by clowns? They’ll be terrified! We’ll be chased out with plastic swords and demands for more cake.” “Alright, alright, bad idea. What about…pirates? We blend in! We’re pirates stealing treasure! It’s thematic!” “Bill, you’re grasping at straws. 
We need something subtle. Something…unassuming.” Molly tapped the plastic pony thoughtfully. “I’m thinking…parents. We dress as parents. Blend in with the crowd. Casual, unremarkable. Nobody suspects the unassuming parent.” “Dress as parents? But…I don’t *do* parents. I barely remember to water my cactus.” “You can fake it, Bill. Smile awkwardly, compliment the cake, ask about potty training. It’s an act. We’re actors, Bill! Professional thieves! We play roles.” “Potty training? Seriously? I’m going to have nightmares.” “Just think of the payoff, Bill. Enough presents to set us up for life. Think of the plumbing!” Bill shuddered. “Fine. Parents it is. But I’m drawing the line at balloons. I refuse to touch a balloon.” Molly sighed. “Deal. Now, let’s work on your ‘concerned parent’ face. It currently resembles a startled badger.”
created_at: "2026-02-20T19:34:23.401Z" (2/20/2026, 7:34:23 PM)
model: "google/gemma-3-12b-it"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 438
moderation_latency: (null)
generation_time: 9888
tokens_prompt: 93
tokens_completion: 795
native_tokens_prompt: 75
native_tokens_completion: 694
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.00041489
router: (null)
provider_responses:
  0:
    endpoint_id: "eb06dc92-5a16-47ec-a776-6ef956457c47"
    is_byok: false
    latency: 54
    model_permaslug: "google/gemma-3-12b-it"
    provider_name: "DeepInfra"
    status: 429
  1:
    endpoint_id: "08ebad49-bed1-4149-850c-d24e3c1d2084"
    id: "id-1771616063810"
    is_byok: false
    latency: 374
    model_permaslug: "google/gemma-3-12b-it"
    provider_name: "Cloudflare"
    status: 200
api_type: "completions"
id: "gen-1771616063-24lBy01UfMEr7AhvlL9p"
upstream_id: "id-1771616063810"
total_cost: 0.00041489
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Cloudflare"
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 441
0.00% | Dialogue to Total Word Ratio | Ratio: 84.81%, Deviation: 34.81% | neededClean: false, wordsTotal: 441, wordsDialogue: 374
0.0000%
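The ratio and deviation reported above can be reproduced from the listed counts (374 dialogue words out of 441 total, against a 50% target). The sketch below is illustrative only: the function names, the whitespace tokenization, and the rule that dialogue is text between curly quotes are assumptions, not the benchmark's actual evaluator, and the thresholds that turn a deviation into a 0.00% score are not shown in this report.

```python
import re

def count_words(text):
    """Whitespace-delimited word count (an assumed tokenization)."""
    return len(text.split())

def dialogue_words(text, open_q="\u201c", close_q="\u201d"):
    """Count words inside curly-quoted spans (assumed definition of dialogue)."""
    pattern = re.escape(open_q) + r"(.*?)" + re.escape(close_q)
    spans = re.findall(pattern, text, flags=re.DOTALL)
    return sum(count_words(span) for span in spans)

def deviation_points(ratio, target=0.5):
    """Absolute distance from the target ratio, in percentage points."""
    return abs(ratio - target) * 100

# Counts taken from the evaluation details of this run.
words_total, words_dialogue = 441, 374
ratio = words_dialogue / words_total
print(f"Ratio: {ratio:.2%}, Deviation: {deviation_points(ratio):.2f}%")
# prints: Ratio: 84.81%, Deviation: 34.81%
```

Under the same assumptions, `count_words(response)` compared against the 500-word target would give the word-count check; this run's 441 words fall 59 short of it.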