NC Bench
Run: 2026-02-20-01KHYA5HZDRWX62R6955881EWZ
Run Details
Model: Gemma 3 12B
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 24.3s
Cost: $0.000098
Provider: DeepInfra
Evaluation summary
Matches word count: 27.17%
Dialogue to Total Word Ratio: 0.22%
Overall: 13.6907%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
AI
The aroma of cheap pizza and desperation hung thick in Molly’s tiny apartment. Scattered across the table were photographs: bouncy castles, grinning children, and a disconcerting number of perfectly wrapped presents. Bill, a man whose enthusiasm consistently outstripped his planning abilities, was meticulously constructing a cardboard replica of a clown. “It’s adorable, Bill, really. But does a miniature clown help us lift twenty-seven Barbie Dreamhouses?” Molly asked, pinching the bridge of her nose. “It’s a distraction, Molly! They’ll be mesmerized! The parents, the kids… everyone! I’m thinking a slow, mournful honk. Very pathetic. Draws the eye,” Bill explained, adding a disproportionately large red nose to his creation. “Pathetic isn't exactly stealthy, Bill. We’re supposed to be ninjas, remember? Stealthy, not sad clowns.” “But ninjas can be sad! They just hide it better! I’ve been watching a lot of documentaries.” Molly sighed. “Right. Documentaries. And have any of these documentaries covered how to disable a bouncy castle security system?” “Well, no. But I did learn that ninjas appreciate meticulously crafted miniature props!” Bill held up the clown proudly. “Bill, this party is for Tiffany Butterfield’s daughter’s fifth birthday. It’s at the Willow Creek Country Club. It’s… aggressive. There'll be security, a sugar-fueled mob of five-year-olds, and probably Tiffany Butterfield herself, who looks like she owns several small countries.” “Details, details. Think of the loot! Remote control cars, princess tiaras, educational toys that actually teach something… It’s a goldmine, Molly! A goldmine of plastic and glitter!” “A goldmine guarded by miniature overlords wielding juice boxes and demanding cake. Look, the plan is still the same. I’ll create a diversion. You, as our resident expert in… approximately nothing useful, will slip in and collect the presents. We meet at the designated lamppost, split the haul, and we’re out.” “But the clown! It's crucial! 
Imagine the chaos!” Bill pleaded. “Bill, the only chaos I want is the kind that involves us making a clean getaway, not the kind that involves you being chased by a swarm of five-year-olds armed with party favors.” “Okay, okay, no clown. But what about… a realistic-looking piñata? Filled with confetti? A sudden, dazzling explosion of paper? That's distracting, right?” Molly rubbed her temples. “A piñata? Bill, we’re stealing presents, not entertaining children. We need to be efficient, not festive.” “But festive is fun! And efficient can be… dull. Look, I’m just trying to bring some joy to this operation!” “The joy will come when we're safely counting our ill-gotten gains, Bill. And not when you’re explaining to the police why you were carrying a box of miniature plastic horses. Just… stick to the plan, please. Distraction, collection, getaway. Got it?” Bill slumped in his chair, deflated. "Got it. No clown. No piñata. Just me, you and... twenty-seven Barbie Dreamhouses." He paused. "Are you sure we can carry twenty-seven Barbie Dreamhouses?"
created_at: "2026-02-20T19:59:28.76Z" (2/20/2026, 7:59:28 PM)
model: "google/gemma-3-12b-it"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 184
moderation_latency: (null)
generation_time: 24221
tokens_prompt: 93
tokens_completion: 839
native_tokens_prompt: 75
native_tokens_completion: 729
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.00009777
router: (null)
provider_responses:
  0:
    endpoint_id: "eb06dc92-5a16-47ec-a776-6ef956457c47"
    id: "chatcmpl-RJSql44tJkP6uOShZS41QycG"
    is_byok: false
    latency: 59
    model_permaslug: "google/gemma-3-12b-it"
    provider_name: "DeepInfra"
    status: 200
api_type: "completions"
id: "gen-1771617568-2mVSuuzwskPxM335R9Js"
upstream_id: "chatcmpl-RJSql44tJkP6uOShZS41QycG"
total_cost: 0.00009777
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "DeepInfra"
Evaluation details
Matches word count
Result: 27.17%
Details: n/a
Meta Data: neededClean: false, words: 481
Dialogue to Total Word Ratio
Result: 0.22%
Details: Ratio: 81.08%, Deviation: 11.08%
Meta Data: neededClean: false, wordsTotal: 481, wordsDialogue: 390
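The reported ratio and deviation can be checked from the wordsDialogue and wordsTotal values together with the 70% target named in the scenario. The short sketch below does only that arithmetic. The function name `dialogue_ratio` is hypothetical, and how the evaluator turns the deviation into its 0.22% score is not stated on this page, so that step is not modelled.

```python
TARGET_RATIO = 70.0  # percent, from the scenario "Write 500 words with 70% dialogue"


def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Dialogue words as a percentage of all words."""
    return 100.0 * words_dialogue / words_total


ratio = dialogue_ratio(390, 481)
deviation = abs(ratio - TARGET_RATIO)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
# Ratio: 81.08%, Deviation: 11.08%
```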
Overall: 13.6907%
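The overall figure of 13.6907% is close to the unweighted mean of the two evaluator scores. The check below is a sketch under that assumption; the page does not state how the overall score is aggregated. The displayed scores are rounded to two decimals, so the mean computed from them can differ from the reported value by up to 0.005 percentage points.

```python
# Evaluator scores as displayed on this page (rounded to two decimals).
scores = {
    "Matches word count": 27.17,
    "Dialogue to Total Word Ratio": 0.22,
}
reported_overall = 13.6907

# Assumption: the overall score is the unweighted mean of the evaluator scores.
mean_score = sum(scores.values()) / len(scores)

# Rounding each input to two decimals shifts the mean by at most 0.005.
consistent = abs(mean_score - reported_overall) <= 0.005
print(f"mean of displayed scores: {mean_score:.3f}, consistent: {consistent}")
```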