NC Bench
Run: 2026-02-20-01KHY9XA5DGRXKTCGG93SGFHDF
Run Details
Model:
Gemma 3 12B
Test:
Dialogue tags
Scenario:
Write 500 words with 30% dialogue
Duration:
13.7s
Cost:
$0.000093
Provider:
DeepInfra
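The 13.7s Duration above is consistent with the sum of the latency (382) and generation_time (13359) fields in this run's response metadata, if both are read as milliseconds. The unit, and the idea that Duration is derived this way, are assumptions and not something the page states. A minimal Python sketch of that arithmetic, which also derives a rough completion throughput from native_tokens_completion (691); the function names are illustrative only:

```python
# Values copied from this run's response metadata.
# Assumption: latency and generation_time are reported in milliseconds.
LATENCY_MS = 382
GENERATION_TIME_MS = 13359
NATIVE_TOKENS_COMPLETION = 691


def total_duration_s(latency_ms: int, generation_ms: int) -> float:
    """Total request time in seconds under the millisecond assumption."""
    return (latency_ms + generation_ms) / 1000


def tokens_per_second(tokens: int, generation_ms: int) -> float:
    """Completion tokens generated per second of generation time."""
    if generation_ms <= 0:
        raise ValueError("generation_ms must be positive")
    return tokens / (generation_ms / 1000)


duration = total_duration_s(LATENCY_MS, GENERATION_TIME_MS)
throughput = tokens_per_second(NATIVE_TOKENS_COMPLETION, GENERATION_TIME_MS)

print(f"Duration: {duration:.1f}s")
print(f"Throughput: {throughput:.1f} tokens/s")
```

Under these assumptions the computed duration is 13.741 seconds, which rounds to the 13.7s shown in Run Details.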
Evaluation summary
Matches word count: 0.00%
Dialogue to Total Word Ratio: 0.00%
Overall: 0.0000%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The aroma of sugar and desperation hung thick in Molly’s tiny apartment. Cardboard boxes overflowed with disguises – clown noses, superhero capes, a disturbingly realistic inflatable dinosaur. Bill, perched precariously on a stack of those boxes, was attempting to wrestle a particularly stubborn feather boa onto a stuffed badger. “Honestly, Bill, a badger in a boa? Are we targeting eight-year-olds or a theatrical production of ‘The Wind in the Willows’?” Molly asked, meticulously sketching a floor plan of ‘Bartholomew’s Bouncy Bonanza’ – the theme of the party we were about to plunder. Bill huffed, the boa finally succumbing to his efforts. "Look, it’s misdirection! They'll be too busy wondering why there’s a badger in a boa to notice us systematically *acquiring* the loot." “Acquiring,” Molly repeated dryly. “That’s a polite word for stealing.” She tapped the floor plan with a bright pink marker. “Okay, so Bartholomew’s parents, the ridiculously wealthy Vandergelt family, are throwing this extravaganza at the Willow Creek Community Hall. Three hundred kids. That’s three hundred potential gift opportunities.” Bill preened, admiring the badger's flamboyant new accessory. “See? Genius! And the badger’s my muscle.” “The badger is a stuffed animal, Bill,” Molly pointed out. “Our muscle is… well, you, I guess.” He bristled slightly. “I’m surprisingly strong for a man who enjoys competitive crochet.” "Right, and I'm surprisingly good at infiltrating children's parties,” Molly retorted, rolling her eyes. “The security’s lax. A couple of bored teenagers mostly, armed with whistles and a vague sense of responsibility. We go in disguised as entertainers. You’re a balloon artist, I’m a face painter. We move systematically through the crowd, palming gifts, and meet back at the emergency exit in twenty minutes. Got it?" Bill nodded enthusiastically, nearly knocking the badger off its perch. “Twenty minutes! What about the bouncy castle? 
Are we raiding *that* for inflatable valuables?” “No, Bill. No raiding the bouncy castle. We’re after the shiny, wrapped things. Sparkly bows are key. Minimise interaction. Maximum gift acquisition. And most importantly, no badger-related incidents." Molly paused, scrutinizing the floor plan again. “The Vandergelts are particularly fond of limited-edition robot trains. Those are priority one. Lots of resale value.” “Robot trains, got it!” Bill chirped, already attempting to twist a stray balloon into the vague shape of a locomotive. "So, I get to say, 'Have a balloon, little one!' and you get to ask, 'Would you like a butterfly or a superhero?' Then, swift, silent… gift-grabbing action?" Molly sighed. "Essentially, yes. Just...try not to make eye contact. And for the love of all that is holy, Bill, keep the badger out of sight." She tapped the floor plan one last time. "Operation Sparkle Thief commences at three o'clock. Let's go make some money...or at least, avoid getting arrested."
created_at
"2026-02-20T19:54:58.615Z"
2/20/2026, 7:54:58 PM
model
"google/gemma-3-12b-it"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
382
moderation_latency
(null)
generation_time
13359
tokens_prompt
93
tokens_completion
799
native_tokens_prompt
75
native_tokens_completion
691
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.00009283
router
(null)
provider_responses
0
endpoint_id
"eb06dc92-5a16-47ec-a776-6ef956457c47"
id
"chatcmpl-Rz1z0UAEN3IYDwoT4Bfrp7vK"
is_byok
false
latency
55
model_permaslug
"google/gemma-3-12b-it"
provider_name
"DeepInfra"
status
200
api_type
"completions"
id
"gen-1771617298-dCJD7k8yaMXWQzBqrZqA"
upstream_id
"chatcmpl-Rz1z0UAEN3IYDwoT4Bfrp7vK"
total_cost
0.00009283
cache_discount
(null)
upstream_inference_cost
0
provider_name
"DeepInfra"
Evaluation details
Matches word count
Result: 0.00%
Details: n/a
neededClean: false
words: 460
Dialogue to Total Word Ratio
Result: 0.00%
Details: Ratio: 69.57%, Deviation: 39.57%
neededClean: false
wordsTotal: 460
wordsDialogue: 320
Overall: 0.0000%
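The ratio and deviation reported for the Dialogue to Total Word Ratio evaluator can be reproduced from wordsDialogue (320) and wordsTotal (460), taking the 30% target from the scenario. The sketch below assumes the deviation is the absolute difference between the measured ratio and the target; how the benchmark counts dialogue words and how it maps the deviation to a 0.00% score are not shown on this page and are not modeled here.

```python
# Values copied from the evaluation details of this run.
WORDS_DIALOGUE = 320
WORDS_TOTAL = 460
TARGET_RATIO = 0.30  # "30% dialogue" from the scenario


def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Fraction of all words that appear inside dialogue."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return words_dialogue / words_total


ratio = dialogue_ratio(WORDS_DIALOGUE, WORDS_TOTAL)
# Assumption: deviation is the absolute distance from the target ratio.
deviation = abs(ratio - TARGET_RATIO)

print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 69.57%, Deviation: 39.57%"
```

The response contains more than twice the requested share of dialogue, which matches the large deviation reported by the evaluator.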