NC Bench
Run: 2026-02-20-01KHY93J5Y6HZPAXY98YXGQ5TS
Run Details
Model: Gemma 3 12B
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 14.6s
Cost: $0.000097
Provider: DeepInfra
Evaluation summary
Score       Evaluator
43.38%      Matches word count
9.83%       Dialogue to Total Word Ratio
26.6034%    Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
AI
The aroma of stale donuts and desperation hung thick in Bill's cluttered apartment. Stacks of outdated detective novels teetered precariously, threatening to topple onto a threadbare armchair where Molly sat perched, meticulously examining a color-coded spreadsheet. “So, let me get this straight,” she said, tapping a bright pink square, “Operation Sugar Rush involves infiltrating a five-year-old’s birthday party and pilfering the presents?” Bill, a man whose enthusiasm perpetually outweighed his intelligence, beamed. “Precisely! Think of the haul, Molly! Princess castles, remote-controlled cars, a limited-edition unicorn plushie… opportunities galore!” Molly sighed, pushing a stray strand of fiery red hair behind her ear. “Opportunities for arrest, mostly. Bill, this is insane, even for us.” “Insane? It's genius! No security, distracted adults, easily manipulated targets! The children are completely powerless against our charming facade!” He puffed out his chest. “I’m thinking I’ll go as a magician. Pull a rabbit out of a hat, steal a Lego Millennium Falcon. Classic!” Molly pinched the bridge of her nose. “A magician? Really? You can barely tie your shoelaces, let alone make a rabbit disappear.” “Details, details! I’ll practice! I’ll learn a card trick! Maybe a disappearing coin! Or...or a whole birthday cake!" “A disappearing cake is larceny, Bill, not magic.” She pointed at the spreadsheet. "Okay, let's focus. Venue: the community hall on Elm Street. Time: 2 pm, Saturday. Guest list – supposedly around thirty children, potential valuables unknown but estimated at… well, we’ll just call it ‘substantial.’ Your role, Magician Bill, is diversion. Keep the kids occupied. Sing songs, juggle beanbags, whatever it takes. Do *not* engage in any actual magic.” Bill looked crestfallen. “But… the rabbit!” “No rabbit. I'm going with the 'friendly aunt' approach. Slightly eccentric, possibly over-enthusiastic about face painting. 
Less attention-grabbing than a man in a sparkly waistcoat pulling doves from nowhere.” “Face painting? You? You can barely draw a stick figure.” “I’ll buy a stencil kit. And a lot of glitter. Now, how are we getting in and out? The place is usually swarming with parents armed with cameras and juice boxes." Bill snapped his fingers. “I scouted the place! There's a hatch in the back, used for loading supplies. It's usually unlocked. A little wiggle, a bit of leverage..." He demonstrated with a dramatic shoulder shimmy. Molly gave him a withering look. “And if it’s locked?” “We’ll… we’ll find another hatch! Or a window! Or… we’ll charm the birthday girl. Negotiate a trade. A handful of presents for a promise to teach her how to be a pirate!” Molly closed the spreadsheet with a decisive snap. “Bill, if we get caught trying to negotiate with a five-year-old, I am blaming you entirely.” “Hey! I’m resourceful!” “Resourceful and remarkably prone to spectacularly bad ideas. Just… try to look like you know what you’re doing. And for the love of all that is holy, no rabbits!”
Meta
created_at: "2026-02-20T19:40:54.857Z" (2/20/2026, 7:40:54 PM)
model: "google/gemma-3-12b-it"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 142
moderation_latency: (null)
generation_time: 14547
tokens_prompt: 93
tokens_completion: 832
native_tokens_prompt: 75
native_tokens_completion: 721
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.00009673
router: (null)
provider_responses:
  0:
    endpoint_id: "eb06dc92-5a16-47ec-a776-6ef956457c47"
    id: "chatcmpl-RvdNxWLZLIgdkJws88PqN2jx"
    is_byok: false
    latency: 51
    model_permaslug: "google/gemma-3-12b-it"
    provider_name: "DeepInfra"
    status: 200
api_type: "completions"
id: "gen-1771616454-64jVWwExiefuxxvVfIWQ"
upstream_id: "chatcmpl-RvdNxWLZLIgdkJws88PqN2jx"
total_cost: 0.00009673
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "DeepInfra"
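The metadata above is enough to derive rough throughput and a blended price per token. The following is a minimal sketch, assuming generation_time is in milliseconds (14547 ms is close to the 14.6s duration shown in the run details) and that total_cost is in US dollars; the page states neither unit, and the variable names are illustrative only.

```python
# Values copied from the run metadata.
generation_time_ms = 14547        # assumption: milliseconds
native_tokens_prompt = 75
native_tokens_completion = 721
total_cost_usd = 0.00009673       # assumption: US dollars

# Completion tokens generated per second of generation time.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

# Blended cost per million native tokens (prompt and completion together).
# Providers usually price input and output differently, so this is only
# an average for this particular run, not a published rate.
total_native_tokens = native_tokens_prompt + native_tokens_completion
cost_per_million_usd = total_cost_usd / total_native_tokens * 1_000_000

print(f"Throughput: {tokens_per_second:.1f} tokens/s")
print(f"Blended cost: ${cost_per_million_usd:.4f} per million native tokens")
```

Both figures describe this single run and will shift with prompt length, provider load, and the split between prompt and completion tokens.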
Evaluation details

Result: 43.38%
Evaluator: Matches word count
Details: n/a
Meta Data: neededClean: false, words: 483

Result: 9.83%
Evaluator: Dialogue to Total Word Ratio
Details: Ratio: 76.81%, Deviation: 6.81%
Meta Data: neededClean: false, wordsTotal: 483, wordsDialogue: 371

Overall: 26.6034%
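The reported ratio and deviation can be reproduced from the word counts in the evaluation metadata. This is a minimal sketch, assuming the target is the 70% named in the scenario and that the overall figure is a plain mean of the two evaluator scores; the page states neither formula. How each evaluator turns its measurement into 43.38% or 9.83% is not shown on the page and is not reproduced here.

```python
TARGET_RATIO = 0.70  # assumption: taken from the scenario "70% dialogue"

def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Share of all words that fall inside dialogue."""
    return words_dialogue / words_total

def deviation_points(ratio: float, target: float = TARGET_RATIO) -> float:
    """Absolute distance from the target, in percentage points."""
    return abs(ratio - target) * 100

# Word counts from the evaluation metadata.
ratio = dialogue_ratio(words_dialogue=371, words_total=483)
deviation = deviation_points(ratio)

# Assumption: the overall score is the mean of the two evaluator scores.
# The displayed scores are rounded, so this only approximates 26.6034%.
overall_estimate = (43.38 + 9.83) / 2

print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2f}%")
print(f"Mean of displayed scores: {overall_estimate:.3f}%")
```

The small gap between the mean of the displayed scores and the reported 26.6034% is consistent with the evaluator scores having been rounded to two decimals for display.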