NC Bench
Run: 2026-02-20-01KHY8NZJNBAQZGGPYM9VD6W2V
Run Details
Model: Gemma 3 4B
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 7.7s
Cost: $0.000054
Provider: DeepInfra
Evaluation summary
Score | Evaluator
0.00% | Matches word count
0.00% | Dialogue to Total Word Ratio
0.0000% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
“Right, so, the glitter cannons first?” Molly asked, tapping a manicured nail against a meticulously drawn floor plan of ‘Princess Penelope’s Perfectly Pink Party’. Bill chewed on the end of his pen, a smear of blue ink blossoming on his upper lip. “Glitter cannons? Seriously? We’re going for presents, not a rave.” “Details, details,” she waved a dismissive hand. “A little distraction. Kids love glitter. It'll slow them down. Think of it as… a confetti-based tactical delay.” “Tactical delay? You’ve been reading too many spy novels,” Bill grumbled. “Just get the damn presents.” “Okay, okay. Presents. But glitter. It's *atmospheric*.” Molly punctuated this with a little flourish of her hand. “Look, the cake is in the corner, all pink and sparkly. It’s practically begging to be observed.” “The cake isn’t begging for anything,” Bill said dryly. “It’s a sugar-laden monstrosity. And we’re after the stuffed unicorns, Molly. Remember the brief?” “Of course I remember the brief!” she snapped, though her voice lacked conviction. “Operation Rainbow Riot is still firmly in my brain. Pink frosting, glistening wrappers, the sheer volume of plastic joy… irresistible.” “Irresistible to a child, maybe. Not to a professional thief.” “You’re being unnecessarily pessimistic.” Molly unfolded a small photograph of a particularly plush unicorn, its mane braided with ribbons. “This, Bill, this embodies pure, unadulterated wealth. We're talking serious resale value.” “Resale value? Molly, we’re stealing toys for five-year-olds.” “Think of it as… early investment. We’ll pawn them, flip them. It’s a sophisticated system!” Bill sighed, running a hand through his already messy hair. “Alright. Glitter cannons. Unicorns. What about the parents? Are we blending in?” “I’ve got a princess dress. A very sparkly one. And a tiara. Needs more sparkle, actually.” “You’re wearing a tiara?” “It’s crucial to the illusion! I’m Penelope’s aunt Mildred. 
A slightly eccentric, excessively glittery aunt Mildred.” “You sound terrifying.” “I am terrifyingly fabulous! Look, they’ll be focused on the bouncy castle and the face painting. Minimal perimeter security.” “Minimal perimeter security? That’s your assessment?” “It’s an assessment! And it’s a *good* assessment. Now, about distracting them with the cannons... should we go for a full-frontal assault, or a more… gradual deployment?” Bill stared at the floor plan, then back at Molly, a flicker of reluctant amusement in his eyes. “Gradual. Definitely gradual. And maybe, just maybe, we grab a cupcake.”
created_at: "2026-02-20T19:33:29.825Z" (2/20/2026, 7:33:29 PM)
model: "google/gemma-3-4b-it"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 162
moderation_latency: (null)
generation_time: 7525
tokens_prompt: 93
tokens_completion: 733
native_tokens_prompt: 75
native_tokens_completion: 636
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.00005388
router: (null)
provider_responses:
  0:
    endpoint_id: "d3de3bd4-81bc-48fb-924f-2a87b2a36e75"
    id: "chatcmpl-RfHraK6VXSnDxolOTD1qVCc4"
    is_byok: false
    latency: 48
    model_permaslug: "google/gemma-3-4b-it"
    provider_name: "DeepInfra"
    status: 200
api_type: "completions"
id: "gen-1771616009-EP81kdwY7N2PD7BZSwIA"
upstream_id: "chatcmpl-RfHraK6VXSnDxolOTD1qVCc4"
total_cost: 0.00005388
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "DeepInfra"
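The headline Duration and Cost figures under Run Details appear to follow from these metadata fields. The sketch below checks that reading. It assumes that latency and generation_time are in milliseconds, that Duration is their sum, and that Cost is usage rounded to six decimal places. These relationships are inferred from the numbers on this page, not documented behaviour. The tokens-per-second figure is an extra derived value that the page itself does not report.

```python
# Values copied from the run metadata on this page.
latency_ms = 162
generation_time_ms = 7525
usage_usd = 0.00005388
native_tokens_completion = 636

# Assumption: displayed duration = latency + generation time (both in ms).
duration_s = (latency_ms + generation_time_ms) / 1000

# Derived throughput; not shown on the page, included for illustration only.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(f"Duration: {duration_s:.1f}s")        # matches the 7.7s shown in Run Details
print(f"Cost: ${usage_usd:.6f}")             # matches the $0.000054 shown in Run Details
print(f"Throughput: {tokens_per_second:.1f} tokens/s")
```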
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 394
0.00% | Dialogue to Total Word Ratio | Ratio: 74.37%, Deviation: 24.37% | neededClean: false, wordsTotal: 394, wordsDialogue: 293
0.0000%
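The reported ratio and deviation can be reproduced from wordsTotal and wordsDialogue. The following is a minimal sketch, not the evaluator's actual code. It assumes words are whitespace-separated tokens, that dialogue is any text between curly quotes, and that the deviation is the absolute distance from the 50% target in the scenario. The helper names are hypothetical. How the deviation is turned into the 0.00% score is not shown on this page and is not modelled here.

```python
import re


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (assumed word definition)."""
    return len(text.split())


def count_dialogue_words(text: str) -> int:
    """Count words inside curly-quoted spans (assumed dialogue definition)."""
    spans = re.findall(r"“([^”]*)”", text)
    return sum(count_words(span) for span in spans)


def dialogue_stats(words_dialogue: int, words_total: int, target: float = 0.5):
    """Return (ratio, absolute deviation from the target ratio)."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


# Figures from the Meta Data column: wordsDialogue = 293, wordsTotal = 394.
ratio, deviation = dialogue_stats(293, 394)
summary = f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}"
print(summary)
```

With the figures above, the printed line matches the Details column. The response also contains 394 words against the 500 requested, which corresponds to the failed word-count check.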