NC Bench
Run: 2026-02-25-01KJ95GPJB1EAM9EXEQ8PBPQGJ
Run Details
Model: Gemini 2.5 Flash Lite (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 13.2s
Cost: $0.001438
Provider: Google AI Studio
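The cost shown here can be cross-checked against the native token counts recorded in this run's metadata (`native_tokens_prompt`, `native_tokens_completion`, `native_tokens_reasoning`). The sketch below is a rough reconstruction, not the billing logic of the provider: the per-million-token rates are assumptions that do not appear on this page, chosen only because they reproduce the recorded total, and it assumes reasoning tokens are counted inside the completion total and billed at the output rate.

```python
# Token counts copied from the run metadata (native_* fields).
NATIVE_PROMPT_TOKENS = 67
NATIVE_COMPLETION_TOKENS = 3578   # appears to include the reasoning tokens
NATIVE_REASONING_TOKENS = 2874

# ASSUMPTION: USD per one million tokens. These rates are not stated on the page.
ASSUMED_INPUT_RATE = 0.10
ASSUMED_OUTPUT_RATE = 0.40


def estimate_cost(prompt_tokens, completion_tokens, input_rate, output_rate):
    """Return an estimated cost in USD from token counts and per-million rates."""
    return (prompt_tokens * input_rate + completion_tokens * output_rate) / 1_000_000


cost = estimate_cost(
    NATIVE_PROMPT_TOKENS,
    NATIVE_COMPLETION_TOKENS,
    ASSUMED_INPUT_RATE,
    ASSUMED_OUTPUT_RATE,
)

# Tokens left for the visible answer once reasoning is subtracted.
visible_tokens = NATIVE_COMPLETION_TOKENS - NATIVE_REASONING_TOKENS
reasoning_share = NATIVE_REASONING_TOKENS / NATIVE_COMPLETION_TOKENS

print(f"estimated cost: ${cost:.7f}")
print(f"visible completion tokens: {visible_tokens}")
print(f"reasoning share of completion: {reasoning_share:.1%}")
```

Under these assumptions the estimate agrees with the recorded `usage` value of 0.0014379, and most of the billed completion tokens are reasoning rather than visible prose.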
Evaluation summary
Score      Evaluator
3.62%      Matches word count
0.00%      Dialogue to Total Word Ratio
1.8117%    Overall
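The overall figure is consistent with an unweighted mean of the two evaluator scores. That aggregation rule is inferred from the numbers, not stated on the page, so treat the following as a sketch under that assumption. The function name `overall_score` is hypothetical.

```python
def overall_score(scores):
    """Unweighted mean of evaluator scores (ASSUMED aggregation rule)."""
    if not scores:
        raise ValueError("at least one evaluator score is required")
    return sum(scores) / len(scores)


# Scores as displayed on the page, in percent (already rounded to 2 decimals).
displayed_scores = [3.62, 0.00]
overall = overall_score(displayed_scores)

print(f"overall from displayed scores: {overall:.2f}%")
```

The displayed scores average to 1.81%, while the page reports 1.8117%. The small gap would be explained if the word-count score is stored with more precision than the two decimals shown.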
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The bouncy castle vibrated with the unrestrained joy of a dozen sugar-fueled seven-year-olds. Molly squinted through the smudged window of their strategically parked delivery van, a grimace twisting her lips. Pink streamers drooped from a nearby lamppost like deflated aspirations. A mountain of brightly wrapped boxes sat precariously stacked by the garden shed. This was it. Operation: Santa's Little Swipers. "You sure about this, Mol?" Bill mumbled, clutching a lukewarm coffee. "Seems a bit… low-hanging fruit, doesn't it? And tiny fruit, at that." Molly turned, her eyes glinting. "Bill, 'low-hanging fruit' is how we pay the rent. And the rent is due. Besides, the intel is solid. The prize is rich, and the security is… well, it’s a trampoline with a wobbly ladder." Bill shifted on the worn seat, eyeing the children assembling for a game of 'Pin the Tail on the Donkey' with the intensity of a drill sergeant contemplating a particularly recalcitrant recruit. He pointed a finger, sticky with what might have been yesterday’s jam. "What if," Bill began, leaning closer, his breath fogging the glass, "we use the distraction of that Pin-the-Tail thing? While they're all giggling blindfolded, I go in, dressed as a giant, rogue balloon animal. A T-Rex, maybe. Kids love T-Rexes. They'd flock to me, herd them into the shed, lock 'em in, grab the goods, and be out before anyone noticed the distinct lack of squeaking." Molly didn't even turn. She traced a route on a crumpled garden map. "Bill, that’s the worst idea you’ve had since the goldfish heist. You don't know how to make a T-Rex, you'd probably try to eat the balloons, and we're stealing presents, not organizing a mobile zoo for future witnesses. We need subtlety." She tapped the shed door on her map. "Phase one: The Infiltration. You wait for the cake. Maximum sugar comatose. Then, I deploy the 'Distract-o-matic' – me tripping over a hose pipe, very deliberately. 
While everyone’s distracted by my theatrical clumsiness, you slip out, blend into the shadows behind the rhododendrons, and perform the swap. Presents out, decoy bags in." "Decoy bags?" Bill perked up. "What's in 'em?" "Empty boxes and old jumpers," Molly said flatly. "The important thing is the *weight*. They won't know the difference until way after we're gone. Now, the tricky part. That little ginger terror with the bright green wellies? He's the birthday boy's designated 'Parental Oversight Unit.' Eyes like a hawk, scream like a banshee. Avoid him at all costs." Bill gulped. "The one with the perpetually sticky hands? Is he… is he already eyeing the spoils?" Molly sighed, pulling out oversized sunglasses and a comically large sunhat from under the dashboard. "That, Bill, is why we wear disguises. Phase two: The Grand Escape – disguised as incompetent parents trying to find their own child. Now, stop looking so guilty and try this hat on."
created_at: "2026-02-25T01:09:49.777Z" (2/25/2026, 1:09:49 AM)
model: "google/gemini-2.5-flash-lite"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1034
moderation_latency: (null)
generation_time: 13145
tokens_prompt: 77
tokens_completion: 2109
native_tokens_prompt: 67
native_tokens_completion: 3578
native_tokens_completion_images: 0
native_tokens_reasoning: 2874
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "STOP"
usage: 0.0014379
router: (null)
provider_responses:
  0:
    endpoint_id: "ce839073-aa24-4f29-8358-15b319bd05ec"
    is_byok: false
    latency: 1034
    model_permaslug: "google/gemini-2.5-flash-lite"
    provider_name: "Google AI Studio"
    status: 200
api_type: "completions"
id: "gen-1771981789-NGzVFth3x5tQjLJUCAkE"
upstream_id: (null)
total_cost: 0.0014379
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Google AI Studio"
Evaluation details
Result     Evaluator                       Details                             Meta Data
3.62%      Matches word count              n/a                                 neededClean: false; words: 476
0.00%      Dialogue to Total Word Ratio    Ratio: 68.45%, Deviation: 18.45%    neededClean: false; wordsTotal: 485; wordsDialogue: 332
1.8117%    Overall
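The ratio and deviation reported for the dialogue evaluator follow directly from the recorded word counts and the 50% target in the scenario. A minimal sketch of that arithmetic is below. How the evaluator maps the deviation to a 0.00% score is not shown on the page and is not modelled here.

```python
# Values copied from the evaluation meta data.
WORDS_TOTAL = 485
WORDS_DIALOGUE = 332

# Target taken from the scenario: "Write 500 words with 50% dialogue".
TARGET_RATIO_PERCENT = 50.0


def dialogue_ratio_percent(words_dialogue, words_total):
    """Share of dialogue words in the whole text, in percent."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return 100.0 * words_dialogue / words_total


ratio = dialogue_ratio_percent(WORDS_DIALOGUE, WORDS_TOTAL)
deviation = abs(ratio - TARGET_RATIO_PERCENT)

print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
# prints "Ratio: 68.45%, Deviation: 18.45%"
```

The two evaluators report different totals for the same response (476 words versus 485), which suggests they split or clean words differently. The page does not say how.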