NC Bench
Run: 2026-02-25-01KJ94JX6X82YY53XQZEFHX4J1
Run Details
Model: Gemini 2.5 Flash Lite (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 11.2s
Cost: $0.000968
Provider: Google AI Studio
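The Duration and Cost figures above can be cross-checked against the token counts recorded in this run's Meta fields (native_tokens_prompt 67, native_tokens_completion 2403, generation_time 11159). The sketch below is only a sanity check, not the benchmark's own accounting. The per-million-token prices are an assumption (they do not appear on this page), as is reading generation_time as milliseconds; the function names are hypothetical.

```python
# Values copied from the Meta fields of this run.
NATIVE_TOKENS_PROMPT = 67
NATIVE_TOKENS_COMPLETION = 2403
GENERATION_TIME_MS = 11159  # ASSUMPTION: milliseconds (matches the 11.2s Duration)

# ASSUMPTION: USD list prices per one million tokens; not stated on this page.
PRICE_IN_PER_M = 0.10
PRICE_OUT_PER_M = 0.40


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the request cost in USD from token counts and assumed prices."""
    micro_usd = prompt_tokens * PRICE_IN_PER_M + completion_tokens * PRICE_OUT_PER_M
    return micro_usd / 1_000_000


def tokens_per_second(completion_tokens: int, generation_ms: int) -> float:
    """Average completion throughput over the generation time."""
    return completion_tokens / (generation_ms / 1000)


cost = estimate_cost(NATIVE_TOKENS_PROMPT, NATIVE_TOKENS_COMPLETION)
tps = tokens_per_second(NATIVE_TOKENS_COMPLETION, GENERATION_TIME_MS)

print(f"estimated cost: ${cost:.7f}")
print(f"throughput: {tps:.1f} tokens/s")
```

Under these assumed prices the estimate agrees with the reported total_cost of 0.0009679, which suggests the reasoning tokens are billed as part of the completion count.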
Evaluation summary
Score | Evaluator
98.71% | Matches word count
0.00% | Dialogue to Total Word Ratio
49.3562% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly adjusted the battered binoculars, her eye twitching. "There. The pink monstrosity. The unmistakable scent of stale cake and existential dread emanating from that front door. Target acquired." Bill grunted, wrestling with a stubborn piece of sticky toffee. "Looks like a standard juvenile assembly point. What's the acquisition window, boss?" "The window is when the shrieking contingent is pacified. According to the intel – Mrs. Henderson bragging on the neighbourhood forum – the high-value targets are being presented now, stage left, near the slightly deflated caterpillar bouncy castle." "Caterpillar bouncy castle. Excellent. Distraction potential is high. They’ll be too busy avoiding suction-to-the-ribs to notice a professional removing the goods," Bill said thoughtfully. "So, the plan is still the cake-related infiltration?" "No, you numbskull! The cake-related infiltration was *last* week's aborted attempt at the toddler spelling bee. This is the ‘Slightly Over-Caffeinated Clown’ gambit. I wear the oversized shoes and the squeaky nose, you… you stay in the van and look menacing." Bill shifted uncomfortably. "Menacing? But I’m not feeling particularly menacing. I had a lovely omelette this morning. Quite Zen. Can’t I be... a menacing balloon animal artist? Or a silently judging mime?" "Just stay put and be ready. My cue is when the birthday boy is distracted by his parents doing that embarrassing 'Happy Birthday' song. That's your signal to begin Operation: Exploded Party Poppers." "Exploded Party Poppers? That sounds… loud. Messy. And potentially traumatising for the target demographic," Bill mused. "Will there be glitter?" "There will be *all* the glitter, Bill. You’re going to sneak to the side fence and set off three industrial-sized party poppers directly under the neighbour’s window. The loud bang, confetti blizzard, potential ensuing scuffle with Mr. Henderson and his prize-winning petunias – it’ll draw every adult outside for at least ninety seconds." 
"Ninety seconds. That’s ample time for a swift present liberation," Bill confirmed. "So, I blow up the poppers, they come out, you grab the goods. What if they think the poppers are part of the show? What if they *applaud*?" "They will not applaud the neighbour's sudden gardening crisis, Bill. They'll be too busy confiscating Mr. Henderson's secateurs. While they're dealing with the fallout, I'll be making my exit, bag full. Then it's straight back to the van, easy peasy." "Easy peasy. Right. And after the acquisition? Do we, you know, re-wrap them? I always feel bad about the ripping of the paper. Especially the ones with the shiny dinosaurs." Molly sighed, adjusting her imaginary oversized bow tie. "We sell them, Bill. That's the point of a heist. Now, do you have the lock picks I gave you? The tiny ones that look like sugar strands?" "Sugar strands! Yes! I have them right here! They're quite delicious, actually. Do you think if I taste-test one now, it'll look suspicious if I don't finish it later?" "Just put the sugar strand lock picks in your pocket, Bill. And try not to eat them until *after* we've escaped with the bounty. This isn't a reconnaissance mission for culinary delights, it's grand larceny."
created_at: "2026-02-25T00:53:33.539Z" (2/25/2026, 12:53:33 AM)
model: "google/gemini-2.5-flash-lite"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1334
moderation_latency: (null)
generation_time: 11159
tokens_prompt: 77
tokens_completion: 2845
native_tokens_prompt: 67
native_tokens_completion: 2403
native_tokens_completion_images: 0
native_tokens_reasoning: 1640
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "STOP"
usage: 0.0009679
router: (null)
provider_responses
0
endpoint_id: "ce839073-aa24-4f29-8358-15b319bd05ec"
is_byok: false
latency: 1334
model_permaslug: "google/gemini-2.5-flash-lite"
provider_name: "Google AI Studio"
status: 200
api_type: "completions"
id: "gen-1771980813-ACq3UzSQg9FhPiuBMzQm"
upstream_id: (null)
total_cost: 0.0009679
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Google AI Studio"
Evaluation details
Result | Evaluator | Details | Meta Data
98.71% | Matches word count | n/a | neededClean: false, words: 506
0.00% | Dialogue to Total Word Ratio | Ratio: 93.02%, Deviation: 23.02% | neededClean: false, wordsTotal: 516, wordsDialogue: 480
49.3562% | Overall
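The Ratio and Deviation figures reported for the dialogue evaluator follow directly from wordsDialogue and wordsTotal, given the 70% target in the scenario. The sketch below reproduces that arithmetic. It also shows that the Overall figure is consistent with an unweighted mean of the two evaluator scores; that reading is an assumption inferred from the numbers, not something the page states. How a 23.02% deviation is mapped to a 0.00% score is not shown here and is not modelled. Function names are hypothetical.

```python
from statistics import mean

# Values copied from the evaluation details above.
WORDS_TOTAL = 516
WORDS_DIALOGUE = 480
TARGET_RATIO = 0.70  # from the scenario: "70% dialogue"
EVALUATOR_SCORES = [98.71, 0.00]  # percentages as displayed (rounded)


def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Share of all words that were counted as dialogue."""
    return words_dialogue / words_total


def deviation_from_target(ratio: float, target: float) -> float:
    """Absolute distance between the measured ratio and the target ratio."""
    return abs(ratio - target)


ratio = dialogue_ratio(WORDS_DIALOGUE, WORDS_TOTAL)
dev = deviation_from_target(ratio, TARGET_RATIO)

# ASSUMPTION: Overall is an unweighted mean of the evaluator scores.
overall = mean(EVALUATOR_SCORES)

print(f"ratio: {ratio:.2%}, deviation: {dev:.2%}")
print(f"overall (from rounded scores): {overall:.3f}%")
```

The mean of the rounded scores lands within rounding distance of the displayed 49.3562%, so the small difference is most likely due to the word-count score being stored with more digits than the 98.71% shown.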