NC Bench
Run: 2026-02-25-01KJ9545C5XRM4DDXJKH41R43B
Run Details
Model: Gemini 2.5 Flash Lite (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 15.1s
Cost: $0.001320
Provider: Google AI Studio
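The reported cost and duration can be cross-checked against the token counts listed in the response metadata further down the page (`native_tokens_prompt`, `native_tokens_completion`, `native_tokens_reasoning`, `generation_time`). The sketch below is one way to do that arithmetic. The per-million-token prices are an assumption, since this page does not state them, and so is the reading that `native_tokens_completion` already includes the reasoning tokens. The function name `estimate_cost` is illustrative only.

```python
# Assumed prices in USD per 1M tokens; NOT stated anywhere on this page.
PROMPT_USD_PER_M = 0.10
COMPLETION_USD_PER_M = 0.40


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the run cost in USD from native token counts."""
    total_micro_usd = (
        prompt_tokens * PROMPT_USD_PER_M
        + completion_tokens * COMPLETION_USD_PER_M
    )
    return total_micro_usd / 1_000_000


# Values copied from the response metadata of this run.
native_tokens_prompt = 67
native_tokens_completion = 3284
native_tokens_reasoning = 2565
generation_time_ms = 15075

cost = estimate_cost(native_tokens_prompt, native_tokens_completion)

# Assumption: completion tokens include reasoning tokens, so the
# remainder is what was spent on the visible response text.
visible_tokens = native_tokens_completion - native_tokens_reasoning

# Throughput over the whole generation, reasoning included.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(f"estimated cost: ${cost:.7f}")
print(f"visible completion tokens: {visible_tokens}")
print(f"throughput: {tokens_per_second:.1f} tokens/s")
```

Under these assumed prices the estimate agrees with the `total_cost` value of 0.0013203 recorded for the run, and it suggests that most of the billed completion tokens went to reasoning rather than to the scene itself.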
Evaluation summary
Score | Evaluator
99.98% | Matches word count
0.00% | Dialogue to Total Word Ratio
49.9920% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The blueprint was, generously speaking, a crime scene. Crumpled, stained with what appeared to be grape juice and possibly tears, it featured stick figures locked in combat with a giant, vaguely menacing sun. Molly poked a suspect smudge. "Bill, are we sure this is the right intel? Because this looks like it was drawn by a badger on speed, not a reconnaissance expert." Bill beamed, his chin practically resting on the paper. "That's Kevin's work, Mol. Expert recon. Four years old, impeccable accuracy. Said he’d draw us the 'danger zones' and the prize. See this big yellow blob with angry eyes?" He indicated the sun. "That's the bouncy castle. Apparently, it's a hotbed of uncontrolled energy. And *this*," he pointed to a lopsided 'X' near a scribbled house, "that's the treasure hoard." "The 'treasure hoard'," Molly repeated flatly, eyeing a splodge of blue crayon labelled 'POISON'. Molly mentally filed it under 'likely to be cake'. "And the opposition? Who are we up against in this 'treasure room'?" "The Grown-up Grumbles," Bill whispered, leaning in. "Parents. Nana Jean is the main threat – she's the one with the stare that could melt a disco ball." He gestured to a particularly ferocious scribble. "And over here, the 'Chaos Minions' – the kids. They’re unpredictable, but easily mollified by small, brightly coloured objects." Molly sighed, the scent of stale donut clinging to Bill’s jacket doing her no favours. "So our plan is to waltz into a den of miniature hooligans, mug Nana Jean’s glare, and pilfer what looks like a small nation’s worth of plastic toys, all based on a child's fever dream. Again." "It’s not just toys, Mol, it’s *potential*!" Bill enthused. "Think of the sheer, unadulterated glee on their little faces on Christmas morning. Then think of our payday. It’s practically a public service." He tapped the plan. "Entry via the conservatory. The Grumbles will be distracted by the barbecue smoke and the sheer volume of noise. 
Operation: Merry Mayhem is a go during the height of the 'Musical Statues' charade. Classic lull for the sentinels." "Musical Statues," Molly mused. "And the extraction? We’re not exactly inconspicuous, Bill. Imagine trying to discreetly burgle a Paw Patrol tower while a chorus of 'Twinkle Twinkle Little Star' is being warbled off-key." "That," Bill declared, puffing out his chest, "is where the advanced diversionary tactics come in. Kevin’s mum, bless her oblivious soul, is making a Victoria sponge. We'll 'accidentally' knock it onto the floor while pretending to admire it, creating a sticky, catastrophic mess. While the Grumbles are absorbed in the sponge-based calamity and the subsequent clean-up, we grab the loot. Smuggled in specially adapted toy sacks. Nobody suspects a toy sack." Molly looked at the drawing again – the angry sun, the poison cake, the chaotic scribbles. A slow, criminal smile spread across her face. "You know what, Bill? For a heist based on airborne sponge and operative toddlers, that might actually work. Just promise me we split the actual gold coins afterwards."
created_at
"2026-02-25T01:02:58.955Z"
2/25/2026, 1:02:58 AM
model
"google/gemini-2.5-flash-lite"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1043
moderation_latency
(null)
generation_time
15075
tokens_prompt
77
tokens_completion
1979
native_tokens_prompt
67
native_tokens_completion
3284
native_tokens_completion_images
0
native_tokens_reasoning
2565
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"STOP"
usage
0.0013203
router
(null)
provider_responses
0
endpoint_id
"ce839073-aa24-4f29-8358-15b319bd05ec"
is_byok
false
latency
1043
model_permaslug
"google/gemini-2.5-flash-lite"
provider_name
"Google AI Studio"
status
200
api_type
"completions"
id
"gen-1771981378-drQ7x8thYCFQJcyACvdT"
upstream_id
(null)
total_cost
0.0013203
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Google AI Studio"
Evaluation details
Result | Evaluator | Details | Meta Data
99.98% | Matches word count | n/a | neededClean: false, words: 498
0.00% | Dialogue to Total Word Ratio | Ratio: 72.91%, Deviation: 42.91% | neededClean: false, wordsTotal: 502, wordsDialogue: 366
49.9920% | Overall
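The ratio and deviation figures reported by the dialogue evaluator follow directly from its word counts. The sketch below reproduces them, assuming the deviation is the absolute difference from the 30% target named in the scenario. It also assumes the overall score is the unweighted mean of the two evaluator scores. That reading is consistent with the displayed values, but the page does not document how either score is actually computed, and the function names are illustrative only.

```python
def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Share of all words that appear inside dialogue."""
    return words_dialogue / words_total


def deviation_from_target(ratio: float, target: float) -> float:
    """Absolute distance between the measured ratio and the target."""
    return abs(ratio - target)


def mean_score(scores: list[float]) -> float:
    """Unweighted mean of evaluator scores (assumed aggregation rule)."""
    return sum(scores) / len(scores)


# Values copied from the evaluation details of this run.
words_total = 502
words_dialogue = 366
target = 0.30  # "30% dialogue" from the scenario

ratio = dialogue_ratio(words_dialogue, words_total)
deviation = deviation_from_target(ratio, target)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# Displayed (rounded) evaluator scores in percent.
overall = mean_score([99.98, 0.00])
print(f"Overall (from rounded scores): {overall:.2f}%")
```

The mean of the rounded scores comes out at 49.99%, which matches the displayed 49.9920% to two decimal places. The extra digits in the displayed value presumably come from an unrounded word-count score.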