NC Bench
Run: 2026-02-25-01KJ95P9RKMD6YKVM4BKKQWJ42
Run Details
Model: Gemini 2.5 Flash Lite (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 1m 4s
Cost: $0.004686
Provider: Google AI Studio
Evaluation summary
Score | Evaluator | Details Link
93.65% | Matches word count | (details)
1.45% | Dialogue to Total Word Ratio | (details)
47.5510% | Overall |
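The overall figure is consistent with an unweighted mean of the two evaluator scores. The page does not state the aggregation rule, so the sketch below treats it as an assumption and only checks it against the displayed numbers; `unweighted_mean` is an illustrative helper, not part of NC Bench.

```python
# Scores exactly as displayed in the evaluation summary (percent, already rounded).
scores = {
    "Matches word count": 93.65,
    "Dialogue to Total Word Ratio": 1.45,
}
reported_overall = 47.5510  # the "Overall" row


def unweighted_mean(values):
    """ASSUMPTION: the overall score is a plain average of evaluator scores."""
    values = list(values)
    return sum(values) / len(values)


overall = unweighted_mean(scores.values())
difference = abs(overall - reported_overall)

print(f"recomputed overall: {overall:.4f}%")
print(f"difference from reported: {difference:.4f}")
```

The recomputed value differs from the reported one by about 0.001 percentage points, which is what one would expect if the page averages unrounded scores and rounds the two rows only for display.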
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The mildewed wallpaper of their clandestine meeting room peeled in mournful strips, resembling the discarded husks of burnt-out dreams. Dust motes danced like miniature, impoverished sprites in the single shaft of light piercing the grimy window. Molly, radiating an almost unnerving, can-do energy, wielded a pink crayon like a weapon of mass instruction against the flimsy battlefield of a stolen napkin. Across the scarred pallet table, Bill nervously gnawed on the untouched end of a particularly sickly-looking doughnut, his eyes darting towards the door as if expecting an eight-year-old SWAT team to burst in at any moment. The napkin itself was a masterpiece of criminal intent: a crudely drawn bouncy castle, surrounded by stick figures labelled “Threats (Parents)” and small, tantalizing Xs marked “Potential Loot (Presents)”. A strategically placed, albeit smudged, drawing of a glitter bomb promised pandemonium. Molly tapped the crayon. “Right,” she announced, her voice crisp. “Operation: Birthday Bonanza. Infiltration.” Bill swallowed. “Subtle? Kids have radar for trouble. Their screams are sonic weapons.” Molly waved him off, her mind already miles ahead, possibly designing a tiny, wearable disguise for a hamster. “We blend. We have assets.” She nodded towards an overflowing recycling bin that seemed to double as a graveyard for failed celebrations. “Cousin Brenda’s kid’s party. Leftovers.” Bill’s brow furrowed. “Brenda’s kid? Wept for hours over frosting colour. Emotional range of a storm cloud.” “Precisely,” Molly confirmed, a predatory glint in her eye. “Perfect operational cover. Brenda misplaced invitations. We acquire them, arrive late, blame the railways, and offer to ‘help’.” “Help what? Sticky fingers, shrieking voices, passive-aggressive parents.” “Clear up. Manage the loot. Relocate it to a more secure, private holding facility,” Molly elaborated, her eyes gleaming. “Parents get overwhelmed. We’ll be the calm, capable ones. 
Inventory management experts.” Bill shuddered, a vision of glitter-bombed toddlers flashing behind his eyes. He found actual children far more unpredictable and terrifying than spectral lasers or ancient pressure plates. “Kids are more unpredictable than lasers. Whose birthday? Does it matter? Just a buffet.” Molly countered, already sketching a pair of enormous, lurid green foam fingers with intense concentration. “Does it look like it matters? These are our primary tools. Distraction. And small package palming. My cousin swore by them for, ah, *managing* toddlers.” Bill eyed the monstrously oversized green digits warily. “And the loot goes where? Hollowed-out teddy bears?” “For concealment. My cousin had spares from a previous… unfortunate incident,” Molly added vaguely, her mind already fixed on the prize. “Think of the haul, Bill. Lego. And the drone. Very exclusive. State of the art.” Bill sighed again, the faint smell of stale sugar doing little to boost his wavering morale. This whole operation felt precariously balanced on the edge of a sugar rush and a tantrum. “So, stolen invites, glitter, foam hands, hollowed-out bears, pilfering presents from kids and parents, for a drone?” Molly grinned, dusting imaginary glitter from her sleeve with a flourish. “Exactly. Simple. Elegant. Highly effective.”
created_at
"2026-02-25T01:12:53.289Z"
2/25/2026, 1:12:53 AM
model
"google/gemini-2.5-flash-lite"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1061
moderation_latency
(null)
generation_time
63633
tokens_prompt
77
tokens_completion
13022
native_tokens_prompt
67
native_tokens_completion
11699
native_tokens_completion_images
0
native_tokens_reasoning
11018
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"STOP"
usage
0.0046863
router
(null)
provider_responses
0
endpoint_id
"ce839073-aa24-4f29-8358-15b319bd05ec"
is_byok
false
latency
1061
model_permaslug
"google/gemini-2.5-flash-lite"
provider_name
"Google AI Studio"
status
200
api_type
"completions"
id
"gen-1771981973-1X0qZiOkIw8hpzWI1iPJ"
upstream_id
(null)
total_cost
0.0046863
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Google AI Studio"
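The metadata above can be cross-checked with a little arithmetic. The sketch below is hedged throughout: it assumes that `native_tokens_completion` includes the reasoning tokens, that `latency` and `generation_time` are in milliseconds and add up to the run duration, and that the per-token prices are $0.10 and $0.40 per million input and output tokens. None of these are stated on the page; the prices in particular are an assumption that happens to reproduce the recorded `total_cost`.

```python
# Values copied from the metadata above.
meta = {
    "native_tokens_prompt": 67,
    "native_tokens_completion": 11699,
    "native_tokens_reasoning": 11018,
    "latency": 1061,
    "generation_time": 63633,
    "total_cost": 0.0046863,
}

# ASSUMPTION: prices in USD per million tokens; not given on the page.
ASSUMED_USD_PER_M_INPUT = 0.10
ASSUMED_USD_PER_M_OUTPUT = 0.40

# ASSUMPTION: reasoning tokens are counted inside the completion total.
visible_tokens = meta["native_tokens_completion"] - meta["native_tokens_reasoning"]
reasoning_share = meta["native_tokens_reasoning"] / meta["native_tokens_completion"]

estimated_cost = (
    meta["native_tokens_prompt"] * ASSUMED_USD_PER_M_INPUT
    + meta["native_tokens_completion"] * ASSUMED_USD_PER_M_OUTPUT
) / 1_000_000

# ASSUMPTION: both timings are milliseconds and sum to the wall-clock duration.
elapsed_s = (meta["latency"] + meta["generation_time"]) / 1000
minutes, seconds = divmod(int(elapsed_s), 60)

print(f"visible (non-reasoning) completion tokens: {visible_tokens}")
print(f"share of completion spent on reasoning: {reasoning_share:.1%}")
print(f"estimated cost: ${estimated_cost:.7f} vs recorded ${meta['total_cost']:.7f}")
print(f"elapsed: {minutes}m {seconds}s")
```

Under these assumptions, roughly 94% of the billed completion tokens were reasoning tokens, leaving 681 tokens for the visible scene, and the elapsed time comes out at 1m 4s, matching the Duration field.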
Evaluation details
Result | Evaluator | Details | Meta Data
93.65% | Matches word count | n/a | neededClean: false, words: 491
1.45% | Dialogue to Total Word Ratio | Ratio: 39.20%, Deviation: 9.20% | neededClean: false, wordsTotal: 500, wordsDialogue: 196
47.5510%
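The ratio and deviation reported above follow directly from the recorded word counts and the 30% target in the scenario. The sketch below reproduces them and adds a hypothetical `dialogue_ratio` counter that treats text inside curly double quotes as dialogue. The evaluator's real counting rules are not shown on the page, and the mapping from deviation to the 1.45% score is not documented here, so neither is reproduced.

```python
import re

TARGET_RATIO = 0.30   # "30% dialogue" from the scenario
WORDS_TOTAL = 500     # wordsTotal from the evaluator meta data
WORDS_DIALOGUE = 196  # wordsDialogue from the evaluator meta data

ratio = WORDS_DIALOGUE / WORDS_TOTAL
deviation = abs(ratio - TARGET_RATIO)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")


def dialogue_ratio(text):
    """Hypothetical counter: words inside curly double quotes / all words.

    ASSUMPTION: whitespace-delimited words and curly-quote-delimited dialogue;
    the benchmark's own tokenisation may differ.
    """
    total = len(text.split())
    spoken = sum(len(quote.split()) for quote in re.findall(r"“([^”]*)”", text))
    return spoken / total if total else 0.0


sample = "Molly tapped the crayon. “Right,” she announced."
print(f"sample ratio: {dialogue_ratio(sample):.2%}")
```

The first print statement yields "Ratio: 39.20%, Deviation: 9.20%", matching the Details column. The two evaluators also report different totals (491 words versus 500), which suggests they count words differently; a quote-based counter like the one sketched here would likewise count quoted labels in narration as dialogue.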