NC Bench
Run: 2026-02-25-01KJ95XNQ2156PG7P6JYDKC5YA
Run Details
Model: Gemini 2.5 Flash Lite (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 1m 6s
Cost: $0.004317
Provider: Google AI Studio
Evaluation summary
Score | Evaluator
0.00% | Matches word count
0.00% | Dialogue to Total Word Ratio
0.0000% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The faded linoleum of Molly’s kitchen floor was, as usual, a landscape of past failures and present ambition. Crumpled takeaway menus competed for space with a meticulously annotated map of a suburban bungalow, labelled ‘Operation: Tiny Treasures’. Molly herself was hunched over, tracing a perilous path with a Biro-scarred finger, her expression one of grim determination usually reserved for untangling exceptionally knotted headphone cords. She was calculating ingress points, factoring in the notorious ‘Barnaby’ (described by their informant as ‘a furry tornado with a biscuit addiction’) and the formidable ‘Grandma Guard’ (a seventy-something knitting enthusiast with eyes like a hawk). The objective was deceptively simple: a children’s birthday party, a veritable mountain of brightly wrapped parcels, and a chaotic two-hour window of pure, unadulterated, sugar-fuelled mayhem. The sheer audacity of the plan appealed to Molly’s refined sense of criminal absurdity. Bill, meanwhile, was diligently polishing a single, pristine fork with a tea towel, its metallic gleam reflecting the dim kitchen light. "And the target is… kids' presents, Molly?" Bill ventured, his voice a low rumble. "Seems a bit… soft. Our usual calibre, this isn't." Molly snorted, her expression barely shifting. "Soft? Bill, we're performing vital de-cluttering services for the affluent. Think of it as specialised urban renewal for the toy box, saving parents from assembling plastic monstrosities. We're not just thieves; we're problem solvers. Barnaby is manageable with tactical biscuit deployment. Grandma Guard is predictable; her habits are our ally. Your part is critical: maintain Barnaby's focus with cheddar bribes and initiate the diversion. It’s essentially psychological warfare on a canine." Bill held the fork aloft, admiring its shine. "Psychological warfare. Right. So, lots of tiny, noisy distractions then? And the main haul?" "The living room, obviously," Molly confirmed, tapping the drawing. 
"Infiltrate as late-arriving guests, disguised as catering. I extract the parcels. You, create controlled pandemonium. Precisely timed, Bill. One slip–" "–And we're trapped in a saccharine meltdown with over-excited toddlers and their bouncy balls, I know," Bill finished, a surprisingly astute echo. "Got it. Symphony of timing, ballet of pilfering. Classic Molly. Let’s hope this isn’t another squirrel-related incident."
created_at: "2026-02-25T01:16:54.888Z" (2/25/2026, 1:16:54 AM)
model: "google/gemini-2.5-flash-lite"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1159
moderation_latency: (null)
generation_time: 66123
tokens_prompt: 77
tokens_completion: 15598
native_tokens_prompt: 67
native_tokens_completion: 10775
native_tokens_completion_images: 0
native_tokens_reasoning: 10291
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "STOP"
usage: 0.0043167
router: (null)
provider_responses:
0:
endpoint_id: "ce839073-aa24-4f29-8358-15b319bd05ec"
is_byok: false
latency: 1159
model_permaslug: "google/gemini-2.5-flash-lite"
provider_name: "Google AI Studio"
status: 200
api_type: "completions"
id: "gen-1771982214-3LQjBdH3WyViPstRP6dI"
upstream_id: (null)
total_cost: 0.0043167
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Google AI Studio"
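The metadata fields above can be cross-checked against the run details with a little arithmetic. The sketch below is illustrative only and rests on three assumptions that this page does not state: that generation_time is in milliseconds, that native_tokens_reasoning is counted inside native_tokens_completion, and that the model is billed at hypothetical rates of $0.10 per million input tokens and $0.40 per million output tokens.

```python
# Values copied from the run metadata above.
meta = {
    "generation_time": 66123,
    "native_tokens_prompt": 67,
    "native_tokens_completion": 10775,
    "native_tokens_reasoning": 10291,
    "total_cost": 0.0043167,
}

# Assumption: generation_time is in milliseconds.
minutes, seconds = divmod(meta["generation_time"] // 1000, 60)
duration = f"{minutes}m {seconds}s"

# Assumption: reasoning tokens are a subset of the native completion tokens,
# so the difference approximates the tokens of visible prose.
visible_tokens = meta["native_tokens_completion"] - meta["native_tokens_reasoning"]
reasoning_share = meta["native_tokens_reasoning"] / meta["native_tokens_completion"]

# Hypothetical per-million-token prices in USD; not stated on this page.
PRICE_IN, PRICE_OUT = 0.10, 0.40
estimated_cost = (
    meta["native_tokens_prompt"] * PRICE_IN
    + meta["native_tokens_completion"] * PRICE_OUT
) / 1_000_000

print(duration)
print(visible_tokens, f"{reasoning_share:.1%}")
print(f"{estimated_cost:.7f}")
```

Under these assumptions the duration agrees with the reported 1m 6s, and nearly all completion tokens were spent on reasoning rather than on the scene itself.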
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 352
0.00% | Dialogue to Total Word Ratio | Ratio: 50.70%, Deviation: 20.70% | neededClean: false, wordsTotal: 359, wordsDialogue: 182
0.0000%
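The reported ratio and deviation follow directly from the two word counts (182 dialogue words out of 359 total, against a 30% target). The sketch below shows one way such a metric could be computed. It is not NC Bench's actual evaluator: the function names are hypothetical, and it assumes dialogue is any text between straight double quotes and that words are whitespace-separated.

```python
import re


def dialogue_ratio(text):
    """Return (dialogue_words, total_words, ratio) for a piece of prose."""
    total = len(text.split())
    # Assumption: dialogue is whatever sits between pairs of straight double quotes.
    spoken = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(chunk.split()) for chunk in spoken)
    ratio = dialogue / total if total else 0.0
    return dialogue, total, ratio


def deviation_points(ratio, target=0.30):
    """Absolute distance from the target ratio, in percentage points."""
    return abs(ratio - target) * 100


# Recompute the reported figures from the counts listed in the meta data.
reported_ratio = 182 / 359
summary = f"Ratio: {reported_ratio:.2%}, Deviation: {deviation_points(reported_ratio):.2f}%"
print(summary)
```

Run against the reported counts, this reproduces the "Ratio: 50.70%, Deviation: 20.70%" line. The counting rules on real text may differ from the evaluator's, which could explain why the two evaluators report different totals (352 versus 359 words) for the same response.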