NC Bench
Run: 2026-02-25-01KJ9549SXN2S88RXGMYPS0YGA
Run Details
Model: Gemini 2.5 Flash Lite (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 12.9s
Cost: $0.001084
Provider: Google AI Studio
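The Cost figure above can be sanity-checked against the native token counts recorded in the run metadata (native_tokens_prompt 67, native_tokens_completion 2693, native_tokens_reasoning 1911). The sketch below is only an illustration: the per-million-token prices are an assumption (they are not stated on this page), and it also assumes the reasoning tokens are counted inside native_tokens_completion.

```python
# Token counts copied from the run metadata.
NATIVE_TOKENS_PROMPT = 67
NATIVE_TOKENS_COMPLETION = 2693
NATIVE_TOKENS_REASONING = 1911

# ASSUMPTION: prices in USD per million tokens; not given on this page.
ASSUMED_INPUT_PRICE_PER_M = 0.10
ASSUMED_OUTPUT_PRICE_PER_M = 0.40


def estimate_cost(prompt_tokens, completion_tokens, input_price, output_price):
    """Return the estimated cost in USD for one request."""
    return (prompt_tokens * input_price + completion_tokens * output_price) / 1_000_000


cost = estimate_cost(
    NATIVE_TOKENS_PROMPT,
    NATIVE_TOKENS_COMPLETION,
    ASSUMED_INPUT_PRICE_PER_M,
    ASSUMED_OUTPUT_PRICE_PER_M,
)

# Share of completion tokens spent on reasoning (assumes reasoning tokens
# are included in the completion count).
reasoning_share = NATIVE_TOKENS_REASONING / NATIVE_TOKENS_COMPLETION
visible_tokens = NATIVE_TOKENS_COMPLETION - NATIVE_TOKENS_REASONING

print(f"estimated cost: ${cost:.7f}")
print(f"reasoning share of completion tokens: {reasoning_share:.1%}")
print(f"non-reasoning completion tokens: {visible_tokens}")
```

Under these assumed prices the estimate agrees with the reported usage value of 0.0010839, and most of the completion tokens go to reasoning rather than to the visible scene.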
Evaluation summary
Score | Evaluator
86.38% | Matches word count
0.00% | Dialogue to Total Word Ratio
43.1902% | Overall
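The overall figure is consistent with a plain average of the two evaluator scores. The sketch below assumes an unweighted mean; the page does not state the aggregation rule, and the function name is hypothetical.

```python
# Evaluator scores as displayed in the summary, in percent.
scores = {
    "Matches word count": 86.38,
    "Dialogue to Total Word Ratio": 0.00,
}


def overall_score(evaluator_scores):
    """Unweighted mean of the evaluator scores (ASSUMED aggregation rule)."""
    return sum(evaluator_scores.values()) / len(evaluator_scores)


overall = overall_score(scores)
print(f"{overall:.2f}%")
```

The displayed 43.1902% differs from this result only in the trailing digits, which suggests the word-count score is rounded for display.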
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly smoothed the crumpled diagram on the table. It looked suspiciously like a child’s drawing of a house, complete with crayon sunbeams. Bill peered at it, a frown etched between his eyebrows. Molly: So, the target is little Timmy’s seventh birthday bash. Saturday, two PM. Location: fourteen Acacia Avenue. Entry point is the west-facing window, adjacent to the prize-winning petunias. Bill: Petunias? You want us to dive through *petunias*? What if they lead with their faces? Molly: It’s about precision, Bill. Minimal foliage disturbance. The prize is the loot – a veritable cornucopia of unattended joy. Bill: Unattended joy. You make it sound like we’re rescuing puppies, not pilfering presents. Look, this is the tricky part. We go in, grab the gifts, and what? We just waltz out the front door with a sack full of action figures and Lego sets? Molly: Naturally. We’ll need a distraction. I was thinking a coordinated sing-along of ‘Happy Birthday’ at precisely 2:17 PM. While everyone’s focused on the cake and the candles, we make our move. Bill: Sing-along? I can’t carry a tune. And what if one of the tiny tyrants spots us? They have the observational skills of seasoned detectives, but with more shrieking. Molly: We’ll be in disguise. Think subtle. Perhaps, uh, ‘overly enthusiastic party clowns.’ Bill: Clowns? Molly, my face is not built for latex and squeaky noses. The last time I tried, I looked less like a jolly jester and more like a startled badger. Molly: Fine, fine. Not clowns. We’ll just… be very quiet. Ninja-like. Like shadows. Very well-dressed shadows. The parents will be too busy dealing with the sugar rush. Bill: And what about the presents themselves? Are we just going to shove them in a hessian sack? Imagine the glitter. The *shame* of it all. A hardened criminal like me, covered in fairy dust. It’s undignified. Molly: Dignity is for mugs, Bill. This is about profit. Think of the resale value on a brand-new Paw Patrol tower. 
Plus, the sheer audacity of it! Stealing joy from the mouths of babes. It’s… poetic. Bill: Poetic? It’s criminal. What if Timmy’s dad has a Rottweiler? Or worse, a man-child nephew fuelled by fizzy pop and rage? Molly: The dog is a bichon frise named Princess Fluffybutt. I’ve checked the social media. Essentially a sentient dust bunny. As for the nephew, intel suggests he’ll be engrossed in the video game console. Minimal threat. Bill: Princess Fluffybutt. Right. And you’re sure the window isn’t locked? And that there isn’t, like, a security system made of tin foil and Christmas lights? Molly: All systems nominal. The only real threat is Brenda, Timmy’s mother. She’s got eyes like a hawk and a voice that could curdle milk. We’ll need to be swift. In and out before she notices the deficit in festive cheer. Bill: Brenda. Right. So, we’re robbing a seven-year-old’s birthday party because Brenda might be distracted by a singing caterpillar. This is the lowest we’ve sunk, Molly. Molly: It’s not sinking, Bill. It’s… strategic diversification of our portfolio. Now, the escape route…
created_at: "2026-02-25T01:03:03.491Z"
model: "google/gemini-2.5-flash-lite"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1086
moderation_latency: (null)
generation_time: 12811
tokens_prompt: 77
tokens_completion: 2672
native_tokens_prompt: 67
native_tokens_completion: 2693
native_tokens_completion_images: 0
native_tokens_reasoning: 1911
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "STOP"
usage: 0.0010839
router: (null)
provider_responses:
  0:
    endpoint_id: "ce839073-aa24-4f29-8358-15b319bd05ec"
    is_byok: false
    latency: 1086
    model_permaslug: "google/gemini-2.5-flash-lite"
    provider_name: "Google AI Studio"
    status: 200
api_type: "completions"
id: "gen-1771981383-hcciDvBKugAevSfgiw6v"
upstream_id: (null)
total_cost: 0.0010839
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Google AI Studio"
Evaluation details
Result | Evaluator | Details | Meta Data
86.38% | Matches word count | n/a | neededClean: false; words: 511
0.00% | Dialogue to Total Word Ratio | Ratio: 1.92%, Deviation: 68.08% | neededClean: false; wordsTotal: 520; wordsDialogue: 10
43.1902% | Overall
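The ratio and deviation reported for the dialogue evaluator follow from the word counts in its metadata and the 70% target in the scenario. A minimal sketch, assuming the deviation is the absolute difference between the target and the measured ratio (the page does not define it); the function name is hypothetical.

```python
def dialogue_ratio_report(words_dialogue, words_total, target_percent):
    """Return (ratio, deviation) in percent.

    ASSUMPTION: deviation is the absolute gap between target and measured ratio.
    """
    ratio = 100.0 * words_dialogue / words_total
    deviation = abs(target_percent - ratio)
    return ratio, deviation


# wordsDialogue and wordsTotal from the evaluator metadata; 70% from the scenario.
ratio, deviation = dialogue_ratio_report(10, 520, 70.0)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
```

With 10 dialogue words out of 520, the measured ratio is far below the 70% target, which matches the 0.00% score for this evaluator.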