NC Bench
Run: 2026-02-25-01KJ953WZFCZZY7QYK90STX48Z
Run Details
Model: Gemini 2.5 Flash Lite (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 27.1s
Cost: $0.002278
Provider: Google AI Studio
Evaluation summary
Score | Evaluator | Details Link
0.00% | Matches word count | (details)
0.00% | Dialogue to Total Word Ratio | (details)
0.0000% | Overall |
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly smoothed the crumpled blueprint, which was frankly just a napkin with a crayon-drawn house. The smell of stale coffee warfare hung heavy in the cramped room beneath the abandoned pet shop. Bill, perched on an overturned bucket, was meticulously trying to get a bright pink unicorn to shed. It wasn't a particularly successful endeavour. He hummed a tuneless ditty that sounded suspiciously like a warped version of Twinkle, Twinkle Little Star. Alright, Bill, Molly began, tracing a smudged circle with her finger. Operation Sparkle Pony is a go. Or rather, Operation Pile of Stuff. It's the Henderson birthday bash. Little Timmy, turning eight. They've got the bouncy castle booked, the cake is rumoured to be shaped like a dragon, and the presents? Oh, Bill, the sheer volume. Bill paused his unicorn abuse. Dragon cake? Nice. Is it spicy? It's frosting, Bill, Molly sighed, pinching the bridge of her nose. Focus. This haul could set us up for months. We’re talking Lego sets, video games, probably enough stuffed animals to start a small, disreputable zoo. Can I have the dragon cake? Bill asked, hopeful. No. You can have a novelty toothbrush if we're lucky. The key here is infiltration. The parents will be distracted by Timmy's sugar-fueled rampage and Mildred Henderson's notoriously potent punch. We go in disguised as… I need to figure that part out. Costumed entertainers? Bill suggested, his eyes widening. I could be a giant, melancholy badger. It's a niche market, but I know it plays well with the under-ten crowd. They appreciate interpretive weeping. Molly stared at him. A badger. Right. No, I was thinking more along the lines of… delivery people. Or lost relatives. Something mundane. We need to blend in, grab the loot while they're watching Timmy try to eat the bouncy castle, and be gone before the thank-you notes are even conceived. But what if they ask us for ID? Bill countered, now attempting to make the unicorn's mane stand on end. 
My fake ID is for a dentist in Slough. They might get suspicious if a badger asks to examine their molars. We're not being badgers! We're going to be subtle. We'll use the back entrance, time it for the cake cutting. You take point, clear the path through the inevitable minefield of discarded socks and juice boxes. I'll handle the packing, Molly instructed. And for heaven's sake, try not to look like you're about to abscond with their entire childhood. Bill nodded gravely, finally snapping the unicorn's horn off. Got it. Subtle. Like a ninja. A ninja who really, really likes shiny new things. He grinned, a gap-toothed, terrifyingly eager expression. So, stealth badger it is then? Molly just closed her eyes, and the faint scent of burnt toast seemed to get stronger.
created_at
"2026-02-25T01:02:50.357Z"
2/25/2026, 1:02:50 AM
model
"google/gemini-2.5-flash-lite"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
988
moderation_latency
(null)
generation_time
27048
tokens_prompt
77
tokens_completion
5401
native_tokens_prompt
67
native_tokens_completion
5679
native_tokens_completion_images
0
native_tokens_reasoning
5059
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"STOP"
usage
0.0022783
router
(null)
provider_responses
0
endpoint_id
"ce839073-aa24-4f29-8358-15b319bd05ec"
is_byok
false
latency
988
model_permaslug
"google/gemini-2.5-flash-lite"
provider_name
"Google AI Studio"
status
200
api_type
"completions"
id
"gen-1771981370-JQiYVMVRrFMNXHGjsF6t"
upstream_id
(null)
total_cost
0.0022783
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Google AI Studio"
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false; words: 462
0.00% | Dialogue to Total Word Ratio | Ratio: 0.00%, Deviation: 50.00% | neededClean: false; wordsTotal: 467; wordsDialogue: 0
0.0000%
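The wordsDialogue value of 0 is consistent with the response containing no quotation marks: every spoken line is written as bare prose. The evaluator's implementation is not shown on this page, so the following is only a minimal sketch of how such a ratio could be computed, assuming dialogue is identified purely by double quotation marks and that deviation is the absolute difference from the 50% target. The names `dialogue_ratio` and `deviation` are hypothetical, and whitespace splitting will not necessarily reproduce the 462 and 467 word counts reported above.

```python
import re

# Assumption: dialogue means text inside straight or curly double quotes.
QUOTED = re.compile(r'"([^"]*)"|“([^”]*)”')


def count_words(text: str) -> int:
    """Count whitespace-separated tokens."""
    return len(text.split())


def dialogue_ratio(text: str) -> float:
    """Return the share of words that fall inside quotation marks."""
    total = count_words(text)
    if total == 0:
        return 0.0
    # findall yields (straight, curly) pairs; exactly one side is non-empty.
    spoken = sum(count_words(a or b) for a, b in QUOTED.findall(text))
    return spoken / total


def deviation(text: str, target: float = 0.5) -> float:
    """Absolute distance between the measured ratio and the target ratio."""
    return abs(dialogue_ratio(text) - target)
```

Under these assumptions, a scene with unquoted dialogue measures a ratio of 0 and a deviation of 0.5 from the target, which matches the figures in the details row for this run.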