NC Bench
Run: 2026-02-25-01KJ95HA9XM9B7VTAVFVP6CWXH
Run Details
Model: Gemini 2.5 Flash Lite (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 8.4s
Cost: $0.000877
Provider: Google AI Studio
Evaluation summary
Score     Evaluator
2.01%     Matches word count
0.00%     Dialogue to Total Word Ratio
1.0058%   Overall
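The Overall figure in this summary looks like the arithmetic mean of the two evaluator scores. The page does not state the aggregation rule, so the sketch below treats it as an assumption and only checks the numbers. The function name `overall_score` is hypothetical.

```python
from statistics import mean

# Evaluator scores as displayed in the summary (percent, rounded for display).
scores = {
    "Matches word count": 2.01,
    "Dialogue to Total Word Ratio": 0.00,
}


def overall_score(evaluator_scores: dict[str, float]) -> float:
    """Assumed aggregation: unweighted mean of the evaluator scores."""
    return mean(evaluator_scores.values())


overall = overall_score(scores)
# Difference from the reported Overall of 1.0058%.
gap = abs(overall - 1.0058)
print(f"computed overall: {overall:.4f}%, gap to reported value: {gap:.4f}")
```

The mean of the displayed values is 1.005%, slightly below the reported 1.0058%. That gap would be consistent with the word-count score being rounded to two decimals for display, though the page does not confirm this.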
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
"Alright, Bill. This is it. Operation 'Sugar Rush'." "Sugar Rush? Molly, it's a birthday party. For, like, seven-year-olds." "Precisely. Maximum chaos, minimum security. They'll be too busy shoving cake in their faces to notice a couple of seasoned professionals." "Professionals? We once tried to steal a garden gnome and got apprehended by a poodle. A poodle, Molly!" "That poodle was surprisingly aggressive. And it had a tiny bell on its collar, threw off our audio surveillance." "Audio surveillance? It barked at us for thirty seconds!" "Details. Focus. The target is the present pile. Post-cake, pre-scream-fest. We need speed." "Speed? What if little Timmy has some really sturdy present wrapping? Like, industrial-grade reinforced cardboard? I don't think my lockpicks can handle that." "We're talking about a bouncy castle and a magician, Bill, not Fort Knox. We create a diversion." "A diversion? What kind of diversion? I'm not dressing as a clown again after the 'Sad Clown Incident' of '19." "No, no. Something subtle. Something… *delicious*. We'll 'accidentally' spill a vat of artisanal fudge sauce near the buffet table. The ensuing sugar-fueled frenzy will cover our tracks. Think sticky. Think chaos." "Fudge sauce? Where are we getting a vat of fudge sauce? Did you steal it from a confectionery convention?" "I have my sources. And my sources happen to have a *very* large supply of ethically sourced, gluten-free, artisanal fudge sauce. You'll be on gift extraction. Think fast, light hands. No sentimental value, just sheer volume." "What if one of the gifts is for the birthday boy? What if he *really* wants that remote-control car? He'll cry. His little face will crumple." "Bill, we're not monsters. We'll leave a single, slightly damaged, off-brand bouncy ball. A memento. Plus, we're targeting *all* the presents. Including the ones the parents bought for themselves and are pretending are for the kid." "That's… actually a good point. I did see Mrs. 
Higgins eyeing that wine cooler subscription box. Okay, okay. So, fudge sauce, gift grab. What's the exit strategy?" "Through the back garden. We'll use the trampoline as a launchpad over the fence. It's clean, efficient, and frankly, hilarious." "A trampoline launchpad? Are you sure that’s secure? What if the springs are too old? We could end up in Mrs. Henderson's prize-winning petunias. She’s got a reputation for getting litigious about her petunias." "Bill, trust me. This is foolproof. We'll be in and out before the first round of musical chairs even begins. Now, help me find my miniature grappling hook. It's perfect for scaling the bouncy castle." "You have a miniature grappling hook? For bouncy castles?" "You never know when you'll need to make a swift, vertical ascent to claim your prize. Now, stop asking questions and start packing the getaway sacks. And for heaven's sake, try not to smudge the fingerprints on the spork I’m using as a crowbar."
created_at: "2026-02-25T01:10:09.988Z" (2/25/2026, 1:10:09 AM)
model: "google/gemini-2.5-flash-lite"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1242
moderation_latency: (null)
generation_time: 8417
tokens_prompt: 77
tokens_completion: 1456
native_tokens_prompt: 67
native_tokens_completion: 2175
native_tokens_completion_images: 0
native_tokens_reasoning: 1453
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "STOP"
usage: 0.0008767
router: (null)
provider_responses
0
endpoint_id: "ce839073-aa24-4f29-8358-15b319bd05ec"
is_byok: false
latency: 1242
model_permaslug: "google/gemini-2.5-flash-lite"
provider_name: "Google AI Studio"
status: 200
api_type: "completions"
id: "gen-1771981809-eQCdpSMYAUZPSU95YEDz"
upstream_id: (null)
total_cost: 0.0008767
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Google AI Studio"
Evaluation details
Result    Evaluator                       Details                              Meta Data
2.01%     Matches word count              n/a                                  neededClean: false; words: 475
0.00%     Dialogue to Total Word Ratio    Ratio: 100.00%, Deviation: 30.00%    neededClean: false; wordsTotal: 484; wordsDialogue: 484
1.0058%   Overall
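The dialogue-ratio evaluator reports 484 dialogue words out of 484 total, a ratio of 100.00% and a deviation of 30.00% from the 70% target. A minimal sketch of how such figures could be computed follows. It is not the benchmark's actual evaluator: the function names, the whitespace-based word splitting, and the rule that dialogue is any text between straight double quotes are all assumptions made for illustration.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for a scene.

    Assumption: dialogue is whatever sits between straight double quotes,
    and a word is any whitespace-separated token.
    """
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    # Drop the quote characters so they never count as words on their own.
    total_words = len(text.replace('"', " ").split())
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


def deviation(ratio: float, target: float = 0.70) -> float:
    """Absolute distance between the measured ratio and the target ratio."""
    return abs(ratio - target)


# Usage: a scene made only of quoted speech, like the response in this run.
scene = '"Go now." "Fine by me."'
d_words, t_words, ratio = dialogue_ratio(scene)
print(d_words, t_words, f"{ratio:.2%}", f"{deviation(ratio):.2%}")
```

Under these assumptions, a response with no narration outside the quotes scores a ratio of 100%, which is 30 percentage points from the 70% target and matches the deviation reported for this run.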