NC Bench
Run: 2026-02-25-01KJ94Y23QCP89G2P55YHQRWAX
Run Details
Model: Gemini 2.5 Flash (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 17.7s
Cost: $0.006923
Provider: Google AI Studio
Evaluation summary
Matches word count: 0.00%
Dialogue to Total Word Ratio: 0.00%
Overall: 0.0000%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Bill gestured grandly with a half-eaten Danish, crumbs raining onto a faded map of a suburban cul-de-sac. Molly, slouched opposite him in their dingy, suspiciously parked van, just exhaled slowly. "Right. The target. Codename: 'Sugar Rush.'" "Are we really doing this, Bill? A birthday party? Aren't we a little… old for raiding pinatas?" "Pinatas? Molly, think bigger. Think… acquisition of high-value, gift-wrapped assets. These aren’t just cheap plastic toys, my dear, these are potential trade goods, the currency of childhood!" "They’re action figures that snap if you look at them funny and glitter-infused slime. Nobody’s giving us a cut of anything for that." "Ah, but the sentiment! The sheer *volume* of sentiment! And don’t forget the gift cards. The beautiful, plastic rectangles of pure, unadulterated spending power." He tapped the map. "Okay, phase one: Infiltration. We’ll need a disguise. Something… inconspicuous." "Like two adults in a room full of eight-year-olds? We stand out like a sore thumb covered in more glitter." "Precisely! Camouflage through awkwardness. I’m thinking… 'party entertainers.' Balloon artist, perhaps?" "You can’t even tie your shoelaces without tangling them into a Gordian knot, Bill. How are you going to shape a poodle out of a latex tube?" "Details, details. That’s why we have a 'division of labor.' You, my dear, will be 'Clancy the Clown's Quiet Assistant.'" "I am not wearing face paint. Or oversized shoes. And if I have to listen to 'Baby Shark' one more time, I’m cutting loose ends, and I mean *literally*." "Fine! You’ll be the 'party photographer.' Get in, document the 'joyful exchange,' then… phase two: extraction." "Extraction of what, exactly? Stuffed animals made by underpaid factory workers?" "And how do we get past the bouncy castle? Those things are deathtraps. One wrong jump and you’re in a critical care unit with a twisted ankle and a concussion from a rogue six-year-old." "The bouncy castle is our *diversion*, Molly. 
While the brats are bouncing themselves into a sugary oblivion, you, as the 'photographer' and I, as the whatever-I-end-up-being, will 'secure the perimeter' around the present table." "Secure the perimeter around the gift-wrapped plastic rubbish, you mean." "We gather them. Discrete laundry bags, perhaps. Then, the 'emergency' exit. 'Accidental' fire alarm pull. 'Sudden' case of the allergic reaction to gluten-free cupcakes. Whatever it takes." "And the parents? You think they’re just going to let two strangers waltz out with a sack full of presents because you pulled the fire alarm?" "That’s why we need a fast exit strategy. The getaway car parked inconveniently close to the back gate. A quick dash. Minimal exposure. Surgical precision." "Surgical precision for stealing a Peppa Pig playset. You’ve really lost the plot, haven’t you?" "It’s about the principle, Molly! We’re reclaiming the joy of giving… to ourselves." "You’re going to get us caught by a PTA mom with a severe allergy to fun. And an even more severe allergy to strangers touching their kids’ presents." "Optimism, Molly! This is a low-risk, high-reward endeavor!" "The reward being a collection of poorly crafted glitter art and a severe sugar crash from leftover cake?" "And the story, Molly! Oh, the story! Imagine telling this one to our grandkids!" "I don't *have* grandkids, Bill. I have a long-standing warrant in three states." "Exactly! Adds to the mystique! Are you in, or are you out, partner?" Molly sighed, scrubbing a crumb from the van’s dash. "Fine. But if I end up in a clown suit, your life insurance policy is getting a very thorough review."
created_at: "2026-02-25T00:59:39.006Z" (2/25/2026, 12:59:39 AM)
model: "google/gemini-2.5-flash"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1425
moderation_latency: (null)
generation_time: 17679
tokens_prompt: 77
tokens_completion: 3735
native_tokens_prompt: 67
native_tokens_completion: 2761
native_tokens_completion_images: 0
native_tokens_reasoning: 1849
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "STOP"
usage: 0.0069226
router: (null)
provider_responses
0
endpoint_id: "16c01e26-9b91-4d22-b487-ed35634b017d"
is_byok: false
latency: 1425
model_permaslug: "google/gemini-2.5-flash"
provider_name: "Google AI Studio"
status: 200
api_type: "completions"
id: "gen-1771981179-Z7ZVf47tMJjbzY38WzZz"
upstream_id: (null)
total_cost: 0.0069226
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Google AI Studio"
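The reported total_cost can be sanity-checked against the native token counts above. The sketch below is only an illustration: the per-token rates ($0.30 per million prompt tokens, $2.50 per million completion tokens) are assumptions that do not appear on this page, and the function name is hypothetical. Under those assumed rates, the arithmetic lands on the same figure as total_cost.

```python
# Assumed rates in USD per one million tokens. These are NOT stated on this
# page; they are an assumption used only to illustrate the arithmetic.
ASSUMED_PROMPT_RATE_PER_M = 0.30
ASSUMED_COMPLETION_RATE_PER_M = 2.50


def estimate_cost(native_prompt_tokens: int, native_completion_tokens: int) -> float:
    """Estimate the request cost in USD from native token counts."""
    prompt_cost = native_prompt_tokens * ASSUMED_PROMPT_RATE_PER_M / 1_000_000
    completion_cost = native_completion_tokens * ASSUMED_COMPLETION_RATE_PER_M / 1_000_000
    return prompt_cost + completion_cost


# Values taken from native_tokens_prompt and native_tokens_completion above.
estimated = estimate_cost(67, 2761)
print(f"estimated cost: ${estimated:.7f}")
```

Rounded to six decimal places, the estimate agrees with the Cost shown in Run Details.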
Evaluation details
Matches word count
Result: 0.00%
Details: n/a
Meta data: neededClean = false, words = 582

Dialogue to Total Word Ratio
Result: 0.00%
Details: Ratio: 92.35%, Deviation: 22.35%
Meta data: neededClean = false, wordsTotal = 601, wordsDialogue = 555

Overall: 0.0000%
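The ratio and deviation figures above follow directly from the reported word counts. A minimal sketch of that arithmetic, assuming the deviation is the absolute difference between the measured ratio and the 70% target from the scenario (the page does not state the formula, nor how a deviation is mapped to a score, so no scoring is attempted here; the function names are hypothetical):

```python
def dialogue_ratio_pct(words_dialogue: int, words_total: int) -> float:
    """Return dialogue words as a percentage of all words."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return 100.0 * words_dialogue / words_total


def deviation_pct(measured_pct: float, target_pct: float) -> float:
    """Absolute distance from the target, in percentage points (assumed definition)."""
    return abs(measured_pct - target_pct)


# wordsDialogue and wordsTotal as reported in the evaluation details.
ratio = dialogue_ratio_pct(555, 601)
deviation = deviation_pct(ratio, 70.0)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")

# The word-count evaluator reported 582 words against a 500-word target.
overshoot = 100.0 * (582 - 500) / 500
print(f"Word count overshoot: {overshoot:.1f}%")
```

Note that the two evaluators report different totals (582 and 601 words), so they presumably tokenize the response differently; the sketch uses each evaluator's own count as given.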