NC Bench
Run: 2026-02-25-01KJ957VHA6S6GNHB9KZNF4CYX
Run Details
Model:
Gemini 2.5 Flash (Reasoning)
Test:
Dialogue tags
Scenario:
Write 500 words with 30% dialogue
Duration:
25.6s
Cost:
$0.010925
Provider:
Google AI Studio
Evaluation summary
Score     Evaluator
43.38%    Matches word count
0.00%     Dialogue to Total Word Ratio
21.6892%  Overall
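The Overall figure is consistent with an unweighted mean of the two evaluator scores. That is an inference from the numbers shown here, not a documented formula, and the function name `overall_score` is hypothetical. A minimal sketch of the check:

```python
from statistics import mean

# Scores as displayed in the summary, in percent. The word-count score is
# shown rounded to two decimals, so the recomputed mean can differ from the
# reported Overall value in the later decimal places.
evaluator_scores = {
    "Matches word count": 43.38,
    "Dialogue to Total Word Ratio": 0.00,
}
reported_overall = 21.6892


def overall_score(scores):
    """Unweighted mean of evaluator scores (ASSUMED aggregation rule)."""
    return mean(list(scores.values()))


computed_overall = overall_score(evaluator_scores)
print(f"computed {computed_overall:.4f}% vs reported {reported_overall}%")
```

The small gap between the two values would be explained by the word-count score being rounded for display.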
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly unfurled the blueprint with a dramatic flourish, nearly knocking over Bill’s lukewarm instant coffee. Hand-drawn in crayon, it depicted a vibrant suburban backyard, complete with a terrifyingly cheerful sun and stick-figure children. Bill grunted, adjusting his perpetually greasy spectacles. A yellow rubber duck sat sentinel on the table, witnessing their nefarious ambitions. The cramped apartment, smelling of stale pizza, thickened with Molly’s excitement. “Alright, Bill. Phase one: Infiltration. The ‘Princess Sparkle’ bouncy castle. Our primary breach point. Think of it as a fortified perimeter. Its very existence mocks us, but also offers opportunity.” “A bouncy castle? You’re serious? We’re breaking into a kid’s party, Molly. Not Fort Knox. And the name alone gives me hives.” “Precisely! The sheer absurdity is our camouflage. Who would suspect two master criminals meticulously planning the theft of… a LEGO Millennium Falcon? And possibly a talking puppy doll?” She tapped a colourful blob on the crayon map. This was the bouncy castle, rendered with disturbing optimism. The map highlighted a pink table laden with imaginary cakes, labeled “The Jewel Repository.” A red ‘X’ marked a suspiciously cheerful clown. “The clown. What’s his deal? Asset or obstacle? Last time, the mime nearly took my eye out with a non-existent banana. I’m still traumatized.” “Mr. Sprinkles. Known for mediocre balloon animals and unverified access to the main power grid for the outdoor disco ball. We neutralise him. Or, perhaps, leverage his unique skillset. Distraction via juggling chainsaws – a classic.” Bill shuddered at the mime’s memory, a dark shadow in Molly’s brightly coloured plans. He picked up a miniature plastic soldier, a tiny green grenadier, twirling it between his fingers. The silence stretched, punctuated by Molly’s incessant tapping and the distant wail of a police siren – probably not for them, yet. 
Molly leaned back, eyes gleaming with the manic intensity of a general reviewing battle plans. “Our objective,” she declared, pointing at a small pile of crudely drawn boxes, “is the Present Pile. Estimated value: invaluable childhood joy, potential collector’s items, and one robustly wrapped toy poodle, if my intelligence is correct. We approach under cover of ‘Pin the Tail on the Donkey,’ using its chaotic energy to mask our movements.” Bill slowly lowered the plastic soldier. “So, to clarify. We’re risking jail time, allergic reactions to cheap cake, and the fury of sugar-crazed six-year-olds for… a robustly wrapped toy poodle. And you called this ‘Operation Birthday Bling’?” He sighed, running a hand over his thinning hair. The rubber duck stared, silently judging his choices. This truly felt like their lowest point. Molly snatched the soldier, placing it strategically near the bouncy castle. “It *is* Operation Birthday Bling, Bill. And it shall be glorious. The target is soft, the rewards psychological, the glory… immeasurable. Besides,” she added with a mischievous glint, “I hear they’re serving organic carrot sticks. We can grab some on the way out. For health reasons, obviously.”
created_at
"2026-02-25T01:04:59.951Z"
model
"google/gemini-2.5-flash"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1323
moderation_latency
(null)
generation_time
25606
tokens_prompt
77
tokens_completion
5376
native_tokens_prompt
67
native_tokens_completion
4362
native_tokens_completion_images
0
native_tokens_reasoning
3687
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"STOP"
usage
0.0109251
router
(null)
provider_responses
0
  endpoint_id
  "16c01e26-9b91-4d22-b487-ed35634b017d"
  is_byok
  false
  latency
  1323
  model_permaslug
  "google/gemini-2.5-flash"
  provider_name
  "Google AI Studio"
  status
  200
api_type
"completions"
id
"gen-1771981499-d3UgDfPvGVcutU6fiOEq"
upstream_id
(null)
total_cost
0.0109251
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Google AI Studio"
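The token and cost fields above can be cross-checked with a little arithmetic. The sketch below assumes per-token list prices that are not stated anywhere on this page (the two `ASSUMED_*` constants), and it assumes that `native_tokens_completion` already includes the reasoning tokens. Treat it as a plausibility check, not as the provider's billing logic.

```python
# Values copied from the run metadata.
native_tokens_prompt = 67
native_tokens_completion = 4362
native_tokens_reasoning = 3687
generation_time_ms = 25606
reported_cost_usd = 0.0109251

# ASSUMPTION: list prices in USD per million tokens. Not given on this page.
ASSUMED_INPUT_USD_PER_MTOK = 0.30
ASSUMED_OUTPUT_USD_PER_MTOK = 2.50


def estimate_cost_usd(prompt_tokens, completion_tokens):
    """Estimate cost from native token counts under the assumed prices."""
    return (
        prompt_tokens * ASSUMED_INPUT_USD_PER_MTOK
        + completion_tokens * ASSUMED_OUTPUT_USD_PER_MTOK
    ) / 1_000_000


estimated_cost_usd = estimate_cost_usd(native_tokens_prompt, native_tokens_completion)

# ASSUMPTION: the completion count includes reasoning tokens, so the
# remainder is the visible response.
visible_completion_tokens = native_tokens_completion - native_tokens_reasoning

# Throughput over the whole generation, reasoning included.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(f"estimated cost: ${estimated_cost_usd:.7f} (reported ${reported_cost_usd})")
print(f"visible completion tokens: {visible_completion_tokens}")
print(f"throughput: {tokens_per_second:.1f} tokens/s")
```

Under these assumed rates the estimate agrees with the reported `usage` value, and most of the cost comes from reasoning tokens rather than the visible 500-word scene. The agreement supports the assumptions but does not confirm them.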
Evaluation details
Evaluator: Matches word count
Result: 43.38%
Details: n/a
Meta Data: neededClean false, words 483

Evaluator: Dialogue to Total Word Ratio
Result: 0.00%
Details: Ratio: 54.81%, Deviation: 24.81%
Meta Data: neededClean false, wordsTotal 489, wordsDialogue 268

Overall: 21.6892%
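The ratio and deviation reported by the dialogue evaluator follow directly from its two word counts and the 30% target in the scenario. The sketch below reproduces that arithmetic and adds a simple way to obtain such counts from a text. The counting rule (whitespace-separated words, dialogue taken as anything inside curly double quotes) and the function names are assumptions for illustration. The evaluator's actual tokenisation is not documented here, nor is the mapping from deviation to the 0.00% score, so neither is reproduced.

```python
import re

TARGET_RATIO = 0.30  # from the scenario: 30% dialogue


def count_words(text):
    """Count whitespace-separated tokens (ASSUMED word definition)."""
    return len(text.split())


def dialogue_stats(text):
    """Return (dialogue_words, total_words) for a passage.

    ASSUMPTION: dialogue is any span enclosed in curly double quotes.
    """
    spoken_spans = re.findall(r"“([^”]*)”", text)
    words_dialogue = sum(count_words(span) for span in spoken_spans)
    return words_dialogue, count_words(text)


def ratio_and_deviation(words_dialogue, words_total, target=TARGET_RATIO):
    """Dialogue share of all words and its absolute distance from the target."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


# Counts reported in the evaluator's Meta Data.
ratio, deviation = ratio_and_deviation(268, 489)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 54.81%, Deviation: 24.81%"
```

Applying `dialogue_stats` to the model response may not return exactly 268 and 489, since the two evaluators on this page already disagree on the total (483 versus 489), which suggests the result is sensitive to how words are counted.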