NC Bench
Run: 2026-02-25-01KJ95M840XBT645ZMZ5180429
Run Details
Model: Gemini 2.5 Flash (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 1m 41s
Cost: $0.048895
Provider: Google AI Studio
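The Duration and Cost values above can be cross-checked against the token counts listed in this run's Meta fields. The following is a minimal sketch, not NC Bench's or the provider's billing code. The per-million-token prices are assumptions (they are not stated on this page), as is reading generation_time as milliseconds; the function names are hypothetical.

```python
# Values copied from this run's Meta fields.
NATIVE_TOKENS_PROMPT = 67
NATIVE_TOKENS_COMPLETION = 19550
NATIVE_TOKENS_REASONING = 18622
GENERATION_TIME_MS = 100778  # ASSUMPTION: unit is milliseconds
REPORTED_TOTAL_COST = 0.0488951

# ASSUMPTION: USD prices per million tokens; not stated on this page.
ASSUMED_PRICE_PER_M_INPUT = 0.30
ASSUMED_PRICE_PER_M_OUTPUT = 2.50


def visible_completion_tokens(completion: int, reasoning: int) -> int:
    """Completion tokens left over once reasoning tokens are subtracted."""
    return completion - reasoning


def reasoning_share(completion: int, reasoning: int) -> float:
    """Fraction of completion tokens spent on reasoning."""
    return reasoning / completion


def estimated_cost(prompt: int, completion: int,
                   price_in: float, price_out: float) -> float:
    """Cost in USD if prompt and completion tokens are billed per million."""
    return (prompt * price_in + completion * price_out) / 1_000_000


def tokens_per_second(completion: int, generation_ms: int) -> float:
    """Average completion throughput over the generation time."""
    return completion / (generation_ms / 1000)


visible = visible_completion_tokens(NATIVE_TOKENS_COMPLETION,
                                    NATIVE_TOKENS_REASONING)
share = reasoning_share(NATIVE_TOKENS_COMPLETION, NATIVE_TOKENS_REASONING)
cost = estimated_cost(NATIVE_TOKENS_PROMPT, NATIVE_TOKENS_COMPLETION,
                      ASSUMED_PRICE_PER_M_INPUT, ASSUMED_PRICE_PER_M_OUTPUT)
throughput = tokens_per_second(NATIVE_TOKENS_COMPLETION, GENERATION_TIME_MS)

print(f"visible tokens: {visible}")
print(f"reasoning share: {share:.2%}")
print(f"estimated cost: ${cost:.7f} (reported ${REPORTED_TOTAL_COST})")
print(f"throughput: {throughput:.1f} tokens/s")
```

Under these assumed prices the estimate agrees with the reported total_cost, and roughly 95% of the completion tokens are reasoning tokens rather than visible prose.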
Evaluation summary
Score | Evaluator
0.00% | Matches word count
33.63% | Dialogue to Total Word Ratio
16.8137% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
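One way to check the "50% words of dialogue" requirement in this prompt is to count the words that fall inside double quotation marks and divide by the total word count. The sketch below is an illustration only: it is not NC Bench's evaluator, the function names are hypothetical, and it assumes that dialogue is always delimited by double quotes and that words are separated by whitespace.

```python
import re


def count_words(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())


def dialogue_word_ratio(text: str) -> float:
    """Share of words that sit inside double-quoted spans.

    ASSUMPTION: dialogue is whatever appears between a pair of double
    quotes; curly double quotes are normalised to straight ones first.
    """
    normalised = text.replace("\u201c", '"').replace("\u201d", '"')
    total = count_words(normalised)
    if total == 0:
        return 0.0
    spoken = re.findall(r'"([^"]*)"', normalised)
    dialogue = sum(count_words(span) for span in spoken)
    return dialogue / total


sample = 'Bill sighed. "Six-year-olds? Really, Bill?" Molly said. "Fine."'
print(f"{dialogue_word_ratio(sample):.2%}")
```

A different tokenisation (for example, splitting hyphenated words or stripping punctuation first) would give slightly different totals, so counts from this sketch should not be expected to match an evaluator's exactly.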
Model response
Messages
Meta
AI
Bill slammed a badly laminated photograph onto the dented metal table, the image rattling their rusty coffee mugs. It depicted a pristine suburban lawn, a sagging net for volleyball, and a suspiciously gleeful clown. The picture’s forced cheerfulness contrasted sharply with the stale air and peeling paint of their cramped hideout. Molly, meanwhile, didn’t even look up, meticulously polishing a slim lock pick. The little tool glinted under a single bare bulb hanging precariously from the ceiling, reflecting a predatory light in the gloom. "Abernathy's 'Super Cool' 6th Birthday Bash," Bill announced, his voice a low, gravelly growl that felt utterly wasted on such an innocuous target. "Primary target: the present-laden buffet." Molly hummed, finally snapping the pick back into its velvet-lined case with a decisive click. "Six-year-olds? Really, Bill? Have we truly sunk to raiding lemonade stands now?" Bill leaned over a crayon-drawn map, pointing with unwarranted dramatic flair at a smudged circle labeled "PREZZIES." He thumped it for emphasis, a finger hovering next to a crude sketch of a small, fluffy dog adorned with an embarrassing bow. "Don't mock it. Mrs. Abernathy throws lavish events. Intel suggests limited-edition action figures, possibly a remote-control drone. Minimal resistance from Fido. A Chihuahua named ‘Princess Sparkles.’ Apparently, absolutely terrified of squirrels." Molly finally glanced at him, one eyebrow arched precariously high towards her hairline, a silent question in her eyes. "Princess Sparkles. Right. And the primary security force? The aforementioned super cool six-year-old and his sticky-fingered cohorts?" "Excellent question," Bill stated, adjusting the lapel of his imaginary suit with a flourish, his theatricality boundless despite the undeniable absurdity of their mission. "My plan: a two-pronged distraction. You, as 'Professor Bubbles,' the world’s most inept bubbleologist. Create delightful pandemonium." "Delightful pandemonium? 
I'm a diversion, Bill, not a psychotherapist for hyperactive toddlers," Molly retorted, her gaze sweeping over the map's questionable accuracy with a sardonic glint. "And you? While I dodge airborne juice boxes?" "I, Molly, will be 'Clancy the Caterpillar,' retrieving the goods." He puffed out his chest, resembling a proud, slightly deflated hot air balloon of pure self-importance. "I’ve sourced a surprisingly high-quality caterpillar costume. Excellent maneuverability, built-in insulation." Molly stared, a long, incredulous silence hanging between their professional thief personas, punctuated only by Bill's smug breathing. "You plan to steal presents from children while dressed as a caterpillar?" "It’s genius! They’ll be too entertained to notice the prize pile depleting. Who ever suspects a caterpillar?" Bill’s grin widened, a mix of pure triumph and childish glee. "The bounce house provides excellent distraction, too. High-octane fun means low security awareness." "Or they'll mistake you for a piñata and bludgeon you with a stick," Molly mused, tapping a painted nail on the table. "Couldn't we just pick a high-end department store lock? Less glitter, fewer judgmental toddlers." "But where's the sport? The *challenge*?" Bill leaned closer, eyes gleaming with a manic energy usually reserved for securing legitimate fortunes. "Bragging rights! 'We once hit a birthday party using nothing but bubbles and a caterpillar suit!'" He lowered his voice conspiratorially. "Plus, gourmet cupcakes." That finally caught Molly’s full attention, her practical composure flickering for the first time. "Gourmet cupcakes?" "Chocolate lava, triple fudge, sprinkles. All homemade. Mrs. Abernathy doesn't skimp, apparently." Molly considered this for a moment, a faint, almost imperceptible smile playing on her lips. "Is there a two-cupcake minimum for operatives?" "Of course," Bill affirmed, a triumphant glee returning to his face. 
"And probably some birthday cake, if we're feeling particularly audacious." "Fine," Molly conceded, the word edged with a reluctant acceptance. "But one speck of glitter on me, and you're on clean-up duty for a month. With a toothbrush." Bill clapped his hands together, his eyes sparkling with almost childlike joy. "Deal! Operation Birthday Blast-Off is a go!" The cramped room seemed to hum with his absurd, infectious enthusiasm, ready for a very silly heist.
created_at: "2026-02-25T01:11:46.056Z" (2/25/2026, 1:11:46 AM)
model: "google/gemini-2.5-flash"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1208
moderation_latency: (null)
generation_time: 100778
tokens_prompt: 77
tokens_completion: 20925
native_tokens_prompt: 67
native_tokens_completion: 19550
native_tokens_completion_images: 0
native_tokens_reasoning: 18622
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "STOP"
usage: 0.0488951
router: (null)
provider_responses:
  0:
    endpoint_id: "16c01e26-9b91-4d22-b487-ed35634b017d"
    is_byok: false
    latency: 1208
    model_permaslug: "google/gemini-2.5-flash"
    provider_name: "Google AI Studio"
    status: 200
api_type: "completions"
id: "gen-1771981906-aEcxLQxM55iedlJRIbW1"
upstream_id: (null)
total_cost: 0.0488951
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Google AI Studio"
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 645
33.63% | Dialogue to Total Word Ratio | Ratio: 45.33%, Deviation: 4.67% | neededClean: false, wordsTotal: 664, wordsDialogue: 301
16.8137% | Overall
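The ratio and deviation figures above follow directly from the reported word counts. The sketch below reproduces that arithmetic. Treating the overall score as an unweighted mean of the two evaluator scores is an assumption that happens to fit the numbers; the page does not document the weighting, and the function names are hypothetical.

```python
# Values copied from the evaluation details above.
WORDS_TOTAL = 664       # wordsTotal (dialogue-ratio evaluator)
WORDS_DIALOGUE = 301    # wordsDialogue
WORDS_COUNTED = 645     # words (word-count evaluator)
TARGET_WORDS = 500      # from the scenario
TARGET_RATIO = 0.50     # from the scenario


def dialogue_ratio(dialogue: int, total: int) -> float:
    """Fraction of all words that are dialogue."""
    return dialogue / total


def deviation(ratio: float, target: float) -> float:
    """Absolute distance between the measured and target ratio."""
    return abs(target - ratio)


def mean_score(scores: list[float]) -> float:
    """ASSUMPTION: overall score is the unweighted mean of the evaluators."""
    return sum(scores) / len(scores)


ratio = dialogue_ratio(WORDS_DIALOGUE, WORDS_TOTAL)
dev = deviation(ratio, TARGET_RATIO)
word_overshoot = (WORDS_COUNTED - TARGET_WORDS) / TARGET_WORDS
overall = mean_score([0.00, 33.63])  # displayed (rounded) scores

print(f"ratio: {ratio:.2%}, deviation: {dev:.2%}")
print(f"word count over target by {word_overshoot:.0%}")
print(f"mean of displayed scores: {overall:.4f}%")
```

The mean of the displayed scores is 16.815%, close to the reported 16.8137%, which is consistent with averaging an unrounded ratio score. How the 4.67% deviation maps to the 33.63% score is not shown on this page, so it is not reproduced here. The two evaluators also report different totals (645 and 664 words), which suggests they tokenise the text differently.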