NC Bench
Model: Gemini 2.5 Pro
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Run ID: 2025-11-18-01KACP6XA87MYW2852Y63SJ0XC
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The van smelled of stale crisps and failed ambition. Molly stared through a pair of high-powered binoculars, her brow furrowed in concentration. “Okay, they’re singing ‘Happy Birthday.’ We have a three-minute window before the sugar-high kicks in and they become unpredictable.”
Bill, meanwhile, was meticulously cleaning his fingernails with a spork. “Is it caterpillar cake? I bet it’s caterpillar cake. Little Timmy seems like a caterpillar cake kind of kid.”
“Focus, Bill. Phase one is bypassing the primary obstacle.”
“You mean Kevin’s dad, Dave? The one who’s manning the barbecue?”
“Exactly. He’s territorial. Your job is to engage him. Ask about tongs. Men like Dave love talking about their tongs.”
“My tongs are better. Wider grip.”
Molly lowered the binoculars and fixed him with a withering glare. “This is not a contest. This is a precision operation. While you’re on ‘tong-talk,’ I’ll slip past the patio door and secure the payload.”
“The payload being a pile of poorly wrapped presents?”
“The payload being our rent for the month, hopefully in the form of a large LEGO set. The resale value is surprisingly high.”
Bill sighed, putting the spork down. “What’s the exit strategy?”
“The bouncy castle. It provides excellent cover. I’ll create a diversion by ‘accidentally’ unplugging it.”
“You want to cause a slow, whimpering deflation in a castle full of six-year-olds? That’s cold, Molly. Even for you.”
“It’s effective. They’ll be too busy being moderately inconvenienced to notice me hoisting a bag of loot over the fence.”
“And what if Patches sees you?”
“Who’s Patches?”
Bill pointed a thumb toward the garden, where a sad-looking clown was failing to make a balloon animal that resembled anything other than a pathetic pink worm. “Patches the Clown. He’s got eyes everywhere. Soulless, painted-on eyes.”
“The clown is a non-factor. He’s a professional. He won’t care.”
“He might. He looks like he’s one bad party away from a complete psychological break. I don’t want to be the cause of that.”
“You won’t be. The slow deflation will be. Now, are you ready?”
“Almost.”
“What is it now?”
Bill picked up a small paper party bag from the dashboard. “Just wondering if there are any of those little candy bracelets in the party bags. I love those.”
“Bill, we are stealing the party bags. You can check for bracelets later.”
“Right. Of course. For a moment there I forgot. Tongs, bouncy castle, soulless clown. Got it.”
Molly sighed, raising the binoculars again. “Just try not to get distracted by the sausage rolls this time.”
“No promises.”
created_at: "2025-11-18T23:54:05.772763+00:00" (Tuesday, November 18th, 2025 at 11:54:05 PM GMT+00:00)
model: "google/gemini-2.5-pro"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 2201
moderation_latency: (null)
generation_time: 21683
tokens_prompt: 68
tokens_completion: 1171
native_tokens_prompt: 67
native_tokens_completion: 2275
native_tokens_completion_images: 0
native_tokens_reasoning: 1637
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "STOP"
usage: 0.02283375
api_type: "completions"
id: "gen-1763510023-jApnOnygPaCCY9b7C7fi"
upstream_id: (null)
total_cost: 0.02283375
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Google AI Studio"
Evaluation details
Result: 0.0000%
Evaluator: Matches word count
Details: n/a
Meta Data: neededClean: false; words: 431

Result: 0.0000%
Evaluator: Dialogue to Total Word Ratio
Details: Ratio: 77.96%, Deviation: 27.96%
Meta Data: neededClean: false; wordsTotal: 431; wordsDialogue: 336
0.0000%
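The ratio and deviation reported by the "Dialogue to Total Word Ratio" evaluator can be reproduced from the two word counts in its metadata. The sketch below is illustrative only: the page does not show how NC Bench tokenizes words or detects dialogue, so `dialogue_stats` (a hypothetical helper) assumes whitespace-separated words and dialogue delimited by curly double quotes, and `ratio_from_counts` (also hypothetical) assumes the deviation is the absolute difference from the 50% target in the scenario.

```python
import re


def dialogue_stats(text: str, target_ratio: float = 0.5) -> dict:
    """Count words inside curly double quotes versus all words.

    Assumption: words are whitespace-separated and dialogue is whatever
    sits between an opening “ and a closing ”. The real evaluator may differ.
    """
    total_words = len(text.split())
    # Non-greedy match so each quoted span is captured separately.
    quoted_spans = re.findall(r"“(.*?)”", text, flags=re.DOTALL)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue_words / total_words if total_words else 0.0
    return {
        "wordsTotal": total_words,
        "wordsDialogue": dialogue_words,
        "ratio": ratio,
        "deviation": abs(ratio - target_ratio),
    }


def ratio_from_counts(words_dialogue: int, words_total: int,
                      target_ratio: float = 0.5) -> tuple[float, float]:
    """Return (ratio, deviation) from precomputed word counts."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target_ratio)


# Counts taken from the evaluator metadata on this page.
ratio, deviation = ratio_from_counts(336, 431)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 77.96%, Deviation: 27.96%"
```

With these counts, 336 of 431 words are dialogue, which is about 28 percentage points above the requested 50%. The response is also 69 words short of the requested 500. The thresholds the evaluators use to turn those gaps into a 0.0000% result are not shown on the page.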