NC Bench
Model: Gemini 2.5 Pro
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Run ID: 2025-11-18-01KACPAYEQDJCHQD767GGYN1CD
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The van smelled of stale crisps and desperation. From their vantage point across the street, Molly watched the target house through a pair of high-powered binoculars. The muffled thud of a bouncy castle pump punctuated the distant, high-pitched shrieks of children. She lowered the binoculars and tapped a professionally laminated blueprint spread across her lap. It was, in fact, just a Zillow floor plan with crayon annotations.
“Okay, Bill. Target is active. Ingress point Alpha is the kitchen window, obscured by the wisteria. I've disabled the motion sensor lights.”
“They have a bouncy castle,” Bill whispered, his face pressed against the greasy passenger window. “It’s shaped like a dragon. Look, Mol, its nostrils are smoking. Do you think that’s a professional-grade smoke machine?”
Molly massaged the bridge of her nose. “Focus. The primary target is the present pile, codenamed ‘Mount Lego.’ We need a clean extraction before the cake-cutting ceremony at fifteen-hundred hours. That’s our window. The sugar coma will hit the parental units hard.” She pointed a well-manicured finger at a red circle on the plan. “Once inside, you create the diversion. I’ve packed the remote-controlled squirrel. While they’re all distracted by ‘Squeaky Nuts’ running up the curtains, I’ll secure the assets.”
Bill didn’t respond. He was now fogging up the glass with his steady, mesmerised breathing. He appeared to be trying to mentally calculate the trajectory one would need to clear the garden gnome and land in the dragon’s gaping maw.
“I have a better idea,” he said, turning from the window, his eyes alight with a terrible genius. “What if we use the bouncy castle? We get a good bounce, land on the sunroom roof, and rappel down.”
Molly’s silence was long and heavy. She remembered the pigeon-loft incident of ‘09, which had also involved Bill, rappelling, and a startling amount of birdseed.
“We are not using the bouncy castle. We are not rappelling. We are following the plan. You distract the parents with your ‘lost puppy’ routine, I grab the goods. It’s simple. It’s clean.”
“It's a shame. That dragon looks sturdy,” Bill sighed, slumping back in his seat. “Do you think they'll have party bags? I'm hoping for some of those little fizzy sweets and a yo-yo.”
“Bill, the only bag you're getting is the swag bag,” Molly said, snapping the blueprint shut. She pulled the ski mask over her face, the cheap wool already making her itch. Bill was still staring wistfully at the bouncy dragon, a small trickle of drool escaping the corner of his mouth. It was going to be a long afternoon.
created_at: "2025-11-18T23:56:17.633843+00:00" (Tuesday, November 18th, 2025 at 11:56:17 PM GMT+00:00)
model: "google/gemini-2.5-pro"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 2056
moderation_latency: (null)
generation_time: 21259
tokens_prompt: 68
tokens_completion: 1180
native_tokens_prompt: 67
native_tokens_completion: 2267
native_tokens_completion_images: 0
native_tokens_reasoning: 1668
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "STOP"
usage: 0.02275375
api_type: "completions"
id: "gen-1763510155-pLrUvutxRaNo48d8Ohvu"
upstream_id: (null)
total_cost: 0.02275375
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Google AI Studio"
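The token and timing fields above can be combined into a few derived figures. The sketch below is illustrative only and rests on two assumptions that the page does not state: that generation_time is in milliseconds, and that native_tokens_reasoning is already included in native_tokens_completion.

```python
# Values copied from the run metadata above.
generation_time_ms = 21259       # assumption: the unit is milliseconds
native_tokens_completion = 2267
native_tokens_reasoning = 1668

# Assumption: reasoning tokens are counted inside the native completion total.
visible_tokens = native_tokens_completion - native_tokens_reasoning
reasoning_share = native_tokens_reasoning / native_tokens_completion
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(f"visible completion tokens: {visible_tokens}")
print(f"reasoning share: {reasoning_share:.1%}")
print(f"throughput: {tokens_per_second:.1f} tokens/s")
```

Under those assumptions, roughly three quarters of the native completion tokens went to reasoning rather than to the visible scene.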
Evaluation details
Result | Evaluator | Details | Meta Data
0.0000% | Matches word count | n/a | neededClean: false; words: 440
0.0000% | Dialogue to Total Word Ratio | Ratio: 49.77%, Deviation: 19.77% | neededClean: false; wordsTotal: 440; wordsDialogue: 219
0.0000%
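The reported ratio and deviation follow from the two word counts: 219 dialogue words out of 440 total is 49.77%, which is 19.77 percentage points away from the 30% target. The sketch below reproduces that arithmetic and shows one possible way to count dialogue words. The function names and the quote-matching rule are hypothetical; the benchmark's actual evaluator may tokenize and detect dialogue differently.

```python
import re

# Assumption: dialogue is any text between double quotation marks
# (curly or straight). This is not necessarily the evaluator's rule.
DIALOGUE_SPAN = re.compile(r'[“"]([^“”"]*)[”"]')


def count_words(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())


def dialogue_counts(text: str) -> tuple[int, int]:
    """Return (words inside double quotes, total words)."""
    words_dialogue = sum(count_words(span) for span in DIALOGUE_SPAN.findall(text))
    return words_dialogue, count_words(text)


def ratio_and_deviation(words_dialogue: int, words_total: int,
                        target: float = 0.30) -> tuple[float, float]:
    """Return the dialogue share and its distance from the target, both in percent."""
    ratio = words_dialogue / words_total
    return ratio * 100, abs(ratio - target) * 100


# Counts taken from the evaluation meta data above.
ratio_pct, deviation_pct = ratio_and_deviation(219, 440)
print(f"Ratio: {ratio_pct:.2f}%, Deviation: {deviation_pct:.2f}%")
# prints "Ratio: 49.77%, Deviation: 19.77%"
```

Running dialogue_counts on the model response may not give exactly 219 and 440, since the result depends on how words and quoted spans are delimited.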