NC Bench
Model: Gemini 2.5 Pro
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Run ID: 2025-11-18-01KACP3RSNE1AVTQW8X9HF3V1N
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The van smelled of stale crisps and ambition. Molly flattened a hand-drawn map of a suburban backyard on the dashboard. “Okay, run it by me one more time. Synchronization is key.”
“Right. Synchronization.” Bill squinted through a pair of grubby binoculars. “Is that Jimmy, the birthday boy, crying?”
“Ignore the emotional state of the primary, Bill. Focus on the assets.”
“The presents. Got it. But he’s really wailing. Someone must have gotten him socks. A tragedy.”
“At 15:30, the cake-cutting ceremony begins. That’s our window. A ten-minute cacophony of off-key singing provides audio cover. The parental units will be distracted, filming. You create the diversion.”
“The Bouncy Castle Incident,” Bill said, his voice full of reverence.
“It’s not an ‘incident.’ It’s a calculated manoeuvre. You will introduce a minor, repairable puncture to the inflatable dragon’s left haunch.”
“Seems a bit mean. He has kind eyes.”
“Bill, there could be a limited-edition Star-Lord helmet in that pile. This is no time for sentimentality toward vinyl reptiles. While the adults are panicking over a slow leak, I secure the haul.”
“And I meet you at the extraction point?”
“Behind the azalea bush. Are we clear?”
“Crystal. Cake, dragon, presents, bush. Wait.”
“What now?”
“What kind of cake is it?”
“It doesn’t matter what kind of cake it is!”
“It absolutely matters! If it’s a caterpillar cake, everyone’s going to be fighting over the chocolate head. Prime chaos. If it’s one of those fancy fruit ones, half the kids won’t even touch it. Less distraction. It’s basic tactical gastronomy, Mol.”
Molly pinched the bridge of her nose, leaving a greasy fingerprint. “It’s a blue monster with googly eyes made of fondant.”
“Ooh, good choice. The sugar crash will be epic. We’ll have a post-heist window of at least twenty minutes of pure lethargy. I might even be able to get a party bag.”
“There will be no party bags! We stick to the plan. In, out, no frosting on your fingers, no casual conversation with the clown.”
“He looks lonely.”
“He’s paid to be lonely. Now, are you ready to execute?”
“Almost.” Bill lowered the binoculars, a new kind of dread on his face. He pointed a trembling finger toward the patio.
“What is it? Security? A dog?”
“Worse.”
Molly snatched the binoculars, her heart thumping. She scanned passed the screaming children, past the pile of brightly-wrapped boxes, and saw it. Dangling from the branch of an oak tree. A colourful, papier-mâché donkey.
“Oh no,” she whispered.
Bill nodded grimly. “You didn't factor in the piñata, did you?”
created_at: "2025-11-18T23:52:31.689834+00:00" (Tuesday, November 18th, 2025 at 11:52:31 PM GMT+00:00)
model: "google/gemini-2.5-pro"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1886
moderation_latency: (null)
generation_time: 30348
tokens_prompt: 68
tokens_completion: 1554
native_tokens_prompt: 67
native_tokens_completion: 3227
native_tokens_completion_images: 0
native_tokens_reasoning: 2576
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "STOP"
usage: 0.03235375
api_type: "completions"
id: "gen-1763509920-mfivKCDMjnpmQpamAChC"
upstream_id: (null)
total_cost: 0.03235375
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Google AI Studio"
Evaluation details
Result | Evaluator | Details | Meta Data
0.0000% | Matches word count | n/a | neededClean: false; words: 434
27.0288% | Dialogue to Total Word Ratio | Ratio: 75.12%, Deviation: 5.12% | neededClean: false; wordsTotal: 434; wordsDialogue: 326
13.5144%
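The reported ratio and deviation follow directly from the recorded counts: 326 dialogue words out of 434 total is 75.12%, which is 5.12 percentage points above the 70% target. The sketch below shows one way such counts could be derived from a response. It is an illustration under stated assumptions, not the benchmark's actual evaluator: the function name `dialogue_stats`, the whitespace-based word splitting, and the rule that dialogue is whatever sits between curly double quotes are all hypothetical. The formula that maps a deviation to the percentage in the Result column is not shown on this page and is not reproduced here.

```python
import re


def dialogue_stats(text: str, target_ratio: float = 0.70) -> dict:
    """Estimate how much of a scene is dialogue.

    Assumptions (not confirmed by the benchmark): words are
    whitespace-separated tokens, and dialogue is any text enclosed
    in curly double quotes (U+201C ... U+201D).
    """
    total = len(text.split())
    # Capture the text between each opening and closing curly double quote.
    spoken = re.findall(r"“([^”]*)”", text)
    dialogue = sum(len(segment.split()) for segment in spoken)
    ratio = dialogue / total if total else 0.0
    return {
        "wordsTotal": total,
        "wordsDialogue": dialogue,
        "ratio": ratio,
        "deviation": abs(ratio - target_ratio),
    }


# Arithmetic check using the counts reported in the evaluation details.
reported_ratio = 326 / 434
print(f"Ratio: {reported_ratio:.2%}")                  # Ratio: 75.12%
print(f"Deviation: {abs(reported_ratio - 0.70):.2%}")  # Deviation: 5.12%
```

A different tokenizer or quote convention (straight quotes, nested single quotes) would give different counts, so the numbers produced by this sketch need not match the evaluator's exactly.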