Model: Gemini 2.5 Pro
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Run ID: 2025-11-18-01KACP6W383HZREFZ12RZMPZRV

Model input

User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.

Model response

AI
Molly jabbed a greasy finger at the blueprint, a hastily sketched floor plan on the back of a pizza menu. The air in the van was thick with the smell of stale coffee and failure. Across the street, a suburban house pulsed with the chaotic energy of a sugar-fuelled riot, the muffled shrieks of children punctuated by dreadful pop music. A bouncy castle, shaped like a vaguely reptilian blob, sagged in the manicured backyard like a defeated beast. Bill sighed beside her. “Let’s go over it one last time. Phase one: Infiltration.” “Why do we have to call it that? Can’t we just say ‘going in’?” “Because we’re professionals, Bill. This is Operation Birthday Bash.” “I thought we agreed on Operation Sugar Heist.” “No, you agreed on that. Alone. In your head.” Molly slid a finger across the map, tracing their route. “I’ll create the diversion here, in the primary entertainment zone.” “You mean the living room.” “While I have the hostiles… I mean, the children… distracted, you proceed to the asset location.” Bill raised his binoculars, a deep frown creasing his forehead as he scanned the party through the grimy windshield. “You mean the present pile.” “Yes, Bill. The present pile.” He adjusted the focus, zeroing in on a man in a rainbow wig twisting a long, yellow balloon. “We have a problem. There’s an unvetted operative on the scene. Full face paint, oversized shoes. It’s a clown.” “He’s not an operative. He’s here to make balloon animals. He’s a non-issue.” “He looks shifty. Never trust a man who can fit into a car that small.” Molly let out a slow, deliberate breath, visibly restraining herself from grabbing the tire iron under her seat. She held up a fluffy white cottontail. “This is your primary concern. Your disguise.” “I still think we should have gone as pest control.” “No one hires pest control for a six-year-old’s party. We are the magic act. You are Barnaby, the Bewildered Bunny.” “Why am I always bewildered?” “It suits you. 
Now, your window is precisely three minutes, during my ‘endless scarf from the mouth’ trick. You grab the bags, exfiltrate through the kitchen, and rendezvous at the van. Do not, under any circumstances, engage with the cake.” “What if it’s carrot cake? That feels very on-brand for Barnaby.” “It’s not, and you’re not eating it.” He sighed, the sound of a man defeated by cruel fate and a lack of cake. “Fine. But if I get accosted by that clown, I’m using the silly string.” “There is no silly string. That was never approved.” Molly’s glare was withering. She watched Bill reluctantly pull on a pair of oversized, floppy ears, his reflection in the window revealing a truly pathetic-looking rabbit. It was a new low. “Just remember the signal. Two ear wiggles means abort.” “And one wiggle?” “Means you’ve got frosting on your whiskers and I’m leaving you behind.” He adjusted the ears, giving them a tentative flap. “Okay. Let’s go steal some kid’s birthday.”
created_at: "2025-11-18T23:54:49.040612+00:00" (Tuesday, November 18th, 2025 at 11:54:49 PM GMT+00:00)
model: "google/gemini-2.5-pro"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1975
moderation_latency: (null)
generation_time: 64632
tokens_prompt: 68
tokens_completion: 2590
native_tokens_prompt: 67
native_tokens_completion: 8109
native_tokens_completion_images: 0
native_tokens_reasoning: 7400
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "STOP"
usage: 0.08117375
api_type: "completions"
id: "gen-1763510022-LEcEKuR8cGcsxDiLjRI8"
upstream_id: (null)
total_cost: 0.08117375
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Google AI Studio"

Evaluation details

Result: 95.9868%
Evaluator: Matches word count
Details: n/a
Meta Data: neededClean: false, words: 508

Result: 3.2786%
Evaluator: Dialogue to Total Word Ratio
Details: Ratio: 58.27%, Deviation: 8.27%
Meta Data: neededClean: false, wordsTotal: 508, wordsDialogue: 296
49.6327%
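
The ratio and deviation reported above follow directly from the two word counts in the meta data (296 dialogue words out of 508 total, against a 50% target). The sketch below shows that arithmetic, plus one plausible way to obtain such counts from a text. The benchmark's actual evaluator is not shown in this report, so the function names (`dialogue_stats`, `ratio_from_counts`), the whitespace-based word count, and the rule that dialogue is whatever sits between curly double quotes are all assumptions made for illustration. The scoring formula that turns a deviation into the Result percentage is not given here and is not reproduced.

```python
import re

# Assumption: dialogue is the text between curly double quotes (“ … ”),
# which is how the model response in this report marks speech.
DIALOGUE_RE = re.compile(r"“([^”]*)”")


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (an assumed definition of 'word')."""
    return len(text.split())


def dialogue_stats(text: str, target_ratio: float = 0.5) -> dict:
    """Return total words, dialogue words, ratio, and deviation from the target."""
    words_total = count_words(text)
    words_dialogue = sum(count_words(span) for span in DIALOGUE_RE.findall(text))
    ratio = words_dialogue / words_total if words_total else 0.0
    return {
        "wordsTotal": words_total,
        "wordsDialogue": words_dialogue,
        "ratio": ratio,
        "deviation": abs(ratio - target_ratio),
    }


def ratio_from_counts(words_dialogue: int, words_total: int,
                      target_ratio: float = 0.5) -> tuple:
    """Return (ratio, deviation) in percent, rounded to two decimals."""
    ratio = words_dialogue / words_total
    return round(ratio * 100, 2), round(abs(ratio - target_ratio) * 100, 2)


if __name__ == "__main__":
    # Counts taken from the meta data of this run.
    print(ratio_from_counts(296, 508))
```

With the counts from this run, `ratio_from_counts(296, 508)` gives 58.27 and 8.27, matching the Details line. The final 49.6327% figure also equals the arithmetic mean of the two Result values, (95.9868 + 3.2786) / 2, although the report does not say how it is computed.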