NC Bench
Model: Gemini 2.5 Pro
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Run ID: 2025-11-18-01KACP6W383HZREFZ12RZMPZRV
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly jabbed a greasy finger at the blueprint, a hastily sketched floor plan on the back of a pizza menu. The air in the van was thick with the smell of stale coffee and failure. Across the street, a suburban house pulsed with the chaotic energy of a sugar-fuelled riot, the muffled shrieks of children punctuated by dreadful pop music. A bouncy castle, shaped like a vaguely reptilian blob, sagged in the manicured backyard like a defeated beast. Bill sighed beside her.
“Let’s go over it one last time. Phase one: Infiltration.”
“Why do we have to call it that? Can’t we just say ‘going in’?”
“Because we’re professionals, Bill. This is Operation Birthday Bash.”
“I thought we agreed on Operation Sugar Heist.”
“No, you agreed on that. Alone. In your head.” Molly slid a finger across the map, tracing their route. “I’ll create the diversion here, in the primary entertainment zone.”
“You mean the living room.”
“While I have the hostiles… I mean, the children… distracted, you proceed to the asset location.”
Bill raised his binoculars, a deep frown creasing his forehead as he scanned the party through the grimy windshield. “You mean the present pile.”
“Yes, Bill. The present pile.”
He adjusted the focus, zeroing in on a man in a rainbow wig twisting a long, yellow balloon. “We have a problem. There’s an unvetted operative on the scene. Full face paint, oversized shoes. It’s a clown.”
“He’s not an operative. He’s here to make balloon animals. He’s a non-issue.”
“He looks shifty. Never trust a man who can fit into a car that small.”
Molly let out a slow, deliberate breath, visibly restraining herself from grabbing the tire iron under her seat. She held up a fluffy white cottontail. “This is your primary concern. Your disguise.”
“I still think we should have gone as pest control.”
“No one hires pest control for a six-year-old’s party. We are the magic act. You are Barnaby, the Bewildered Bunny.”
“Why am I always bewildered?”
“It suits you. Now, your window is precisely three minutes, during my ‘endless scarf from the mouth’ trick. You grab the bags, exfiltrate through the kitchen, and rendezvous at the van. Do not, under any circumstances, engage with the cake.”
“What if it’s carrot cake? That feels very on-brand for Barnaby.”
“It’s not, and you’re not eating it.”
He sighed, the sound of a man defeated by cruel fate and a lack of cake. “Fine. But if I get accosted by that clown, I’m using the silly string.”
“There is no silly string. That was never approved.” Molly’s glare was withering. She watched Bill reluctantly pull on a pair of oversized, floppy ears, his reflection in the window revealing a truly pathetic-looking rabbit. It was a new low. “Just remember the signal. Two ear wiggles means abort.”
“And one wiggle?”
“Means you’ve got frosting on your whiskers and I’m leaving you behind.”
He adjusted the ears, giving them a tentative flap. “Okay. Let’s go steal some kid’s birthday.”
created_at: "2025-11-18T23:54:49.040612+00:00" (Tuesday, November 18th, 2025 at 11:54:49 PM GMT+00:00)
model: "google/gemini-2.5-pro"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1975
moderation_latency: (null)
generation_time: 64632
tokens_prompt: 68
tokens_completion: 2590
native_tokens_prompt: 67
native_tokens_completion: 8109
native_tokens_completion_images: 0
native_tokens_reasoning: 7400
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "STOP"
usage: 0.08117375
api_type: "completions"
id: "gen-1763510022-LEcEKuR8cGcsxDiLjRI8"
upstream_id: (null)
total_cost: 0.08117375
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Google AI Studio"
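The token, timing, and cost fields in this run record can be cross-checked with a little arithmetic. The sketch below is illustrative only and rests on assumptions that the page does not state: that generation_time is in milliseconds, that native_tokens_reasoning is counted inside native_tokens_completion, and that the per-token rates (the hypothetical constants INPUT_USD_PER_MTOK and OUTPUT_USD_PER_MTOK) are 1.25 and 10 USD per million native tokens. With those assumed rates, the estimate reproduces the reported total_cost.

```python
# Values copied from the run metadata.
native_tokens_prompt = 67
native_tokens_completion = 8109
native_tokens_reasoning = 7400
generation_time_ms = 64632  # ASSUMPTION: the unit is milliseconds
reported_total_cost = 0.08117375

# ASSUMPTION: per-million-token rates in USD; not stated in the run record.
INPUT_USD_PER_MTOK = 1.25
OUTPUT_USD_PER_MTOK = 10.0


def derived_stats():
    """Derive a few summary figures from the raw metadata fields."""
    # ASSUMPTION: reasoning tokens are a subset of completion tokens.
    visible_tokens = native_tokens_completion - native_tokens_reasoning
    reasoning_share = native_tokens_reasoning / native_tokens_completion
    tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)
    estimated_cost = (
        native_tokens_prompt * INPUT_USD_PER_MTOK
        + native_tokens_completion * OUTPUT_USD_PER_MTOK
    ) / 1_000_000
    return {
        "visible_tokens": visible_tokens,
        "reasoning_share": reasoning_share,
        "tokens_per_second": tokens_per_second,
        "estimated_cost_usd": estimated_cost,
    }


stats = derived_stats()
for name, value in stats.items():
    print(f"{name}: {value}")
print("matches reported cost:",
      abs(stats["estimated_cost_usd"] - reported_total_cost) < 1e-9)
```

Under these assumptions, most of the completion budget went to reasoning rather than to the visible scene, which is worth keeping in mind when comparing cost across models.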
Evaluation details
Result | Evaluator | Details | Meta Data
95.9868% | Matches word count | n/a | neededClean: false, words: 508
3.2786% | Dialogue to Total Word Ratio | Ratio: 58.27%, Deviation: 8.27% | neededClean: false, wordsTotal: 508, wordsDialogue: 296
49.6327%
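The reported ratio and deviation follow directly from the two word counts. As a rough sketch, assuming dialogue means text inside curly double quotes and words are whitespace-separated tokens: the evaluator's actual tokenization and the formula that turns a deviation into a percentage score are not shown on this page, and count_words, dialogue_stats, ratio_and_deviation, and TARGET_RATIO are hypothetical names.

```python
import re

TARGET_RATIO = 0.50  # from the scenario: 50% dialogue


def count_words(text):
    """Count whitespace-separated tokens (ASSUMPTION: evaluator may differ)."""
    return len(text.split())


def dialogue_stats(text):
    """Return (dialogue_words, total_words) for text using curly double quotes."""
    spoken = re.findall(r"“([^”]*)”", text)
    dialogue_words = sum(count_words(s) for s in spoken)
    return dialogue_words, count_words(text)


def ratio_and_deviation(dialogue_words, total_words):
    """Return the dialogue ratio and its absolute distance from the target."""
    ratio = dialogue_words / total_words
    return ratio, abs(ratio - TARGET_RATIO)


# Counts reported by the evaluator for this run.
ratio, deviation = ratio_and_deviation(296, 508)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 58.27%, Deviation: 8.27%"

# Small usage example on a fragment of the scene.
sample = "Bill sighed. “Why am I always bewildered?” “It suits you.”"
print(dialogue_stats(sample))  # (8, 10)
```

The final figure of 49.6327% equals (95.9868 + 3.2786) / 2, which suggests, but does not confirm, that the overall result is a plain average of the two evaluator scores.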