NC Bench
Run: 2026-02-25-01KJ94QT7WKC0EPSFW0CHB1THW
Run Details
Model: Gemini 3 Flash (Preview, Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 54.5s
Cost: $0.026649
Provider: Google AI Studio
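The Duration and Cost figures can be related to the token counts recorded in the run metadata further down this page. The sketch below is only a consistency check, not the billing logic of the provider. The two per-million-token rates are assumptions that do not appear anywhere on this page; they were chosen because they reproduce the reported cost. It also assumes that `native_tokens_completion` includes the reasoning tokens and that both are billed at the output rate.

```python
# Values copied from the run metadata on this page.
native_tokens_prompt = 67
native_tokens_completion = 8872
native_tokens_reasoning = 8191
generation_time_ms = 54506
reported_cost = 0.0266495

# ASSUMED rates in USD per million tokens; not stated on this page.
ASSUMED_INPUT_RATE = 0.50
ASSUMED_OUTPUT_RATE = 3.00

# Tokens left over once reasoning tokens are subtracted (assumed to be
# the visible answer).
visible_tokens = native_tokens_completion - native_tokens_reasoning

# Fraction of the completion budget spent on reasoning.
reasoning_share = native_tokens_reasoning / native_tokens_completion

# Cost estimate under the assumed rates.
estimated_cost = (
    native_tokens_prompt * ASSUMED_INPUT_RATE
    + native_tokens_completion * ASSUMED_OUTPUT_RATE
) / 1_000_000

# Throughput over the recorded generation time.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(f"visible tokens: {visible_tokens}")
print(f"reasoning share: {reasoning_share:.1%}")
print(f"estimated cost: ${estimated_cost:.7f} (reported ${reported_cost:.7f})")
print(f"throughput: {tokens_per_second:.1f} tokens/s")
```

Under these assumptions most of the cost comes from reasoning tokens rather than from the roughly 500-word scene itself.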
Evaluation summary
Matches word count: 90.48%
Dialogue to Total Word Ratio: 97.06%
Overall: 93.7731%
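As a rough illustration of what a dialogue-to-total word ratio measures, here is a minimal sketch. It is not the benchmark's evaluator, whose implementation is not shown on this page. The function name `dialogue_ratio` is hypothetical, and the sketch assumes that dialogue is any text between straight double quotes and that words are whitespace-separated tokens.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for a passage.

    Assumptions (not taken from the benchmark): dialogue is the text
    between straight double quotes, and a word is any
    whitespace-separated token.
    """
    total_words = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


sample = 'Bill sighed. "Go now," she said.'
dialogue, total, ratio = dialogue_ratio(sample)
print(dialogue, total, f"{ratio:.2%}")
```

A different tokenizer, or different handling of curly quotes and apostrophes, would give different counts, so results from this sketch should not be expected to match the reported scores exactly.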
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Bill adjusted his rubber nose in the rearview mirror. Outside, the target loomed—a ranch decorated with neon streamers. Molly checked her tactical stopwatch, her face a mask of joyless intensity. She adjusted her turtleneck, believing it added sophisticated larceny. "The bouncy castle is at sixty percent capacity, Bill. If we don’t move when the juice boxes are distributed, our window slams shut. Does the nose look too red? I feel like red says shifty, but polka-dots say untrustworthy. I need to know if I'm a convincing clown or just a man in a costume." Bill squeezed a rubber bulb in his pocket, producing a mournful honking sound. He looked like a laundry accident in a paint factory. Molly pulled up a digital blueprint that was actually a grease-stained napkin from a diner. Her eyes narrowed as she spotted a small child holding a juice pouch. "It’s a five-year-old’s birthday, not the Met Gala. Focus on the mission. You’re the distraction. You enter the patio, drop the balloon animals, and create a bottleneck at the donkey station. Just don't get pinned yourself." Bill peered through binoculars held together by duct tape. Toddlers scrambled across the lawn like sticky-handed locusts. He checked his pockets for gear, feeling the heavy weight of the plastic whistles and the secondary distraction units. He was sweating through his white makeup. "What about Aunt Susan? She’s a high-velocity hugger. If she latches on, the operation is compromised. Also, are we leaving the goody bags? I heard they have those whistles that sound like dying birds. I want one." Molly tapped the dashboard. She was already visualizing the Lego Millennium Falcon in the trunk. It was the crown jewel of the nursery collection, and she wasn't letting a sugar-crazed toddler stand in her way. She popped a piece of caffeine gum into her mouth. "Amateur hour is over. We take the big boxes, ghost the cake service, and disappear before the professional clown arrives. You’re an Entertainment Consultant. 
He’s Bozzo the Disappointing. There’s a hierarchy. Just use the glitter cannons." Bill took a deep breath and opened the car door. He stepped into the sunlight. His oversized shoes hit the pavement with a heavy thud. He looked back at Molly, his wig slightly crooked from the nerves. "I'm going in. If I don't make it back, tell my mother I died doing something more dignified than stealing blocks. And remember, the handkerchief trick isn't infinite. It’s four yards of polyester. It’s a finite trick." Molly didn't watch him go. She was timing the interval between the pizza delivery and the inevitable sugar crash. She hummed a tune, her fingers drumming as the first scream erupted from the backyard. She watched a neighbor's cat flee the scene in terror. "Good luck, you magnificent fool. Just keep them away from the Millennium Falcon until I give the signal. If anyone asks for a giraffe, tell them those are for winners only. This is serious, birthday-themed business. Now get out there and make those children wonder where their childhood went."
created_at: "2026-02-25T00:56:14.337Z" (2/25/2026, 12:56:14 AM)
model: "google/gemini-3-flash-preview-20251217"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1957
moderation_latency: (null)
generation_time: 54506
tokens_prompt: 77
tokens_completion: 4594
native_tokens_prompt: 67
native_tokens_completion: 8872
native_tokens_completion_images: 0
native_tokens_reasoning: 8191
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "STOP"
usage: 0.0266495
router: (null)
provider_responses
0
endpoint_id: "6e82f652-2693-4b60-bbe0-7362d6e345f0"
is_byok: false
latency: 1957
model_permaslug: "google/gemini-3-flash-preview-20251217"
provider_name: "Google AI Studio"
status: 200
api_type: "completions"
id: "gen-1771980974-FGQ5uwot8j04csTsk0eA"
upstream_id: (null)
total_cost: 0.0266495
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Google AI Studio"
Evaluation details
Matches word count: 90.48%
Details: n/a
Meta Data: neededClean = false, words = 510
Dialogue to Total Word Ratio: 97.06%
Details: Ratio: 49.23%, Deviation: 0.77%
Meta Data: neededClean = false, wordsTotal = 518, wordsDialogue = 255
Overall: 93.7731%
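The ratio and deviation reported above can be recomputed from the recorded word counts, and the overall score appears consistent with an unweighted mean of the two evaluator scores. The sketch below checks only that arithmetic. The 50% target is taken from the scenario description. The averaging rule is an inference from the numbers, not something this page documents. How a deviation is mapped to a percentage score (for example 0.77% to 97.06%, or 510 words to 90.48%) is not shown here, so it is not reproduced.

```python
# Values copied from the evaluation details above.
words_dialogue = 255
words_total = 518
target_pct = 50.0  # from the scenario: "500 words with 50% dialogue"
score_word_count = 90.48
score_dialogue_ratio = 97.06

# Share of dialogue words and its distance from the target.
ratio_pct = 100 * words_dialogue / words_total
deviation_pct = abs(target_pct - ratio_pct)

# ASSUMPTION: overall is the plain mean of the evaluator scores.
overall_estimate = (score_word_count + score_dialogue_ratio) / 2

print(f"ratio: {ratio_pct:.2f}%")
print(f"deviation: {deviation_pct:.2f}%")
print(f"overall (mean of rounded scores): {overall_estimate:.2f}%")
```

The small gap between this estimate and the reported 93.7731% would be explained if the page averages unrounded scores, which are not displayed.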