NC Bench
Run: 2026-02-25-01KJ95Y5PRREHS6ZPN7J71HKFQ
Run Details
Model: Gemini 3 Flash (Preview, Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 22.7s
Cost: $0.009669
Provider: Google AI Studio
Evaluation summary
Score | Evaluator | Details Link
100.00% | Matches word count | (details)
0.00% | Dialogue to Total Word Ratio | (details)
49.9995% | Overall |
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The van smelled like stale coffee and desperation. Molly peered through her high-powered binoculars at the colorful streamers fluttering across the street. The target is in sight, Molly whispered. Seven-year-old male. Answers to the name Toby. Currently distracted by a Tier-One entertainment specialist. Bill looked over her shoulder, chewing on a lukewarm piece of pepperoni pizza. You mean the guy in the oversized shoes making balloon giraffes? He looks like he is about to have a heat stroke. His physical condition is irrelevant, Molly snapped. What matters is the distraction window. We have exactly twelve minutes before the pinata is breached. Once the candy hits the grass, it is a swarm. Pure chaos. We won't be able to move through the yard. I still don't get why we need the grappling hook, Bill said, gesturing to the heavy metal claw sitting on his lap. The gate isn't even locked. I checked it this morning while I was pretending to be a jogger. Style points matter, Bill. Plus, we need to maintain a vertical advantage. Now, did you secure the floor plan? Bill handed her a crumpled flyer for the party. It is a backyard, Molly. There is a patio, a swing set, and a table piled with boxes wrapped in bright paper. It is not the Louvre. To the untrained eye, perhaps, Molly said, her eyes narrowing. But I see a minefield of discarded juice boxes and potential witnesses in Velcro sneakers. Did you pack the acoustic dampeners? The bubble wrap? I told you to call them dampeners. They are for wrapping the noisy plastic toys. If Toby’s grandmother hears a fire truck siren going off mid-heist, the local authorities will be on us in seconds. The local authorities are Toby’s dad, who is currently failing to light the grill. I think we are safe. Molly checked her watch, tapping the glass with a gloved finger. Complacency is the mother of failure. We go in through the azaleas, bypass the bouncy castle, and hit the gift table. 
I want every Marvel action figure and that high-end drone his uncle bragged about on Facebook. What about the giant teddy bear? Bill asked. It looks soft. No fluff, Bill. We have been over this. High-value, low-volume. We are not running a petting zoo. It is just so large. It would make a great cushion for the getaway. Focus! Is the extraction vehicle ready? Bill looked at the dented sedan. The engine is idling. I put a fake license plate over the real one. It says Best Mom. Brilliant. No one suspects a soccer mom. Now, on my mark. Wait, I forgot my mask. You are wearing it, Bill. It is on your forehead. Bill pulled the neon green balaclava down over his eyes. Right. Everything is dark now. Very professional. Lift it up an inch! We have three minutes. Toby is reaching for the bat. The pinata’s days are numbered. I am ready. For the toys. For the score, Bill. For the score.
created_at: "2026-02-25T01:17:11.268Z" (2/25/2026, 1:17:11 AM)
model: "google/gemini-3-flash-preview-20251217"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1779
moderation_latency: (null)
generation_time: 22387
tokens_prompt: 77
tokens_completion: 1929
native_tokens_prompt: 67
native_tokens_completion: 3212
native_tokens_completion_images: 0
native_tokens_reasoning: 2544
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "STOP"
usage: 0.0096695
router: (null)
provider_responses: 0
endpoint_id: "6e82f652-2693-4b60-bbe0-7362d6e345f0"
is_byok: false
latency: 1779
model_permaslug: "google/gemini-3-flash-preview-20251217"
provider_name: "Google AI Studio"
status: 200
api_type: "completions"
id: "gen-1771982231-FyeZiVpeB8uGt28A4LAt"
upstream_id: (null)
total_cost: 0.0096695
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Google AI Studio"
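The token counts, generation time, and total cost in the metadata can be cross-checked with a little arithmetic. The following is a minimal sketch, not the benchmark's own accounting: the per-million-token rates are assumptions (the run record does not state them), and treating native_tokens_reasoning as a subset of native_tokens_completion is also an assumption. The helper name estimate_cost is hypothetical.

```python
# Values copied from the run metadata.
native_tokens_prompt = 67
native_tokens_completion = 3212
native_tokens_reasoning = 2544
generation_time_ms = 22387
reported_total_cost = 0.0096695

# ASSUMPTION: USD rates per one million tokens. These are not stated in the
# run record; they are illustrative values chosen for this sketch.
ASSUMED_INPUT_RATE = 0.50
ASSUMED_OUTPUT_RATE = 3.00


def estimate_cost(prompt_tokens, completion_tokens, input_rate, output_rate):
    """Estimate cost in USD from token counts and per-million-token rates."""
    return (prompt_tokens * input_rate + completion_tokens * output_rate) / 1_000_000


estimated_cost = estimate_cost(
    native_tokens_prompt,
    native_tokens_completion,
    ASSUMED_INPUT_RATE,
    ASSUMED_OUTPUT_RATE,
)

# ASSUMPTION: reasoning tokens are counted inside the completion total, so the
# remainder approximates the tokens of the visible response.
visible_tokens = native_tokens_completion - native_tokens_reasoning

# Throughput over the generation phase only (generation_time is in milliseconds).
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(f"estimated cost:    ${estimated_cost:.7f}")
print(f"reported cost:     ${reported_total_cost:.7f}")
print(f"visible tokens:    {visible_tokens}")
print(f"tokens per second: {tokens_per_second:.1f}")
```

Under these assumed rates the estimate agrees with the reported total_cost, which suggests that the reasoning tokens are billed as completion tokens; the record itself does not confirm this.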
Evaluation details
Result | Evaluator | Details | Meta Data
100.00% | Matches word count | n/a | neededClean: false, words: 501
0.00% | Dialogue to Total Word Ratio | Ratio: 0.00%, Deviation: 70.00% | neededClean: false, wordsTotal: 509, wordsDialogue: 0
49.9995% | | |
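The response above contains no quotation marks, which would explain a wordsDialogue count of 0 if the evaluator detects dialogue by quoted spans. The sketch below illustrates that idea only; the benchmark's actual evaluator is not shown on this page, so the quote-based rule, the whitespace word split, and the names dialogue_ratio and deviation are all assumptions.

```python
import re

# ASSUMPTION: dialogue is any text enclosed in straight or curly double quotes.
QUOTED = re.compile(r'"([^"]*)"|“([^”]*)”')


def count_words(text):
    """Count whitespace-separated words."""
    return len(text.split())


def dialogue_ratio(text):
    """Return the fraction of words that fall inside quoted spans."""
    total = count_words(text)
    if total == 0:
        return 0.0
    # findall yields one (straight, curly) tuple per match; one side is empty.
    spoken = sum(count_words(a or b) for a, b in QUOTED.findall(text))
    return spoken / total


def deviation(ratio, target=0.70):
    """Absolute distance from the requested dialogue share."""
    return abs(ratio - target)


unquoted = "The target is in sight, Molly whispered."
quoted = '"The target is in sight," Molly whispered.'

for sample in (unquoted, quoted):
    r = dialogue_ratio(sample)
    print(f"ratio={r:.2%} deviation={deviation(r):.2%}")
```

With this rule, the unquoted sentence scores a ratio of 0% and a deviation of 70%, the same figures the evaluation details report for the full response, while the quoted variant counts five of its seven words as dialogue.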