NC Bench
Run: 2026-02-25-01KJ95PT7523SMP7ES72820QM7
Run Details
Model:
Gemini 3 Flash (Preview, Reasoning)
Test:
Dialogue tags
Scenario:
Write 500 words with 70% dialogue
Duration:
20.7s
Cost:
$0.009156
Provider:
Google AI Studio
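The duration above can be combined with the token counts listed in the Meta fields of this run to get a rough throughput figure. The sketch below only rearranges numbers reported on this page. It assumes that native_tokens_completion includes the reasoning tokens, which the page does not state, and the variable names are illustrative.

```python
# Values copied from the Meta fields of this run.
generation_time_ms = 20646
native_tokens_completion = 3041
native_tokens_reasoning = 2260

# Assumption: native_tokens_completion already includes the reasoning tokens,
# so the visible (non-reasoning) output is the difference between the two.
visible_tokens = native_tokens_completion - native_tokens_reasoning
reasoning_share = native_tokens_reasoning / native_tokens_completion
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(visible_tokens)                # 781
print(f"{reasoning_share:.1%}")      # 74.3%
print(f"{tokens_per_second:.1f}")    # 147.3
```

Under that assumption, roughly three quarters of the generated tokens were spent on reasoning rather than on the scene itself.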
Evaluation summary
Score      Evaluator
35.00%     Matches word count
0.00%      Dialogue to Total Word Ratio
17.5011%   Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Pass me the binoculars, Bill. Not the toy ones. The tactical ones. They’re the same thing, Molly. I just painted these black so they wouldn’t glint in the sun. I can still see the Fisher-Price logo under the spray paint. It adds character. Besides, the magnification is surprisingly good for something intended for ages three and up. Look at that juice box situation at three o'clock. It’s a sticky-fingered disaster zone. Focus, Bill. We aren't here for the Capri Suns. Do you see the objective? You mean the mountainous pile of brightly wrapped loot guarded by a man in a polyester dog suit? That’s Barnaby the Beagle. He’s our primary obstacle. He’s a teenager named Kevin who gets paid in pizza and self-loathing. I can take him. Don't underestimate him. Kevin has 20/20 vision and a desperate need to impress the birthday girl’s older sister. He’s a wildcard. I’ll distract him with a fake delivery. Or a hornet’s nest. No biological weapons, Bill. We discussed this. We go in clean, we come out clean. Clean? Molly, there's a bubble machine running at maximum capacity. Within ten minutes, the entire lawn will be a slip-and-slide of industrial-grade soap. We’re going to look like we’ve been through a car wash. That’s why we’re wearing the grip-sole boots. Now, what’s the status of the secondary threat? The toddlers? They’ve entered the sugar-rush phase. It’s chaotic. One of them just tried to eat a napkin. Another is crying because his balloon is looking at him funny. The window of opportunity is opening. Excellent. When the cake comes out, the parents will gravitate toward the patio like moths to a flame. That’s when we breach the hedge. What about the dog? Not the guy in the suit, the actual golden retriever. The beast? He’s a professional. He’ll bark at anything with a pulse. I brought the high-yield bacon strips. He’ll be my best friend for exactly forty-five seconds. That’s all the time I need. 
I’ll slide behind the bouncy castle, neutralize the gift table, and bag the high-value targets. Are we taking the LEGO Death Star? Please tell me we’re taking the Death Star. It’s top of the manifest, Bill. Along with the motorized Jeep. How are we getting a motorized Jeep over a six-foot cedar fence? The winch, Bill. Why do you think I made you install a winch on the van? I thought that was for when you get stuck in the Starbucks drive-thru again. That was once, and those curbs are deceptively high. Are you ready? Mask on? Check. Bacon ready? Check. Slight sense of moral ambiguity? Check. Wait, which mask did you bring? The sparkly one with the trunk. It’s a festive elephant. Bill, we’re supposed to be low-profile. It’s a safari-themed party, Molly! I’m blending in! Just get behind the hedge. On my signal. Which is? The sound of a kazoo. It’s the only thing that won’t sound out of place. You’re a genius, Molly. A terrifying, gift-stealing genius. I know. Now, let’s go make a six-year-old’s birthday very, very educational regarding the concepts of loss and security.
created_at
"2026-02-25T01:13:10.122Z"
2/25/2026, 1:13:10 AM
model
"google/gemini-3-flash-preview-20251217"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1655
moderation_latency
(null)
generation_time
20646
tokens_prompt
77
tokens_completion
1771
native_tokens_prompt
67
native_tokens_completion
3041
native_tokens_completion_images
0
native_tokens_reasoning
2260
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"STOP"
usage
0.0091565
router
(null)
provider_responses
0
endpoint_id
"6e82f652-2693-4b60-bbe0-7362d6e345f0"
is_byok
false
latency
1653
model_permaslug
"google/gemini-3-flash-preview-20251217"
provider_name
"Google AI Studio"
status
200
api_type
"completions"
id
"gen-1771981990-MGuwEJzioExKWT84iC7v"
upstream_id
(null)
total_cost
0.0091565
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Google AI Studio"
Evaluation details
Result     Evaluator                      Details                           Meta Data
35.00%     Matches word count             n/a                               neededClean: false; words: 518
0.00%      Dialogue to Total Word Ratio   Ratio: 0.00%, Deviation: 70.00%   neededClean: false; wordsTotal: 536; wordsDialogue: 0
17.5011%   Overall
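The 0.00% dialogue ratio is easier to interpret with a concrete counter in hand. The model response contains no quotation marks, so an evaluator that counts only words inside quotes would find zero dialogue words. The sketch below is a hypothetical illustration of that idea, not NC Bench's evaluator: the function names, the quote-matching pattern, and the whitespace-based word count are all assumptions.

```python
import re

# Hypothetical pattern: text inside straight or curly double quotes.
QUOTED = re.compile(r'"([^"]*)"|“([^”]*)”')

TARGET_RATIO = 0.70  # the scenario asks for 70% dialogue


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (an assumed word definition)."""
    return len(text.split())


def dialogue_ratio(text: str) -> float:
    """Fraction of words that sit inside double quotation marks."""
    total = count_words(text)
    if total == 0:
        return 0.0
    spoken = sum(count_words(a or b) for a, b in QUOTED.findall(text))
    return spoken / total


def deviation(text: str) -> float:
    """Absolute distance between the measured ratio and the target."""
    return abs(dialogue_ratio(text) - TARGET_RATIO)


quoted = '"Pass me the binoculars, Bill," said Molly.'
unquoted = "Pass me the binoculars, Bill. Not the toy ones."

print(f"{dialogue_ratio(quoted):.2%}")    # 71.43%
print(f"{dialogue_ratio(unquoted):.2%}")  # 0.00%
print(f"{deviation(unquoted):.2%}")       # 70.00%
```

With a counter like this, unquoted speech yields a ratio of 0.00% and a deviation of 70.00%, which matches the figures reported above. The overall 17.5011% is also close to the unweighted mean of the two displayed scores, (35.00 + 0.00) / 2 = 17.5, although the page does not say how the overall score is aggregated.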