NC Bench
Run: 2026-02-25-01KJ951CQT51WM9DB046HA1GJA
Run Details
Model: Gemini 3 Flash (Preview, Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 33.2s
Cost: $0.016824
Provider: Google AI Studio
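As a rough cross-check, the cost and duration reported above can be related to the token counts recorded in this run's metadata (native_tokens_prompt 67, native_tokens_completion 5597, native_tokens_reasoning 4784, generation_time 33205). The sketch below is illustrative only: the per-million-token rates are assumptions, not values stated on this page, and it assumes that reasoning tokens are already counted within the completion tokens and that generation_time is in milliseconds (consistent with the 33.2s duration).

```python
# Values copied from this run's metadata.
NATIVE_TOKENS_PROMPT = 67
NATIVE_TOKENS_COMPLETION = 5597   # assumed to include the reasoning tokens
NATIVE_TOKENS_REASONING = 4784
GENERATION_TIME_MS = 33205        # assumed milliseconds (matches the 33.2s duration)

# ASSUMPTION: USD per million tokens. These rates are not stated on the page;
# they are hypothetical inputs chosen for illustration.
ASSUMED_INPUT_RATE = 0.50
ASSUMED_OUTPUT_RATE = 3.00


def estimate_cost(prompt_tokens, completion_tokens, input_rate, output_rate):
    """Estimate the cost in USD from token counts and per-million-token rates."""
    return (prompt_tokens * input_rate + completion_tokens * output_rate) / 1_000_000


def tokens_per_second(completion_tokens, generation_time_ms):
    """Average completion throughput over the whole generation."""
    return completion_tokens / (generation_time_ms / 1000)


cost = estimate_cost(
    NATIVE_TOKENS_PROMPT, NATIVE_TOKENS_COMPLETION,
    ASSUMED_INPUT_RATE, ASSUMED_OUTPUT_RATE,
)
throughput = tokens_per_second(NATIVE_TOKENS_COMPLETION, GENERATION_TIME_MS)
reasoning_share = NATIVE_TOKENS_REASONING / NATIVE_TOKENS_COMPLETION

print(f"estimated cost: ${cost:.6f}")
print(f"throughput: {throughput:.1f} tokens/s")
print(f"reasoning share of completion tokens: {reasoning_share:.1%}")
```

Under these assumed rates the estimate comes to about $0.0168, in line with the reported cost, and roughly 85% of the completion tokens are reasoning tokens rather than visible prose.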
Evaluation summary
Score | Evaluator
0.00% | Matches word count
0.00% | Dialogue to Total Word Ratio
0.0000% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Pass me the thermal blueprints, Bill. Molly adjusted her binoculars, her eyes fixed on the suburban backyard. It’s a hand-drawn map on a Burger King napkin, Molly. And the target isn’t a vault. It’s a bouncy castle shaped like a psychedelic giraffe. Bill sighed, adjusting the neck of his costume. The geometry of an inflatable structure is complex. One wrong move and we’re swallowed by yellow vinyl. Did you scout the perimeter? I did two laps around the hibiscus bushes. There’s a high concentration of juice boxes near the swing set. Security is light. Just a clown named Puddles who looks like he’s having an existential crisis. Puddles is a professional, Molly whispered, her voice tight with pseudo-intensity. I saw him fold a balloon into a motorcycle in under twenty seconds. A man with that kind of manual dexterity is a danger to us both. He’s currently stuck in a lawn chair. I don't think he's a threat. That’s exactly what he wants you to think. It’s a classic low-profile stakeout. Status on the loot? The haul is massive. We’re looking at twelve medium-sized boxes, three oversized bags, and something that looks suspiciously like a mountain bike. The bike is the priority. It’s the crown jewel of the seven-year-old market. If we get that onto the getaway trike, we’re golden. The trike’s weight limit is forty pounds, Molly. We’re going to pop a tire. Then we use the wagon. I’ve muffled the wheels with duct tape and stolen gym socks. It’s a ghost in the grass, Bill. A silent, red-painted ghost. I’m still not sure about these disguises. Why am I the rear end of a horse? It’s a costume party. If we go in wearing balaclavas, we’ll trigger the mothers. Nobody suspects a horse. Especially one that’s interested in the gift table. I can’t see anything out of this tail-hole, Bill grumbled, fumbling with the plush fabric. You don’t need to see. You just need to haul. When I give the signal—which is a high-pitched imitation of a hawk—you move in. Why a hawk? 
Why can’t the signal just be you saying go? Because go is common. A hawk is unexpected. It creates a moment of primal confusion in the toddlers, giving us a four-second window to strike. I looked at the guest list. There’s a kid named Tyler who’s a known biter. What’s the protocol? If Tyler engages, you deploy the distraction. I’ve packed organic, gluten-free kale chips. He’ll be so offended by the lack of sugar he’ll be stunned into silence. That’s cold, Molly. Even for us. This is a high-stakes environment. It’s not about being liked. It’s about the plastic. Check your watch. The big hand is on the pizza slice, and the little hand is on the dinosaur. It’s nearly showtime. The cake is being served. The sugar peak is imminent. Once the birthday song hits the chorus, we breach the fence. Do you think they’ll have ice cream cake? I’d risk a lot for some Mint Chocolate Chip. Eyes on the prize. We’re here for the Lego, not the dairy. Do you have the tactical party poppers? Loaded and ready. On your mark. Let’s go. And remember, if a kid tries to hug you, tell them you’re a wild animal. Do not break character. I’m a stallion, Molly. A thieving, majestic stallion.
created_at: "2026-02-25T01:01:28.193Z" (2/25/2026, 1:01:28 AM)
model: "google/gemini-3-flash-preview-20251217"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1683
moderation_latency: (null)
generation_time: 33205
tokens_prompt: 77
tokens_completion: 2706
native_tokens_prompt: 67
native_tokens_completion: 5597
native_tokens_completion_images: 0
native_tokens_reasoning: 4784
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "STOP"
usage: 0.0168245
router: (null)
provider_responses:
  0:
    endpoint_id: "6e82f652-2693-4b60-bbe0-7362d6e345f0"
    is_byok: false
    latency: 1683
    model_permaslug: "google/gemini-3-flash-preview-20251217"
    provider_name: "Google AI Studio"
    status: 200
api_type: "completions"
id: "gen-1771981288-3dt0ahKdIlAFV1JaCGq8"
upstream_id: (null)
total_cost: 0.0168245
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Google AI Studio"
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 560
0.00% | Dialogue to Total Word Ratio | Ratio: 0.00%, Deviation: 70.00% | neededClean: false, wordsTotal: 571, wordsDialogue: 0
0.0000% | Overall
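
The dialogue figures above (wordsDialogue 0, Ratio 0.00%, Deviation 70.00%) are what one would expect if the evaluator counts only words inside quotation marks, since the model response contains no quotation marks at all. The benchmark's actual implementation is not shown on this page; the following is a minimal sketch under that assumption, with hypothetical function names and a simple whitespace word split that will not necessarily reproduce the evaluator's exact word counts.

```python
import re

# ASSUMPTION: dialogue is any text between straight or curly double quotes.
QUOTED = re.compile(r'["\u201c]([^"\u201c\u201d]*)["\u201d]')


def count_words(text):
    """Count whitespace-separated words."""
    return len(text.split())


def dialogue_ratio(text):
    """Fraction of all words that fall inside double quotes."""
    total = count_words(text)
    if total == 0:
        return 0.0
    spoken = sum(count_words(span) for span in QUOTED.findall(text))
    return spoken / total


def deviation(ratio, target=0.70):
    """Absolute distance between the measured ratio and the target ratio."""
    return abs(target - ratio)


# The same sentence pair without and with quotation marks.
unquoted = "Pass me the thermal blueprints, Bill. Molly adjusted her binoculars."
quoted = '"Pass me the thermal blueprints, Bill." Molly adjusted her binoculars.'

print(dialogue_ratio(unquoted), deviation(dialogue_ratio(unquoted)))
print(dialogue_ratio(quoted), deviation(dialogue_ratio(quoted)))
```

With a counter of this kind, unquoted speech contributes nothing to the dialogue total, so a response written entirely without quotation marks sits the full 70 percentage points from the target regardless of how much of it is spoken lines.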