NC Bench
Run: 2026-03-03-01KJTXZ67X96X9YSM0QQ1EAQA8
Run Details
Model:
Gemini 3.1 Flash Lite (Preview)
Test:
Dialogue tags
Scenario:
Write 500 words with 70% dialogue
Duration:
3.7s
Cost:
$0.000929
Provider:
Google AI Studio
Evaluation summary
Score | Evaluator
99.74% | Matches word count
0.00% | Dialogue to Total Word Ratio
49.8722% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly stared at the laminated invitation taped to the dashboard, her finger tracing the cartoon fire-breathing dragon. We are really doing this, Bill? We are targeting a room full of sugar-crazed seven-year-olds for a haul of plastic toys and cheap wrapping paper? Bill adjusted his rearview mirror, squinting at his reflection to ensure his fake mustache was not peeling. It is not about the value of the goods, Molly. It is about the principle. Have you seen the registry? This kid is getting a gold-plated remote control car. That belongs in a private collection, not under a cardboard birthday crown. But these are children, Bill. They have eyes like hawks. No, they have eyes like sugar-addicts. If we time the distraction correctly, they will be too busy fighting over the bouncy castle to notice the presents disappearing. I am still not convinced. What if the clown security is elite? I heard the agency hires former special ops for these high-end backyard affairs. Stop letting the tabloids spook you. It is a guy named Giggles who blows bubbles for fifty bucks an hour. We can handle Giggles. I suppose. Did you pack the smoke pellets? I thought we agreed on glitter bombs. Glitter is a nightmare to clean up, Molly. It screams professional chaos. I have the glitter bombs. Fine, whatever. What is the extraction plan? We secure the haul, toss the glitter, and slip out through the hedge maze. Which maze? The one made of inflatable pillars. It is flimsy protection at best. You are overthinking the security of a suburban birthday party. We go in through the side gate, bypass the petting zoo, and hit the gift table before the cake cutting. And if the parents catch us? We tell them we are the cousins. Nobody ever questions the cousins at a party this large. You are devious. I am a professional. Keep the engine running and watch for the piñata. The piñata is the signal? Yes, the moment that paper donkey hits the pavement and the candy hits the grass, we move. 
The psychological barrier of the sugar rush will be our greatest tactical advantage. You have put entirely too much thought into a toddler’s birthday. That is why I am the mastermind and you are the muscle. I prefer the term associate. Just keep your eyes on the balloons, Bill. I am not going to jail for a stash of Lego sets and a gift card to an arcade. We are not going to jail. We are going to be rich in novelty items. Just focus. They are starting the musical chairs music. That is our queue. Suit up, Molly. Time to make these kids learn their first bitter lesson about gravity and theft. Put on the clown nose. I refuse to wear the nose. We agreed on the nose! It adds to the cover! Fine, but if any child tries to pull it off, I am taking their tricycle. Deal. Now move.
created_at
"2026-03-03T22:44:15.753Z"
3/3/2026, 10:44:15 PM
model
"google/gemini-3.1-flash-lite-preview-20260303"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
671
moderation_latency
(null)
generation_time
3599
tokens_prompt
77
tokens_completion
692
native_tokens_prompt
67
native_tokens_completion
608
native_tokens_completion_images
0
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"STOP"
usage
0.00092875
router
(null)
provider_responses
0
endpoint_id
"e004ce39-eb0c-4474-93d5-55072e725970"
is_byok
false
latency
671
model_permaslug
"google/gemini-3.1-flash-lite-preview-20260303"
provider_name
"Google AI Studio"
status
200
api_type
"completions"
id
"gen-1772577855-oGJrcyUC96ekK0fwmrDY"
upstream_id
(null)
total_cost
0.00092875
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Google AI Studio"
Evaluation details
Result | Evaluator | Details | Meta Data
99.74% | Matches word count | n/a | neededClean: false; words: 496
0.00% | Dialogue to Total Word Ratio | Ratio: 0.00%, Deviation: 70.00% | neededClean: false; wordsTotal: 503; wordsDialogue: 0
49.8722% | Overall
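The Dialogue to Total Word Ratio result can be read as follows. The model response contains no quotation marks, so an evaluator that counts only quoted words as dialogue would report wordsDialogue = 0, a ratio of 0.00%, and a deviation of 70.00% from the 70% target. The Overall figure is also consistent with an unweighted mean of the two evaluator scores: (99.74% + 0.00%) / 2 ≈ 49.87%. The Python sketch below illustrates both calculations. It is not NC Bench's implementation: the quote-based dialogue rule, the averaging, and the names dialogue_stats and overall_score are assumptions made for illustration.

```python
import re


def dialogue_stats(text: str, target_ratio: float = 0.70) -> dict:
    """Count words inside double quotes versus all words in the text.

    Assumption: "dialogue" means text between straight or curly double
    quotes. The real evaluator's rule is not documented on this page.
    """
    words_total = len(text.split())
    quoted_spans = re.findall(r'["“]([^"”]*)["”]', text)
    words_dialogue = sum(len(span.split()) for span in quoted_spans)
    ratio = words_dialogue / words_total if words_total else 0.0
    return {
        "wordsTotal": words_total,
        "wordsDialogue": words_dialogue,
        "ratio": ratio,
        "deviation": abs(target_ratio - ratio),
    }


def overall_score(scores: list[float]) -> float:
    """Unweighted mean of evaluator scores (assumed aggregation)."""
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Unquoted speech, as in the model response, yields zero dialogue words.
    unquoted = "Molly stared at the invitation. We are really doing this, Bill?"
    print(dialogue_stats(unquoted))

    # The same kind of line with quotation marks is counted as dialogue.
    quoted = '"We go now," said Bill.'
    print(dialogue_stats(quoted))

    # Mean of the two evaluator scores reported for this run.
    print(overall_score([0.9974, 0.0]))
```

Under this reading, the 0.00% score reflects missing quotation marks, not missing dialogue: most of the response is spoken lines, but none of them are delimited.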