NC Bench
Run: 2026-03-03-01KJTY2PJMY8BWVTSSJNQDY8T9
Run Details
Model: Gemini 3.1 Flash Lite (Preview)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 4.0s
Cost: $0.001002
Provider: Google AI Studio
Evaluation summary
Score      Evaluator
93.65%     Matches word count
0.00%      Dialogue to Total Word Ratio
46.8248%   Overall
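The overall figure appears consistent with an unweighted mean of the two evaluator scores: (93.65 + 0.00) / 2 = 46.825, which agrees with the reported 46.8248% if the first score is slightly below 93.65 before rounding. The page does not state the aggregation rule, so the following is only a sketch under that assumption; `overall_score` is a hypothetical name, not part of NC Bench.

```python
from statistics import mean


def overall_score(evaluator_scores):
    """Unweighted mean of per-evaluator percentages (assumed aggregation rule)."""
    return mean(evaluator_scores)


# Rounded per-evaluator scores as displayed in the evaluation summary.
print(round(overall_score([93.65, 0.00]), 3))
```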
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly stared at the neon-splattered invitation, her brow furrowed with genuine concern. Bill, meanwhile, was busy calibrating a walkie-talkie he had salvaged from a literal dumpster. Is this really happening, Bill? We are targeting a six-year-old? It is a milestone birthday, Molly. Every milestone means high-end loot. Do you have any idea how much a pristine, limited-edition space shuttle set goes for on the open market? I have a moral compass, Bill. It points away from bouncy castles. Your compass is broken. We are experts. We steal from the elite. These parents are tech moguls. Their gifts are worth more than my sedan. It is a circus-themed backyard extravaganza. There will be clowns. I hate clowns. Clowns are merely men in greasepaint. You can handle one clown. Focus on the mountain of gift-wrapped gold in the center of the lawn. What is the entry strategy? You cannot jump a fence in tactical gear without looking like a deranged security guard. We go as party performers. I have two sets of overalls and a massive bag of balloons. Balloons? You expect me to fold air into shapes while you stuff a remote-controlled car into your trousers? It is the perfect cover. Nobody questions the guy twisting a poodle shape while he scans the perimeter for alarm sensors. And what about the cake? I want the cake. We are here for the presents, not the confectionery. Stay focused. You are no fun, Bill. A heist without a snack is just a job. Fine, steal a cupcake. But if you get frosting on the laser-tripwire mesh, I am leaving you behind. The tripwire mesh is a fairy-light display, remember? This is a children's party, not the Louvre. The dad put up a high-security perimeter to track the magician. I checked the blueprints. You checked the blueprints for a kid’s birthday? I am a professional. I checked the lawn drainage and the backyard gate hinges. We enter from the hedge row at fifteen hundred hours. Exactly when the piñata explodes. The chaos will mask our movements. Precisely. 
You distract the parents with your balloon dog routine, and I initiate the bag-and-grab. What if the birthday boy catches me? Tell him you are a government inspector. Kids believe anything if you say it with enough authority. I am essentially going to be a thief-clown. My mother would be so proud. Your mother is currently serving time for a bank heist in Reno. She would be thrilled you are following in her footsteps. That is a fair point. Do we bring the grappling hooks? No. There are no rooftops. Just a jungle gym and a desperate need for a getaway van. I am taking the space shuttle set. I have always wanted to build a rocket. Just make sure there is room in the bag for the rest of it. We are cleaning them out, Molly. Every single box. Fine. Let us go ruin a child’s afternoon.
created_at: "2026-03-03T22:46:10.784Z" (3/3/2026, 10:46:10 PM)
model: "google/gemini-3.1-flash-lite-preview-20260303"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 684
moderation_latency: (null)
generation_time: 3791
tokens_prompt: 77
tokens_completion: 750
native_tokens_prompt: 67
native_tokens_completion: 657
native_tokens_completion_images: 0
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "STOP"
usage: 0.00100225
router: (null)
provider_responses
0
endpoint_id: "e004ce39-eb0c-4474-93d5-55072e725970"
is_byok: false
latency: 684
model_permaslug: "google/gemini-3.1-flash-lite-preview-20260303"
provider_name: "Google AI Studio"
status: 200
api_type: "completions"
id: "gen-1772577970-EjQZykenFiSTL7QIskOS"
upstream_id: (null)
total_cost: 0.00100225
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Google AI Studio"
Evaluation details
Result     Evaluator                      Details                            Meta Data
93.65%     Matches word count             n/a                                neededClean: false, words: 491
0.00%      Dialogue to Total Word Ratio   Ratio: 0.00%, Deviation: 70.00%    neededClean: false, wordsTotal: 506, wordsDialogue: 0
46.8248%   Overall
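The zero dialogue score is consistent with an evaluator that counts only words enclosed in quotation marks: the model response contains no quotation marks, so wordsDialogue comes out as 0 and the deviation from the 70% target is the full 70.00%. The actual NC Bench evaluator is not shown on this page, so the following is a minimal sketch under that assumption. The function name, the regular expression, and the tokenization by whitespace are all hypothetical, and the mapping from deviation to a score is left out because the page does not describe it.

```python
import re


def dialogue_ratio(text: str, target: float = 0.70) -> dict:
    """Compare words inside double quotes against all words (assumed method)."""
    # Capture spans delimited by straight or curly double quotes.
    quoted_spans = re.findall(r'["“]([^"”]*)["”]', text)
    words_total = len(text.split())
    words_dialogue = sum(len(span.split()) for span in quoted_spans)
    ratio = words_dialogue / words_total if words_total else 0.0
    return {
        "wordsTotal": words_total,
        "wordsDialogue": words_dialogue,
        "ratio": ratio,
        "deviation": abs(target - ratio),
    }


# Unquoted speech, as in the response above, yields no dialogue words.
print(dialogue_ratio("Is this really happening, Bill? We are targeting a six-year-old?"))
```

Under this assumption, a response that wrapped each spoken line in quotation marks would register a non-zero ratio even with identical wording.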