NC Bench
Run: 2026-05-10-01KR8X2RPQAN5H3VGZJ4BNHTGA
Run Details
Model: Gemini 3.1 Flash Lite
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 4.1s
Cost: $0.000969
Provider: Google AI Studio
Evaluation summary
Score       Evaluator
95.99%      Matches word count
0.00%       Dialogue to Total Word Ratio
47.9934%    Overall
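The Overall figure looks like the unweighted mean of the two evaluator scores. That is an inference from the numbers shown here, not something the page states, so the sketch below is only a plausibility check. The function name `overall_score` is hypothetical.

```python
def overall_score(scores):
    """Unweighted mean of evaluator scores in percent (assumed aggregation rule)."""
    return sum(scores) / len(scores)

# Scores as displayed in the summary (already rounded to two decimals).
displayed = [95.99, 0.00]
overall = overall_score(displayed)
print(f"{overall:.4f}%")
```

The result lands within a few thousandths of a percentage point of the reported 47.9934%. The small gap is what one would expect if the word-count score were rounded for display before being listed.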
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The van smelled like wet cardboard and industrial floor wax. Molly tapped a rhythm on the steering wheel, her eyes fixed on the suburban mansion glowing like a beacon of suburban opulence. Are you sure about the clown, Bill? I hate clowns. They have those dead, glassy eyes that see into your soul and judge your criminal history. It is just a birthday party, Molly. The clown is a twenty-year-old college kid named Kyle. He is currently eating a burrito in the garage. He is not a threat. He has a squeaky hammer, Bill. A squeaky hammer is a weapon of psychological warfare. If he squeaks me, I might panic and tackle a bounce house. We are not tackling anything. We are extracting the birthday haul. The kid is turning seven. Do you have any idea how many high-end LEGO sets a spoiled seven-year-old in this zip code receives? It is a goldmine. Is it though? What if they are socks? What if it is a mountain of sensible, cotton socks? I am not risking a felony evasion charge for a drawer full of tube socks. They are not socks. I checked the invite list. The father owns a private equity firm. These kids get tablets, consoles, and drones. We are talking pure profit. And the cake? Please tell me there is cake. I haven't eaten since the botched jewel snatch in Dover. Priorities, Molly. We are there for the loot, not the buttercream. Everything is about the loot. You have no romance in your soul. A heist is a performance piece. It needs a climax. And that climax should involve my face covered in chocolate ganache. Fine. If we clear the presents in under four minutes, you can swipe one cupcake. But no frosting-heavy slices. It slows down your getaway gait. You are a cruel taskmaster. What is the entry point? The side gate. The parents are distracted by the magician. Did you bring the tranquilizer dart for the golden retriever? I thought the dog was a prop. It looked like a stuffed animal in the photos. It is a pedigree pug, Molly. It is viciously indifferent. 
It will bark, and the entire neighborhood will come running. Fine, I have the dart. But if I miss and hit the magician, I am not stopping to revive him. Fair. Are you ready? Wait, one last thing. Do I have to wear the mascot head? Because the visibility in that wolf mask is absolutely non-existent. It is a safari theme, Molly. You are a lion. You have to wear the mane. It hides your face from the nanny cams. I am a lion who steals presents. This is rock bottom, Bill. It is the peak of our career. Now, let us go ruin a toddler’s afternoon. Molly sighed, pulled the scratchy, synthetic mane over her head, and kicked the van door open. The sound of jubilant, screaming children flooded the humid night air.
created_at
"2026-05-10T12:16:35.55Z"
5/10/2026, 12:16:35 PM
model
"google/gemini-3.1-flash-lite-20260507"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
512
moderation_latency
(null)
generation_time
4032
tokens_prompt
77
tokens_completion
709
native_tokens_prompt
67
native_tokens_completion
635
native_tokens_completion_images
0
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"STOP"
service_tier
"standard"
usage
0.00096925
router
(null)
provider_responses
0
endpoint_id
"237af107-d2e9-46c7-b146-c74d75980dee"
id
"I3cAauDZLMzUz7IPv_Dj-AE"
is_byok
false
latency
512
model_permaslug
"google/gemini-3.1-flash-lite-20260507"
provider_name
"Google AI Studio"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1778415395-q0rr5G5IO1WMCn2SEjnY"
session_id
(null)
api_type
"completions"
id
"gen-1778415395-K6gTATYhC7RdPk8aUhCM"
upstream_id
"I3cAauDZLMzUz7IPv_Dj-AE"
total_cost
0.00096925
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Google AI Studio"
response_cache_source_id
(null)
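A couple of derived figures can be read off the metadata above. The sketch below assumes `generation_time` is in milliseconds (the page does not give units, though 4032 is close to the 4.1s duration shown) and uses the native token counts. The variable names are illustrative only.

```python
# Values copied from the run metadata.
native_tokens_prompt = 67
native_tokens_completion = 635
generation_time_ms = 4032  # assumed to be milliseconds
total_cost = 0.00096925    # USD, as reported

# Completion throughput in tokens per second.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

# Blended cost per 1,000 native tokens (prompt and completion combined).
cost_per_1k = total_cost / (native_tokens_prompt + native_tokens_completion) * 1000

print(f"{tokens_per_second:.2f} tokens/s")
print(f"${cost_per_1k:.7f} per 1K tokens")
```

These are rough figures for comparing runs; the blended cost hides any difference between prompt and completion pricing.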
Evaluation details
Result      Evaluator                       Details                            Meta Data
95.99%      Matches word count              n/a                                neededClean: false, words: 492
0.00%       Dialogue to Total Word Ratio    Ratio: 0.00%, Deviation: 70.00%    neededClean: false, wordsTotal: 499, wordsDialogue: 0
47.9934%    Overall
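The model response contains no quotation marks, which is consistent with `wordsDialogue` being 0 if the evaluator counts only quoted text as dialogue. The evaluator's actual implementation is not shown on this page; the following is a minimal sketch under that assumption, with hypothetical function names, whitespace word splitting, and the 70% target taken from the scenario.

```python
import re

# Text inside straight or curly double quotes is treated as dialogue (assumption).
QUOTED = re.compile(r'"([^"]*)"|“([^”]*)”')

def count_words(text):
    """Count whitespace-separated words."""
    return len(text.split())

def dialogue_ratio(text):
    """Fraction of words that fall inside double quotes."""
    total = count_words(text)
    if total == 0:
        return 0.0
    spoken = sum(count_words(a or b) for a, b in QUOTED.findall(text))
    return spoken / total

def deviation(ratio, target=0.70):
    """Absolute distance from the target dialogue share."""
    return abs(target - ratio)

# An unquoted line, as in the response above, yields no dialogue words.
unquoted = "Are you sure about the clown, Bill? I hate clowns."
quoted = '"Are you sure about the clown, Bill?" Molly asked.'

print(dialogue_ratio(unquoted), deviation(dialogue_ratio(unquoted)))
print(dialogue_ratio(quoted))
```

Under this rule a scene written entirely as unpunctuated speech scores a ratio of 0 and a deviation equal to the full target, which matches the "Ratio: 0.00%, Deviation: 70.00%" detail reported for this run.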