NC Bench
Run: 2026-05-10-01KR8ZN4H9RJ9QYKBFKMTZJYBP
Run Details
Model: Gemini 3.1 Flash Lite (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 4.4s
Cost: $0.000995
Provider: Google AI Studio
Evaluation summary
Score | Evaluator
--- | ---
98.71% | Matches word count
0.00% | Dialogue to Total Word Ratio
49.3562% | Overall
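The page does not say how the Overall figure is derived from the evaluator scores. A minimal sketch, assuming it is the unweighted mean of the two evaluator scores (an assumption, not something the page confirms), reproduces the reported value to within display rounding:

```python
from statistics import mean

# Evaluator scores as displayed on this page (percent, rounded to two decimals).
scores = {
    "Matches word count": 98.71,
    "Dialogue to Total Word Ratio": 0.00,
}

# Assumption: Overall is the unweighted mean of the evaluator scores.
overall = mean(list(scores.values()))

REPORTED_OVERALL = 49.3562  # value shown on the page
print(f"computed {overall:.4f}% vs reported {REPORTED_OVERALL:.4f}%")
```

The small gap between the two numbers is what one would expect if the word-count score is stored at higher precision than the two decimals shown.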
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The nursery was sweltering, smelling of damp carpet and desperate parental ambition. Molly huddled behind a potted silk palm, her fingers twitching against the velvet casing of her lockpick set. Bill crouched beside her, his tuxedo jacket riding up to reveal the ridiculous neon-green clown shoes he insisted added character to the heist. Are you positive the security system is just a toddler with a heavy toy truck, Bill? It is the birthday of a five-year-old heir to a frozen yogurt empire, Molly. My intel suggested a perimeter of laser-eyed nannies, but I think they are all distracted by the rogue bouncy castle incident in the foyer. Did you start that incident? I prefer to think of it as a strategic diversion of resources. Do you have the duffel bag? Obviously. I am not lugging a mountain of Legos and hoverboards through the garden in my bare arms. Are we really going through with this? Stealing from a preschooler feels like hitting a new low even for us. We are not stealing from a child, we are reclaiming assets from a pampered household. Besides, do you have any idea how much a limited edition remote-controlled dragon goes for on the black market? It’s just a toy, Bill. It is a dragon, Molly. A dragon with gold-plated fins. Now, check the hallway. Is the coast clear? The coast is clear. But I heard the clown mention something about a scheduled cake-cutting ceremony in three minutes. If we get caught in the sugar rush, we are finished. Sugar rush or not, we have three minutes. Proceed to the gift pile under the arch of balloon pillars. This is ridiculous. Why are there so many gifts? Did they invite the entire city? Wealthy parents suffer from guilt-induced extravagance. It makes our job easier. Grab the box with the ribbon that looks like a miniature disco ball. That is the dragon. I don’t want the dragon. I want the pile of gift cards. They are compact and untraceable. Fine. Prioritize the gift cards, but do not leave the dragon behind. 
My contact has a client who specifically requested that nightmare of plastic. I am grabbing the box, but if I trip over these rubber clown shoes, I am leaving you to the nannies. You aren’t leaving anyone, Molly. My feet are fast. I once outran a German Shepherd in these very soles. That shepherd was old and arguably blind. He was in his prime. Move move move. The balloons are rustling, Bill. Keep your head down. I am down. I am practically one with the shag rug. Reach behind that oversized teddy bear with the bowtie. The loot is right there. Got it. It’s light. Too light. Don’t question it. Just shove it in the bag and let's go. We have ninety seconds before the sugar-crazed mob descends. If the alarm goes off, I’m tackling the clown first. Agreed. He looks like the type to call his mother.
created_at
"2026-05-10T13:01:34.641Z"
5/10/2026, 1:01:34 PM
model
"google/gemini-3.1-flash-lite-20260507"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
567
moderation_latency
(null)
generation_time
4024
tokens_prompt
77
tokens_completion
741
native_tokens_prompt
67
native_tokens_completion
652
native_tokens_completion_images
0
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"STOP"
service_tier
"standard"
usage
0.00099475
router
(null)
provider_responses
0
endpoint_id
"237af107-d2e9-46c7-b146-c74d75980dee"
id
"roEAaqWVMtifz7IP3uyHiAM"
is_byok
false
latency
567
model_permaslug
"google/gemini-3.1-flash-lite-20260507"
provider_name
"Google AI Studio"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1778418094-v9SuulcnIXQyl9Xi5TPP"
session_id
(null)
api_type
"completions"
id
"gen-1778418094-2w8MVC49nwrwhGHxQurN"
upstream_id
"roEAaqWVMtifz7IP3uyHiAM"
total_cost
0.00099475
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Google AI Studio"
response_cache_source_id
(null)
Evaluation details
Result | Evaluator | Details | Meta Data
--- | --- | --- | ---
98.71% | Matches word count | n/a | neededClean: false, words: 494
0.00% | Dialogue to Total Word Ratio | Ratio: 0.00%, Deviation: 70.00% | neededClean: false, wordsTotal: 503, wordsDialogue: 0
49.3562% | Overall | |
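The dialogue evaluator reports wordsDialogue as 0 even though most of the response is spoken lines. One plausible reading is that the evaluator only counts words inside quotation marks, and the response contains none. The sketch below illustrates that reading; the function `dialogue_ratio` and its quote-matching rule are hypothetical and are not taken from NC Bench, and the deviation is assumed to be the absolute difference from the 70% target.

```python
import re

TARGET_RATIO = 0.70  # from the scenario: 70% dialogue


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio).

    Hypothetical rule: only words inside straight or curly double
    quotes count as dialogue.
    """
    total = len(text.split())
    quoted = re.findall(r'"([^"]*)"|“([^”]*)”', text)
    dialogue = sum(len((a or b).split()) for a, b in quoted)
    ratio = dialogue / total if total else 0.0
    return dialogue, total, ratio


# Quoted speech is detected as dialogue.
quoted_sample = 'Molly whispered, "Is the coast clear?" Bill nodded.'
q_dialogue, q_total, q_ratio = dialogue_ratio(quoted_sample)

# Unquoted speech, as in the model response above, is not.
unquoted_sample = "Is the coast clear? The coast is clear."
u_dialogue, u_total, u_ratio = dialogue_ratio(unquoted_sample)
u_deviation = abs(TARGET_RATIO - u_ratio)

print(q_dialogue, q_total, q_ratio)
print(u_dialogue, u_total, u_ratio, u_deviation)
```

Under this rule, a response written entirely without quotation marks yields a ratio of 0 and a deviation of 0.70 from the target, which matches the "Ratio: 0.00%, Deviation: 70.00%" reported here.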