NC Bench
Run: 2026-05-10-01KR9178FMNSHHWQJRKEGYKZHG
Run Details
Model:
Gemini 3.1 Flash Lite (Reasoning)
Test:
Dialogue tags
Scenario:
Write 500 words with 30% dialogue
Duration:
3.8s
Cost:
$0.000911
Provider:
Google AI Studio
Evaluation summary
Score      Evaluator
9.61%      Matches word count
0.00%      Dialogue to Total Word Ratio
4.8041%    Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The velvet curtains of the garden shed smelled of damp mulch and forgotten shovels, a fitting stage for the most ridiculous caper of Bill’s career. Outside, the target loomed: a sprawling suburban backyard transformed into a neon nightmare of inflatable bouncy castles and hyperactive children fueled by high-fructose corn syrup. Bill adjusted his balaclava, which was currently pinching the bridge of his nose, and gestured toward the main prize. A massive mountain of glitter-covered boxes sat unguarded on a folding table near the clown. Molly checked her watch, her eyes narrowing as a toddler in a superhero cape sprinted past their hiding spot. She nudged Bill, her elbow digging into his ribs with unnecessary enthusiasm. Are you sure about this, Bill? It feels like we are lowering our standards to the level of a petting zoo. Bill shook his head, shifting his weight to avoid a patch of rogue grass. It is not about standards, Molly. It is about a high-value target hidden in plain sight. Do you have any idea how much those limited edition action figures are worth on the secondary market? Molly peered through the slats, watching a parent struggle to untangle a helium balloon from a tree. I think the parent to child ratio is tipping toward chaos, she noted, pointing at a small girl who was currently eating a handful of lawn ornaments. If we strike now, the distraction is perfect. Bill gripped his bag, his knuckles turning white under his fingerless gloves. That is the genius of it. No one expects a heist during a game of hot potato. We grab, we dash, we disappear into the hedge. Molly adjusted her goggles, which were fogging up in the humid party air. I still think we should have brought smoke bombs, she whispered. Bill rolled his eyes, the motion obscured by his mask. We are stealing toys from a five-year-old’s birthday, not storming a fortress. We do not need tactical explosives. Just follow the perimeter, snatch the haul, and leave the juice boxes alone. 
They started to crawl, their bodies low to the dirt, moving with the rhythmic precision of professional thieves. The lawn was uneven, cluttered with abandoned hats and spilled cake crumbs. Molly navigated around a rogue soccer ball, her eyes fixed on the shimmering mountain of loot. One more yard, she hissed, her pulse hammering against her throat. Bill nodded, reaching for his pry bar, though he knew it was utterly useless against cardboard. Stay low. Once we hit that table, keep your back to the bouncy house. The bouncy house is a lawless zone of kinetic energy. They crept forward, shadows against the encroaching evening, ready to commit the most brazen act of petty larceny since the Great Cupcake Robbery of 2012. The kids were mid-screech, distracted by the clown, providing the perfect cover for the ultimate toy heist.
created_at
"2026-05-10T13:28:57.084Z"
5/10/2026, 1:28:57 PM
model
"google/gemini-3.1-flash-lite-20260507"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
867
moderation_latency
(null)
generation_time
3721
tokens_prompt
77
tokens_completion
713
native_tokens_prompt
67
native_tokens_completion
596
native_tokens_completion_images
0
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"STOP"
service_tier
"standard"
usage
0.00091075
router
(null)
provider_responses
0
endpoint_id
"237af107-d2e9-46c7-b146-c74d75980dee"
id
"GYgAao-fEa_5qtsP7LCmwAE"
is_byok
false
latency
867
model_permaslug
"google/gemini-3.1-flash-lite-20260507"
provider_name
"Google AI Studio"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1778419737-DBpyLKnACCZBjN0xUvpd"
session_id
(null)
api_type
"completions"
id
"gen-1778419737-3kCQXCm45qtudc5q3FmT"
upstream_id
"GYgAao-fEa_5qtsP7LCmwAE"
total_cost
0.00091075
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Google AI Studio"
response_cache_source_id
(null)
Evaluation details
Result
Evaluator
Details
Meta Data
9.61%
Matches word count
n/a
neededClean
false
words
478
0.00%
Dialogue to Total Word Ratio
Ratio: 0.00%, Deviation: 30.00%
neededClean
false
wordsTotal
484
wordsDialogue
0
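The figures above (wordsDialogue 0, wordsTotal 484, Ratio 0.00%, Deviation 30.00%) are consistent with a ratio of dialogue words to total words, compared against the 30% target from the scenario. The sketch below shows one way such a check could work. It assumes dialogue is detected by double quotation marks, which would explain a count of 0 for a response that contains none; the function names and the regular expression are hypothetical and are not taken from NC Bench's evaluator.

```python
import re

# Assumption: dialogue is any text enclosed in double quotation marks,
# straight (") or curly (\u201c ... \u201d). The real evaluator may differ.
DIALOGUE_RE = re.compile('["\u201c]([^"\u201c\u201d]*)["\u201d]')


def count_words(text: str) -> int:
    """Count whitespace-separated tokens."""
    return len(text.split())


def dialogue_ratio(text: str) -> float:
    """Fraction of words that fall inside quotation marks."""
    total = count_words(text)
    if total == 0:
        return 0.0
    spoken = sum(count_words(span) for span in DIALOGUE_RE.findall(text))
    return spoken / total


def deviation(ratio: float, target: float = 0.30) -> float:
    """Absolute distance between the measured ratio and the target."""
    return abs(ratio - target)


# The same sentence pair, without and with quotation marks.
unquoted = "Are you sure about this, Bill? Bill shook his head."
quoted = '"Are you sure about this, Bill?" Bill shook his head.'

for sample in (unquoted, quoted):
    r = dialogue_ratio(sample)
    print(f"ratio={r:.2%} deviation={deviation(r):.2%}")
```

Under this assumption, a response whose spoken lines are not wrapped in quotation marks yields zero dialogue words regardless of how much speech it contains, so the deviation equals the full 30% target.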
4.8041%
Overall
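The overall figure of 4.8041% is close to the unweighted mean of the two evaluator scores shown on this page (9.61% and 0.00%). The sketch below works through that arithmetic. The averaging rule is an assumption inferred from the displayed numbers, not a documented NC Bench formula, and `overall_score` is a hypothetical helper.

```python
from statistics import fmean


def overall_score(scores: dict[str, float]) -> float:
    """Unweighted mean of the evaluator scores (assumed aggregation rule)."""
    return fmean(list(scores.values()))


# Scores as displayed on the page, in percent.
displayed = {
    "Matches word count": 9.61,
    "Dialogue to Total Word Ratio": 0.00,
}

overall = overall_score(displayed)
print(f"{overall:.4f}%")
```

The mean of the displayed values is about 4.805%, slightly above the reported 4.8041%. The small gap would be expected if the word count score is stored with more precision than the two decimals shown.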