NC Bench
Run: 2026-05-10-01KR92DJ939PR3114BFCHYS62N
Run Details
Model: Gemini 3.1 Flash Lite (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 4.2s
Cost: $0.000962
Provider: Google AI Studio
Evaluation summary
Matches word count: 90.48%
Dialogue to Total Word Ratio: 0.00%
Overall: 45.2419%
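The Overall figure is consistent with an unweighted mean of the two evaluator scores: (90.48 + 0.00) / 2 is about 45.24, and the extra digits in 45.2419% suggest the word-count score is rounded for display. The sketch below assumes that aggregation. The benchmark's actual formula is not stated on this page, and `overall_score` is a hypothetical name.

```python
def overall_score(scores):
    """Unweighted mean of evaluator scores, given as percentages.

    Assumption: the benchmark averages its evaluators equally. This is
    inferred from the numbers on this page, not from documentation.
    """
    if not scores:
        raise ValueError("at least one evaluator score is required")
    return sum(scores) / len(scores)


# Displayed scores for this run: word count and dialogue ratio.
overall = overall_score([90.48, 0.00])
print(f"{overall:.2f}%")
```

With the displayed (rounded) inputs the result agrees with the reported Overall to two decimal places.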
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly adjusted her neon pink velvet balaclava, which was technically designed for a toddler but stretched just enough to reveal her narrowed, cynical eyes. Beside her, Bill was vibrating with the kind of nervous energy usually reserved for bomb disposal experts, though he was currently holding a glitter-encrusted clipboard he had swiped from a craft store. They were crouched behind a hydrangea bush in the suburban sprawl of Willow Creek, watching the front yard of the Henderson residence. A bouncy castle shaped like a deranged purple dragon wheezed in the humid afternoon air, its plastic tongue lolling listlessly on the lawn. Do you have the floor plan, Bill? Molly hissed, tapping her combat boots against the manicured grass. We need to be surgical. These toddlers are crafty, and I suspect the clown has had formal tactical training. Bill squinted at the clipboard, which was mostly covered in crayon drawings of dinosaurs. I think the gift table is located near the primary fueling station. That is the cake zone. If we flank the dragon from the east, we can hit the presents during the communal rendition of the birthday song. Molly sighed, checking the heavy-duty duffel bag slung over her shoulder. The sheer audacity of these parents, buying luxury building blocks for a human who still struggles with the concept of solid food. It is a gross display of wealth that we are merely redistributing to the black market. Are you sure about the clown? Bill asked, his voice cracking slightly. He looked like he was vibrating at a frequency that might shatter the nearby birdbath. I saw him juggling kitchen knives earlier. They looked sharp. They are rubber, you idiot, Molly snapped, though she adjusted her belt where she kept her lock-picking kit and a pouch of artisanal glitter. Keep your eyes on the gift haul. I want the boxed sets first. Anything that rattles like high-end electronic components is our priority. Bill nodded, his hands trembling as he checked his watch. 
The song is starting. The parents are distracted by the sugar-high swarm. It is time for the raid. They emerged from the shrubbery like two shadows dressed in stolen party supplies, moving with the practiced grace of people who had spent their entire lives perfecting the art of the petty crime. They skirted the edge of the bouncy castle, weaving through a minefield of discarded juice boxes and sticky, half-eaten cupcakes. The music was a relentless loop of synthesized nursery rhymes that threatened to break their fragile nerves, but they held the line. Molly signaled for a halt behind a giant inflatable pillar. Remember, Bill, she whispered, grabbing his sleeve. If a child makes eye contact, you initiate the frantic dance maneuver. It confuses them long enough for us to extract. No witnesses, no mercy, and for heaven’s sake, stop shaking. I cannot help the adrenaline, Bill breathed, clutching a stray party hat. The stakes are too high. All those presents. Think of the commission. Think of the retirement fund, Molly countered, and they dove into the fray.
created_at: "2026-05-10T13:49:52.3Z" (5/10/2026, 1:49:52 PM)
model: "google/gemini-3.1-flash-lite-20260507"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 679
moderation_latency: (null)
generation_time: 4061
tokens_prompt: 77
tokens_completion: 783
native_tokens_prompt: 67
native_tokens_completion: 630
native_tokens_completion_images: 0
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "STOP"
service_tier: "standard"
usage: 0.00096175
router: (null)
provider_responses:
  0:
    endpoint_id: "237af107-d2e9-46c7-b146-c74d75980dee"
    id: "AI0AapHeJYKdz7IP2IWCwAc"
    is_byok: false
    latency: 679
    model_permaslug: "google/gemini-3.1-flash-lite-20260507"
    provider_name: "Google AI Studio"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1778420992-jzMWF5hsLhmZ77YpsjbY"
session_id: (null)
api_type: "completions"
id: "gen-1778420992-U8e9lonCJOV5Tz1kRCxk"
upstream_id: "AI0AapHeJYKdz7IP2IWCwAc"
total_cost: 0.00096175
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Google AI Studio"
response_cache_source_id: (null)
Evaluation details
Evaluator: Matches word count
Result: 90.48%
Details: n/a
Meta Data: neededClean: false, words: 510

Evaluator: Dialogue to Total Word Ratio
Result: 0.00%
Details: Ratio: 0.00%, Deviation: 30.00%
Meta Data: neededClean: false, wordsTotal: 516, wordsDialogue: 0

Overall: 45.2419%
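The dialogue evaluator reports wordsDialogue of 0 even though the scene contains spoken lines. One plausible explanation is that dialogue is detected by quotation marks, which the model response omits entirely. The sketch below illustrates that idea under stated assumptions: quote-delimited detection, whitespace word splitting, and a 30% target. The function name and regular expression are hypothetical, and this is not the benchmark's actual evaluator, so it will not necessarily reproduce the exact word totals reported above.

```python
import re

# Assumption: dialogue is any text between straight or curly double quotes.
QUOTED = re.compile(r'"([^"]*)"|“([^”]*)”')


def count_words(text):
    """Count whitespace-separated tokens."""
    return len(text.split())


def dialogue_ratio(text, target=0.30):
    """Return word counts, dialogue ratio, and absolute deviation from target."""
    total = count_words(text)
    # findall yields (straight, curly) tuples; exactly one side is non-empty.
    dialogue = sum(count_words(a or b) for a, b in QUOTED.findall(text))
    ratio = dialogue / total if total else 0.0
    return {
        "wordsTotal": total,
        "wordsDialogue": dialogue,
        "ratio": ratio,
        "deviation": abs(ratio - target),
    }


# Unquoted speech, as in this run, registers no dialogue at all.
unquoted = "Do you have the floor plan, Bill? Molly hissed."
print(dialogue_ratio(unquoted))

# The same line with quotation marks is counted as dialogue.
quoted = '"Do you have the floor plan, Bill?" Molly hissed.'
print(dialogue_ratio(quoted))
```

Under these assumptions, a response with no quotation marks always yields a ratio of 0 and a deviation equal to the full 30-point target, which matches the Details reported for this run.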