NC Bench
Run: 2026-05-10-01KR91NYGMRDTSJC01T7DVDPBR
Run Details
Model:
Gemini 3.1 Flash Lite (Reasoning)
Test:
Dialogue tags
Scenario:
Write 500 words with 30% dialogue
Duration:
3.9s
Cost:
$0.000894
Provider:
Google AI Studio
Evaluation summary
Matches word count: 14.30%
Dialogue to Total Word Ratio: 0.00%
Overall: 7.1507%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The suburban lawn was a verdant nightmare of inflatable castles and aggressively cheerful bunting. Molly peered through the thick foliage of a rhododendron bush, her binoculars tracking the trajectory of a particularly unstable-looking clown. Beside her, Bill was busy double-checking the tension on his rappelling harness, his breath hitching slightly every time a toddler shrieked in delight. The sheer volume of high-pitched energy radiating from the backyard was enough to make any professional criminal reconsider their life choices, but the pile of presents sitting unguarded near the patio table was a siren song for the greedy. Do you see the one wrapped in the holographic silver paper, Bill whispered, his voice straining against the relentless thumping of a bass-heavy version of baby shark. That has to be a gaming console. We could get three hundred for that at the pawn shop before the parents even finish slicing the cake. Molly adjusted her black tactical turtleneck, which clung to her frame with the stifling embrace of a wetsuit. You are focusing on the shiny bait, Bill. Look at the size of the box beneath the pile. That is clearly a premium LEGO set. The resale value on those is practically gold standard. We grab the big one first, then we vanish into the hedgerow before the magician starts pulling rabbits out of hats. Bill squinted through his own set of lenses, his brow furrowed in deep concentration. I still think the silver one is the target. What if that box is just empty display packaging rigged up to bait greedy thieves like us. It looks too suspiciously placed. Molly sighed, resting her forehead against the damp bark of the bush. Your paranoia is legendary, but in this instance, it is misplaced. It is a four-year-old birthday party, not the vault of the Louvre. The security consists of one exhausted mother nursing a glass of lukewarm Chardonnay and a dozen children hopped up on enough processed sugar to power a small village. 
Now, are we going to do this, or are you going to stand there analyzing the tactical placement of gift wrapping paper until we are forced to retire in shame. I just think a little caution is warranted, Bill muttered, shifting his gear. If we get swarmed by a pack of juice-box-wielding seven-year-olds, our reputation is toast. I am not getting tackled by a kid in a superhero cape. Molly grinned, revealing teeth that were too white for a woman living in the shadows. Then stay fast and stay quiet. When I give the signal, we move on the patio, bypass the inflatable obstacle course, and make off with the stash. If anyone sees us, just roar like a dragon. It usually scares them enough for us to make a clean getaway. Bill nodded, his dread eclipsed by the potential payout. Right. Dragons. Let us go earn our keep.
created_at
"2026-05-10T13:36:58.395Z"
5/10/2026, 1:36:58 PM
model
"google/gemini-3.1-flash-lite-20260507"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
534
moderation_latency
(null)
generation_time
3784
tokens_prompt
77
tokens_completion
710
native_tokens_prompt
67
native_tokens_completion
585
native_tokens_completion_images
0
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"STOP"
service_tier
"standard"
usage
0.00089425
router
(null)
provider_responses
0
endpoint_id
"237af107-d2e9-46c7-b146-c74d75980dee"
id
"-okAavGIJJbWz7IPlLf0-QE"
is_byok
false
latency
534
model_permaslug
"google/gemini-3.1-flash-lite-20260507"
provider_name
"Google AI Studio"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1778420218-BbF9a7wcI7Dp6p5wogE5"
session_id
(null)
api_type
"completions"
id
"gen-1778420218-YTqATYN9Fq0gBhCvV853"
upstream_id
"-okAavGIJJbWz7IPlLf0-QE"
total_cost
0.00089425
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Google AI Studio"
response_cache_source_id
(null)
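The timing and usage fields above can be cross-checked with a little arithmetic. The sketch below assumes that generation_time is in milliseconds (3784 ms is consistent with the 3.9s duration reported in Run Details) and that cost is billed on the native token counts. The two per-million-token rates are hypothetical values chosen only because they happen to reproduce the reported usage figure; they are not taken from this page or from any price list.

```python
# Values copied from the metadata above.
NATIVE_TOKENS_PROMPT = 67
NATIVE_TOKENS_COMPLETION = 585
GENERATION_TIME_MS = 3784  # assumption: the unit is milliseconds

# Hypothetical rates in USD per million tokens (assumed, not from the source).
ASSUMED_INPUT_RATE = 0.25
ASSUMED_OUTPUT_RATE = 1.50


def tokens_per_second(tokens: int, elapsed_ms: int) -> float:
    """Completion throughput in tokens per second."""
    return tokens / (elapsed_ms / 1000)


def estimated_cost(prompt_tokens: int, completion_tokens: int,
                   input_rate: float, output_rate: float) -> float:
    """Cost in USD, given per-million-token rates."""
    return (prompt_tokens * input_rate + completion_tokens * output_rate) / 1_000_000


throughput = tokens_per_second(NATIVE_TOKENS_COMPLETION, GENERATION_TIME_MS)
cost = estimated_cost(NATIVE_TOKENS_PROMPT, NATIVE_TOKENS_COMPLETION,
                      ASSUMED_INPUT_RATE, ASSUMED_OUTPUT_RATE)

print(f"throughput: {throughput:.1f} tokens/s")
print(f"estimated cost: ${cost:.8f}")
```

Under these assumptions the estimate agrees with the usage value of 0.00089425, which suggests, but does not prove, that billing used the native token counts rather than the normalized tokens_prompt and tokens_completion figures.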
Evaluation details
Matches word count
Result: 14.30%
Details: n/a
Meta Data: neededClean: false, words: 479
Dialogue to Total Word Ratio
Result: 0.00%
Details: Ratio: 0.00%, Deviation: 30.00%
Meta Data: neededClean: false, wordsTotal: 489, wordsDialogue: 0
Overall: 7.1507%
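The page does not say how the dialogue evaluator identifies dialogue. One plausible reading, sketched below purely as an assumption, is that it counts words enclosed in double quotation marks. The model response contains no quotation marks at all, which would explain a wordsDialogue of 0 even though the scene contains spoken lines. The function names, the quote-matching rule, and the unweighted mean used for the overall score are all hypothetical, not NC Bench's actual implementation.

```python
import re

# Assumption: dialogue is any text enclosed in straight or curly double quotes.
_QUOTED = re.compile(r'"([^"]*)"|\u201c([^\u201d]*)\u201d')


def count_words(text: str) -> int:
    """Count whitespace-separated tokens."""
    return len(text.split())


def dialogue_ratio(text: str) -> float:
    """Fraction of all words that fall inside quotation marks."""
    total = count_words(text)
    if total == 0:
        return 0.0
    # findall yields one (straight, curly) tuple per match; one side is empty.
    spoken = sum(count_words(a or b) for a, b in _QUOTED.findall(text))
    return spoken / total


def deviation(ratio: float, target: float = 0.30) -> float:
    """Absolute distance from the requested dialogue share."""
    return abs(ratio - target)


def overall(scores: list[float]) -> float:
    """Unweighted mean of evaluator scores (assumed aggregation rule)."""
    return sum(scores) / len(scores)


quoted = 'Bill whispered, "Do you see it?" Molly sighed.'
unquoted = 'Do you see it, Bill whispered. Molly sighed.'

print(dialogue_ratio(quoted))               # 4 of 8 words are quoted
print(deviation(dialogue_ratio(unquoted)))  # no quotes, so the full 0.30 deviation
print(overall([14.30, 0.00]))
```

With the rounded scores shown on this page, the unweighted mean is 7.15, which is consistent with the reported overall of 7.1507% if the aggregation runs on unrounded evaluator scores.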