NC Bench
Run: 2026-05-10-01KR8W2CT81CGMJR2Q3WD8PWBN
Run Details
Model: Gemini 3.1 Flash Lite (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 5.1s
Cost: $0.000893
Provider: Google AI Studio
Evaluation summary
| Score | Evaluator | Details Link |
| --- | --- | --- |
| 0.49% | Matches word count | (details) |
| 0.00% | Dialogue to Total Word Ratio | (details) |
| 0.2460% | Overall | |
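The Overall figure looks consistent with an unweighted mean of the two evaluator scores. The page does not state the aggregation rule, so treat this as an assumption. With the rounded scores shown above, a plain mean gives 0.245%, slightly below the displayed 0.2460%; the gap would be explained if the word-count score is rounded for display. A minimal sketch (the function name `overall_score` is hypothetical):

```python
def overall_score(scores):
    """Unweighted mean of evaluator scores (assumed aggregation rule)."""
    return sum(scores) / len(scores)


# Rounded percentages as displayed in the evaluation summary.
displayed_scores = [0.49, 0.00]

print(f"{overall_score(displayed_scores):.4f}%")
```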
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The blueprints of the suburban cul-de-sac were spread across the sticky surface of the booth at the local diner. Molly traced a finger over the perimeter fence while Bill aggressively dunked a churro into his lukewarm latte. The target was a high-end birthday bash at the Miller residence, a fortress of bouncy castles and temper tantrums. According to the intel, the gift pile was located in the primary living room, specifically positioned beneath a massive, over-decorated evergreen tree. Molly leaned in, her voice dropping to a gravelly whisper. So, what is the play for the bouncy castle situation? Bill checked his watch, sighing heavily. We initiate the distraction at precisely three bells. You trigger the smoke pellets behind the garden shed, and I swoop in during the chaotic exodus. Molly shook her head, adjusting her sunglasses. No, the shed is too far from the sensory epicenter. If we want total cover, we need to pop the inflatable clown at the front gate. It will create a vacuum of tears that draws every parent away from the presents. Bill considered this, crumbling his churro into a pile of sugar dust. That is cold, even for us. But the sheer volume of screaming children would be deafening. We would be ghosts in the madness. Exactly, Molly said, tapping the map. We wear the oversized clown shoes as part of the disguise. Nobody looks down at the feet of a fellow performer, especially when a giant, rubberized nose is flapping in the wind. We blend into the chaos of the entertainment roster. Bill stared at the blueprints, his face tightening as he imagined the logistics of hauling sixty wrapped boxes through a mob of caffeinated toddlers. What about the parents holding the cameras? They are hyper-vigilant. They will be too busy fighting over the last slice of organic, gluten-free unicorn cake, Molly countered with a predatory grin. We move fast, we move low, and we exit through the side gate while the clown suit inflation sequence masks our footprints. 
Do you have the customized sack? The industrial-strength velvet bag is in the trunk, reinforced with double stitching and silent zippers, Bill confirmed. I even added a false bottom to mimic the weight of a haul of stuffed unicorns if we get stopped by security. Molly stood up, tossing a handful of crumpled bills onto the table. Then it is settled. We target the pile, clear the room, and vanish before the piñata ceremony begins. I despise piñatas. Bill stood as well, brushing sugar from his coat. The violence of the candy extraction is beneath us anyway. Let us go steal a mountain of plastic toys from the wealthy offspring of the suburbs. They walked out into the bright afternoon sun, two shadows preparing to strip a backyard of its joy, one birthday gift at a time.
created_at: "2026-05-10T11:58:54.801Z" (5/10/2026, 11:58:54 AM)
model: "google/gemini-3.1-flash-lite-20260507"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 572
moderation_latency: (null)
generation_time: 3839
tokens_prompt: 77
tokens_completion: 713
native_tokens_prompt: 67
native_tokens_completion: 584
native_tokens_completion_images: 0
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "STOP"
service_tier: "standard"
usage: 0.00089275
router: (null)
provider_responses:
  0:
    endpoint_id: "237af107-d2e9-46c7-b146-c74d75980dee"
    id: "AHMAaveRBpKhz7IPsvqikQE"
    is_byok: false
    latency: 572
    model_permaslug: "google/gemini-3.1-flash-lite-20260507"
    provider_name: "Google AI Studio"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1778414335-uf6tZqFVPe18v84Oa96d"
session_id: (null)
api_type: "completions"
id: "gen-1778414335-aEdV2LinHrbXL72iTDWo"
upstream_id: "AHMAaveRBpKhz7IPsvqikQE"
total_cost: 0.00089275
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Google AI Studio"
response_cache_source_id: (null)
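A few derived figures can be read off this metadata. The sketch below assumes that latency and generation_time are in milliseconds, which the page does not state. The per-million-token rates are hypothetical: they are inferred only because they happen to reproduce the recorded total_cost from the native token counts, and they are not quoted anywhere in the run record.

```python
# Values copied from the run metadata.
NATIVE_PROMPT_TOKENS = 67
NATIVE_COMPLETION_TOKENS = 584
LATENCY_MS = 572            # assumed unit: milliseconds
GENERATION_TIME_MS = 3839   # assumed unit: milliseconds

# Hypothetical USD rates per million tokens (inferred, not stated in the source).
INPUT_USD_PER_M = 0.25
OUTPUT_USD_PER_M = 1.50


def tokens_per_second(tokens, millis):
    """Throughput over the generation window."""
    return tokens / (millis / 1000)


def estimated_cost(prompt_tokens, completion_tokens):
    """Cost in USD under the assumed per-million-token rates."""
    return (prompt_tokens * INPUT_USD_PER_M
            + completion_tokens * OUTPUT_USD_PER_M) / 1_000_000


print(round(tokens_per_second(NATIVE_COMPLETION_TOKENS, GENERATION_TIME_MS), 1))
print(f"{estimated_cost(NATIVE_PROMPT_TOKENS, NATIVE_COMPLETION_TOKENS):.8f}")
print(LATENCY_MS + GENERATION_TIME_MS)  # time accounted for, in ms
```

Under these assumptions the throughput is roughly 152 native completion tokens per second, and latency plus generation time accounts for 4411 ms of the 5.1s run duration. The remainder is presumably overhead outside the provider call.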
Evaluation details
| Result | Evaluator | Details | Meta Data |
| --- | --- | --- | --- |
| 0.49% | Matches word count | n/a | neededClean: false; words: 473 |
| 0.00% | Dialogue to Total Word Ratio | Ratio: 0.00%, Deviation: 30.00% | neededClean: false; wordsTotal: 480; wordsDialogue: 0 |
0.2460%
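The dialogue evaluator reports wordsDialogue as 0 even though the response contains spoken lines. The response uses no quotation marks, so one plausible explanation is that the evaluator counts only words inside quotation marks. This is an assumption, since the evaluator's actual method is not shown on this page. A minimal sketch of such a check follows; the names `dialogue_stats` and `QUOTED` are hypothetical, and whitespace splitting is a simplification (the two evaluators above report different totals, 473 and 480, so their tokenization evidently differs).

```python
import re

# Matches text between straight or curly double quotes.
QUOTED = re.compile('["\u201c]([^"\u201d]*)["\u201d]')


def dialogue_stats(text, target=0.30):
    """Count quoted words and compare the ratio against a target share."""
    words_total = len(text.split())
    words_dialogue = sum(len(span.split()) for span in QUOTED.findall(text))
    ratio = words_dialogue / words_total if words_total else 0.0
    return {
        "wordsTotal": words_total,
        "wordsDialogue": words_dialogue,
        "ratio": ratio,
        "deviation": abs(ratio - target),
    }


# Same sentence with and without quotation marks.
unquoted = "Molly leaned in. So, what is the play? Bill sighed."
quoted = 'Molly leaned in. "So, what is the play?" Bill sighed.'

print(dialogue_stats(unquoted))
print(dialogue_stats(quoted))
```

With the unquoted sentence the sketch finds no dialogue words, so the deviation equals the full 30% target, which matches the pattern reported above ("Ratio: 0.00%, Deviation: 30.00%").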