NC Bench
Run: 2026-05-20-01KS25TW17EJRDMFJ9C6NBZHKB
Run Details
Model: Gemini 3.5 Flash (Reasoning, Minimal)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 4.9s
Cost: $0.005789
Provider: Google AI Studio
Evaluation summary
Score      Evaluator
20.19%     Matches word count
0.00%      Dialogue to Total Word Ratio
10.0948%   Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly squinted through her binoculars, her brow furrowing as she adjusted the focus on the primary target. It was a sprawling backyard draped in pastel streamers, dominated by a towering, neon-orange bounce house that vibrated with the screams of sugar-crazed eight-year-olds. We need to talk about the perimeter security, Molly said, lowering the binoculars and tapping her map of the suburban cul-de-sac. It is tighter than we anticipated. Bill, who was currently trying to squeeze his six-foot-two frame into a remarkably tight, glittery spandex jumpsuit, paused to wipe sweat from his brow. What do you mean, tighter? It is a birthday party, Molly. The oldest person security guard is Grandmama, and she has been asleep in a lawn chair since noon. Do not underestimate Grandmama, Molly warned, her voice deadpan. But more importantly, we have a visual on the prize. The gift table is situated directly beneath the patio umbrella. I estimate thirty distinct parcels. High-value targets. Lego sets, video games, maybe even some cold, hard cash in those Hallmark cards. And you are sure this disguise works? Bill asked, gesturing to his outfit. I feel like a disco ball that got shrunk in the wash. You look like a professional children's entertainer, Molly lied, squinting at him. Or at least, a magician who has hit rock bottom. The kids will not question it. Just throw some glitter if anyone gets close. It creates a tactical distraction. I hate glitter, Bill muttered, plucking a stray pink piece of tinsel from his sleeve. It gets everywhere. It is the herpes of the craft world. Why can’t we just use smoke bombs? Because smoke bombs attract the fire department, Bill. Glitter attracts children, whom we can then easily outrun. Now, remember the plan. You enter through the side gate pretending to be Captain Sparkles. You announce a balloon animal emergency in the front yard. That draws the crowd away from the backyard. 
And what are you doing while I am being mauled by toddlers demanding balloon poodles? Bill asked, folding his arms. I will be executing the sweep, Molly said, tapping her tactical duffel bag. I slip in from the neighbor’s hedge, bag the presents, and meet you at the extraction vehicle. We will be gone before the birthday boy even blows out his candles. What if the mother asks for my performer credentials? Bill inquired, looking genuinely panicked. Tell her you left them in your other glitter suit, Molly said, shoving a balloon pump into his hands. No more questions. It is go-time. Bill sighed, adjusting his rainbow cape. If I get tackled by a helicopter parent, I am singing like a canary. Just don't drop the balloons, Molly said, checking her watch. We have exactly four minutes before the cake is served, and after that, the sugar crash hits. We do not want to be there for the meltdown. Go!
created_at
"2026-05-20T07:50:34.797Z"
5/20/2026, 7:50:34 AM
model
"google/gemini-3.5-flash-20260519"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
688
moderation_latency
(null)
generation_time
4762
tokens_prompt
77
tokens_completion
734
native_tokens_prompt
67
native_tokens_completion
632
native_tokens_completion_images
0
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"STOP"
service_tier
"standard"
usage
0.0057885
router
(null)
provider_responses
0
endpoint_id
"bd50061c-e1d8-469e-96b1-d322e77378f8"
id
"ymcNaojPO6mwjMcPoe3xkAo"
is_byok
false
latency
688
model_permaslug
"google/gemini-3.5-flash-20260519"
provider_name
"Google AI Studio"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.15.0; linux; x64))"
http_referer
(null)
request_id
"req-1779263434-gZOedn0DqZFak6f6TVuS"
session_id
(null)
api_type
"completions"
id
"gen-1779263434-Ipjf3M1rzTD7GMUiB6M7"
upstream_id
"ymcNaojPO6mwjMcPoe3xkAo"
total_cost
0.0057885
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Google AI Studio"
response_cache_source_id
(null)
Evaluation details
Result: 20.19%
Evaluator: Matches word count
Details: n/a
Meta Data: neededClean = false, words = 480

Result: 0.00%
Evaluator: Dialogue to Total Word Ratio
Details: Ratio: 0.00%, Deviation: 30.00%
Meta Data: neededClean = false, wordsTotal = 490, wordsDialogue = 0

Overall: 10.0948%
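
The figures above can be reproduced with a small sketch. This is not NC Bench's actual evaluator: it assumes that dialogue is counted as words inside double quotation marks (which would explain wordsDialogue = 0 for a response with no quotation marks), that deviation is the absolute difference from the 30% target, and that the overall score is the plain mean of the two evaluator scores. The function names and the unrounded 20.1896 value (inferred as 2 × 10.0948) are illustrative assumptions.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio).

    Assumption: only words enclosed in straight or curly double
    quotes count as dialogue; words are split on whitespace.
    """
    total = len(text.split())
    # Each match is a tuple with one non-empty group (straight or curly quotes).
    quoted = re.findall(r'"([^"]*)"|“([^”]*)”', text)
    dialogue = sum(len((straight or curly).split()) for straight, curly in quoted)
    ratio = dialogue / total if total else 0.0
    return dialogue, total, ratio


def deviation(ratio: float, target: float = 0.30) -> float:
    """Absolute distance between the measured ratio and the target ratio."""
    return abs(ratio - target)


def overall(scores: list[float]) -> float:
    """Assumed aggregation: unweighted mean of the evaluator scores."""
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Unquoted speech, as in the response above, yields zero dialogue words.
    words_dialogue, words_total, ratio = dialogue_ratio(
        "We need to talk about the perimeter security, Molly said."
    )
    print(words_dialogue, words_total, f"{ratio:.2%}", f"{deviation(ratio):.2%}")
    # Mean of the two evaluator scores (first value inferred, not reported).
    print(round(overall([20.1896, 0.0]), 4))
```

Under these assumptions, a response whose speech carries no quotation marks scores 0.00% on the ratio evaluator however much of it is spoken, which matches the reported Ratio and Deviation values.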