NC Bench
Run: 2026-05-20-01KS25YWET0F66505Y2XRF0Z0N
Run Details
Model: Gemini 3.5 Flash (Reasoning, Minimal)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 5.1s
Cost: $0.006139
Provider: Google AI Studio
Evaluation summary
Score     Evaluator                      Details Link
0.00%     Matches word count             (details)
0.00%     Dialogue to Total Word Ratio   (details)
0.0000%   Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
All right, Molly, explain to me again why we are dressed like overgrown ladybugs. Because we are stealth assassins of joy, Bill. Put the antenna headband on. It ITCHES, Molly. And it’s humiliating. I have a reputation. I’m the guy who cracked the vault at the Brussels diamond exchange. Now I’m wearing red felt with black polka dots. Yeah, and you did three years for that Brussels job, didn’t you? This is different. This is low-risk, high-reward. Think of the loot. Kids' toys? What are we going to do with thirty copies of a plastic dump truck and a mountain of sticky juice boxes? You have no vision. Jimmy told me the birthday boy is the son of a tech billionaire. We aren’t stealing cheap plastic, Bill. We are stealing vintage, unopened LEGO sets, prototype gaming consoles, and apparently, a literal solid-gold rocking horse. A gold rocking horse? At a six-year-old's birthday party? Rich people are insane, Bill. Now, did you bring the distraction? I have the bubble machine in the van. It’s industrial strength. It can produce three thousand bubbles per minute. It will blind them. Perfect. The sugar-crazed toddlers will be utterly disoriented by the soapy deluge. Under the cover of the suds, we slip into the solarium where the gift table is. And then we just... bag the goods? Precisely. We load them into the giant burlap sack. The one shaped like a toy sack? The one that says "Santa’s Loot"? Molly, it's June. It’s thematic, Bill! It blends in. Who is going to question a giant ladybug carrying a sack of toys at a themed birthday party? Literally everyone, Molly. The theme of the party is "Under the Sea." We are bugs. We don't belong in the ocean. Ladybugs can swim. Probably. They absolutely cannot. Look, nobody at a six-year-old's party is checking the phylogenetic tree of the costume entertainers. They just want their kids distracted so they can drink chardonnay in peace. Now, did you memorize the escape route? Yes. 
Through the hedge maze, past the bouncy castle, jump the fence, and into the getaway minivan. Why did we get a minivan? Because it fits the demographic! If we park a black sedan outside a children's party, the neighborhood watch gets called. A dented minivan with a 'Baby on Board' sticker is invisible. It’s genius. I hate how much sense that makes. Exactly. Now, adjust your spots. The party starts in five minutes. What if there’s a clown? I’m not fighting a clown, Molly. If there’s a clown, you use the cake-launcher. The what? The cake-launcher. I modified a t-shirt cannon. It shoots high-velocity frosting. You are a deeply troubled woman. I’m a professional, Bill. Now, grab the bubble machine. It’s showtime.
created_at: "2026-05-20T07:52:46.302Z" (5/20/2026, 7:52:46 AM)
model: "google/gemini-3.5-flash-20260519"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 760
moderation_latency: (null)
generation_time: 5036
tokens_prompt: 77
tokens_completion: 725
native_tokens_prompt: 67
native_tokens_completion: 671
native_tokens_completion_images: 0
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "STOP"
service_tier: "standard"
usage: 0.0061395
router: (null)
provider_responses:
  0:
    endpoint_id: "bd50061c-e1d8-469e-96b1-d322e77378f8"
    id: "TmgNarqUG--k1MkPiaab8Qw"
    is_byok: false
    latency: 760
    model_permaslug: "google/gemini-3.5-flash-20260519"
    provider_name: "Google AI Studio"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.15.0; linux; x64))"
http_referer: (null)
request_id: "req-1779263566-mLUyJmkptmtIafASvhMG"
session_id: (null)
api_type: "completions"
id: "gen-1779263566-vsZhyhmAlxSIV1l73NSS"
upstream_id: "TmgNarqUG--k1MkPiaab8Qw"
total_cost: 0.0061395
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Google AI Studio"
response_cache_source_id: (null)
Evaluation details
Result    Evaluator                      Details                           Meta Data
0.00%     Matches word count             n/a                               neededClean: false, words: 457
0.00%     Dialogue to Total Word Ratio   Ratio: 2.78%, Deviation: 67.22%   neededClean: false, wordsTotal: 468, wordsDialogue: 13
0.0000%
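
The reported ratio and deviation can be reproduced from the metadata values with simple arithmetic. The sketch below is an illustration only: the function names are hypothetical, the 70% target is taken from the scenario text, and the deviation is assumed to be the absolute difference in percentage points between the target and the measured ratio. How the evaluator counts dialogue words, or turns the deviation into the 0.00% score, is not shown on this page and is not modeled here.

```python
def dialogue_ratio_percent(words_dialogue: int, words_total: int) -> float:
    """Return dialogue words as a percentage of all words."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return 100.0 * words_dialogue / words_total


def deviation_points(ratio_percent: float, target_percent: float = 70.0) -> float:
    """Return the absolute gap to the target, in percentage points (assumed definition)."""
    return abs(target_percent - ratio_percent)


# Values copied from the evaluation metadata: wordsDialogue = 13, wordsTotal = 468.
ratio = dialogue_ratio_percent(13, 468)
deviation = deviation_points(ratio)

print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
# prints "Ratio: 2.78%, Deviation: 67.22%"
```

Under these assumptions the output matches the Details column for the Dialogue to Total Word Ratio evaluator.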