NC Bench
Run: 2026-05-20-01KS262E0W73Q83RAJSRGWNJ37
Run Details
Model: Gemini 3.5 Flash (Reasoning, Minimal)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 4.9s
Cost: $0.006184
Provider: Google AI Studio
Evaluation summary
Score | Evaluator
0.08% | Matches word count
0.00% | Dialogue to Total Word Ratio
0.0424% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Hand me the blueprints, Bill, and don't tell me you spilled juice on them again. They aren't blueprints, Molly, it's a treasure map drawn in glitter glue by the birthday boy's little sister. Does it show the perimeter defenses? If by defenses you mean a bouncy castle shaped like a dragon and a heavily armed clown named Barnaby, then yes. Don't underestimate Barnaby, Bill. Word on the street is he can fold a balloon animal into a deadly garrote in three seconds flat. He looks more like a gin-and-tonic guy to me, but fine. What's the target location for the loot? The gift table is situated in the north quadrant of the patio, right next to the artisanal juice bar. Perfect. We slip in during the cake cutting. Nobody notices anything when sugar is on the line. Except the mother, Karen. She’s vigilant, Bill. She’s been coordinating this Pinterest-perfect nightmare for six months. She has eyes in the back of her head and a clipboard that could double as a blunt weapon. We’ll use a distraction. I’ll trigger the sprinkler system. No, that ruins the merchandise. We want the gifts dry. Have you seen the wrapping paper? It’s imported foil. We need to maintain the resale value of those unopened Lego sets. Lego? I thought we were after the heavy machinery, like the motorized miniature Jeeps. Both, obviously. But the Legos are pure street currency. We pull this off, and we can retire to a place where juice boxes don't exist. So, what's my disguise? Please don't say the mascot suit. You’re the mascot, Bill. It's security-grade foam. Molly, it's eighty degrees outside! I’ll dehydrate in a giant polyester squirrel costume! You want the haul or not? The squirrel suit gives you plausible access to the patio. Children will swarm you. It’s the perfect cover while I bag the goods in the oversized diaper bin. A diaper bin? That’s disgusting. It’s genius. No one looks inside a diaper bin. It’s a psychological cloaking device. And how do we get the bin out of there? 
The getaway vehicle is parked by the sandbox. The minivan? The minivan. I’ve installed a fake magnetic decal on the side. We are officially 'Pete's Party Planners.' You think of everything. That's why I'm the mastermind and you're the squirrel. Now, rehearse your lines. What do you do if a toddler grabs your tail? Maintain character, squeak loudly, and gently redirect their attention to the bubble machine. Good. And if Karen questions your credentials? I hand her a business card and complain about the union rates for rodent actors. Excellent. We strike at exactly two o'clock, right after the pinata gets smashed. The children will be distracted by the rain of cheap candy. Like sharks in a feeding frenzy. Exactly. Put the head on, Bill. It’s showtime.
created_at
"2026-05-20T07:54:42.594Z"
5/20/2026, 7:54:42 AM
model
"google/gemini-3.5-flash-20260519"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
721
moderation_latency
(null)
generation_time
4772
tokens_prompt
77
tokens_completion
748
native_tokens_prompt
67
native_tokens_completion
676
native_tokens_completion_images
0
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"STOP"
service_tier
"standard"
usage
0.0061845
router
(null)
provider_responses
0
endpoint_id
"bd50061c-e1d8-469e-96b1-d322e77378f8"
id
"wmgNatCeLN-d_uMP_uqesAY"
is_byok
false
latency
721
model_permaslug
"google/gemini-3.5-flash-20260519"
provider_name
"Google AI Studio"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.15.0; linux; x64))"
http_referer
(null)
request_id
"req-1779263682-pSn7WdKDugf9MZOpKZEM"
session_id
(null)
api_type
"completions"
id
"gen-1779263682-hhbsy7YAZN0RQTq55v0O"
upstream_id
"wmgNatCeLN-d_uMP_uqesAY"
total_cost
0.0061845
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Google AI Studio"
response_cache_source_id
(null)
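The timing and token fields above can be combined into a rough throughput figure. The sketch below assumes that `generation_time` is reported in milliseconds (the page does not state units) and uses `native_tokens_completion` as the token count; the function name `tokens_per_second` is hypothetical.

```python
def tokens_per_second(completion_tokens: int, generation_time_ms: float) -> float:
    """Return completion throughput in tokens per second.

    Assumes generation_time_ms is in milliseconds, which the
    page does not confirm.
    """
    if generation_time_ms <= 0:
        raise ValueError("generation_time_ms must be positive")
    return completion_tokens / (generation_time_ms / 1000.0)


# Values copied from the metadata above:
# native_tokens_completion = 676, generation_time = 4772
rate = tokens_per_second(676, 4772)
print(f"{rate:.1f} tokens/s")
```

Under that unit assumption, this run works out to a little over 140 completion tokens per second.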
Evaluation details
Result | Evaluator | Details | Meta Data
0.08% | Matches word count | n/a | neededClean: false; words: 471
0.00% | Dialogue to Total Word Ratio | Ratio: 0.63%, Deviation: 69.37% | neededClean: false; wordsTotal: 475; wordsDialogue: 3
0.0424% | Overall | |
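The Ratio and Deviation figures reported for the Dialogue to Total Word Ratio evaluator appear to follow from its Meta Data counts and the 70% target in the scenario. A minimal sketch of that arithmetic, assuming the ratio is `wordsDialogue / wordsTotal` and the deviation is the absolute difference from the target; the function name `dialogue_ratio_report` is hypothetical, and the evaluator's actual implementation is not shown on this page.

```python
def dialogue_ratio_report(words_dialogue: int, words_total: int,
                          target_pct: float = 70.0) -> tuple[float, float]:
    """Return (ratio %, deviation from target %), each rounded to 2 places.

    Assumed formula, not confirmed by the page:
      ratio     = 100 * words_dialogue / words_total
      deviation = |target - ratio|
    """
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    ratio_pct = 100.0 * words_dialogue / words_total
    deviation_pct = abs(target_pct - ratio_pct)
    return round(ratio_pct, 2), round(deviation_pct, 2)


# Counts copied from the Meta Data: wordsDialogue = 3, wordsTotal = 475
ratio, deviation = dialogue_ratio_report(3, 475)
print(f"Ratio: {ratio}%, Deviation: {deviation}%")
```

With these inputs the sketch reproduces the reported 0.63% ratio and 69.37% deviation. The response contains no double-quoted speech, which may be why only 3 words were counted as dialogue; the page does not describe how dialogue words are detected.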