NC Bench
Run: 2026-05-20-01KS25XWRYYNR6100C5M7E6K9S
Run Details
Model: Gemini 3.5 Flash (Reasoning, Minimal)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 5.4s
Cost: $0.006472
Provider: Google AI Studio
Evaluation summary
Score | Evaluator
100.00% | Matches word count
0.00% | Dialogue to Total Word Ratio
49.9995% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
We need the blue wrapping paper, Bill, not the shiny red. Shiny red screams amateur hour. The red has Santa Claus on it, Molly. It's camouflage. It is July. Santa Claus is a security risk. If a seven-year-old sees Santa in July, they ask questions. Questions lead to parents, and parents lead to us getting tackled by a guy named Greg who brews his own IPA. We use the blue. Fine, blue it is. What’s the entry point? The bounce house. You want us to go through the inflatable castle? It’s the perfect blind spot. The parents are all huddled around the patio cooler drinking cheap pinot grigio and regretting their life choices. The kids are inside bouncing themselves into a collective concussion. We slip in through the emergency deflation valve in the back. Is there a deflation valve on a bounce house? There is if you have a pocketknife and a dream. Once we’re inside, we blend. Molly, we’re forty. We’re not blending into a bounce house. We wear the pointy cone hats, Bill. Kids are stupid. They have terrible peripheral vision. We just bounce our way toward the gift table. It’s on the deck. That’s fifteen feet of open lawn. Which is why we use the distraction. What distraction? I hired a guy named Barnaby. He’s going to arrive at precisely two fifteen wearing a poorly fitting Spider-Man suit. He has been instructed to do a flip and fail miserably. Is he going to hurt himself? Ideally, yes. A fractured collarbone will draw one hundred percent of the parental gaze. That’s our window. We grab the loot, bag it, and go. What’s even in these presents? Lego sets? Lego sets are liquid gold, Bill. Do you know the street value of an unopened Harry Potter Hogwarts Castle? We can retire to Costa Rica on three of those. And the birthday boy, little Timmy, his grandparents are loaded. We’re talking iPads, motorized miniature sports cars, maybe a high-end drone. I’m of a mind that we’re crossing a line here, Molly. Stealing from a child? He’s turning six, Bill. 
He doesn’t know the value of a dollar. He’ll be just as happy with a cardboard box and some bubble wrap. We are practically teaching him a lesson about the fleeting nature of material wealth. We’re educators. We’re criminals. It’s a gray area. Did you prep the getaway vehicle? Yes, but the maternal aesthetic is ruining my reputation. It’s a Honda Odyssey, Bill. No cop pulls over a minvan with a ‘My Child is an Honor Student’ bumper sticker. It’s invisibility cloaking. It smells like stale Cheerios inside. It builds character. Now, check your watch. It’s one fifty. Perfect. Barnaby should be stretching his hamstrings. Remember, hands off the juice boxes. We are professionals. Can I at least have a slice of cake if they cut it? Only if it’s ice cream cake, Bill. We aren't savages. Now, put on your party hat and let's go make some magic happen.
created_at: "2026-05-20T07:52:13.858Z" (5/20/2026, 7:52:13 AM)
model: "google/gemini-3.5-flash-20260519"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 768
moderation_latency: (null)
generation_time: 5330
tokens_prompt: 77
tokens_completion: 779
native_tokens_prompt: 67
native_tokens_completion: 708
native_tokens_completion_images: 0
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "STOP"
service_tier: "standard"
usage: 0.0064725
router: (null)
provider_responses
0
endpoint_id: "bd50061c-e1d8-469e-96b1-d322e77378f8"
id: "LWgNavbRPJa__uMP29DIqQE"
is_byok: false
latency: 768
model_permaslug: "google/gemini-3.5-flash-20260519"
provider_name: "Google AI Studio"
status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.15.0; linux; x64))"
http_referer: (null)
request_id: "req-1779263533-Ab0N5pPPbxJQlYNdld2k"
session_id: (null)
api_type: "completions"
id: "gen-1779263533-x3gDUNL16hLQWz6ppCKO"
upstream_id: "LWgNavbRPJa__uMP29DIqQE"
total_cost: 0.0064725
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Google AI Studio"
response_cache_source_id: (null)
Evaluation details
Result | Evaluator | Details | Meta Data
100.00% | Matches word count | n/a | neededClean: false, words: 499
0.00% | Dialogue to Total Word Ratio | Ratio: 1.59%, Deviation: 68.41% | neededClean: false, wordsTotal: 503, wordsDialogue: 8
49.9995% | Overall
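The Dialogue to Total Word Ratio figures appear consistent with simple arithmetic on the reported metadata: 8 dialogue words out of 503 total is about 1.59%, and the gap to the 70% target in the scenario is 68.41%. The following is a minimal Python sketch of that arithmetic, not NC Bench's evaluator code. It assumes the ratio is dialogue words divided by total words and the deviation is the absolute difference from the target; the function name `dialogue_ratio_report` and its parameters are hypothetical. How the 0.00% score is derived from the deviation, and how the overall 49.9995% is aggregated, is not shown on this page and is not reproduced here.

```python
def dialogue_ratio_report(words_dialogue: int, words_total: int, target_pct: float) -> dict:
    """Recompute the ratio and deviation shown in the evaluation details.

    Assumption: ratio = dialogue words / total words, and deviation is the
    absolute gap between that ratio and the target, both in percent.
    """
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    ratio_pct = 100.0 * words_dialogue / words_total
    deviation_pct = abs(target_pct - ratio_pct)
    return {
        "ratio_pct": round(ratio_pct, 2),
        "deviation_pct": round(deviation_pct, 2),
    }


# Values taken from the Meta Data column above; 70.0 is the scenario target.
report = dialogue_ratio_report(words_dialogue=8, words_total=503, target_pct=70.0)
print(report)  # {'ratio_pct': 1.59, 'deviation_pct': 68.41}
```

The response text contains almost no quotation marks, which would explain why so few words were counted as dialogue if the evaluator detects dialogue by quoted spans; that detection rule is also an assumption.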