NC Bench
Run: 2026-05-19-01KS0TXBCGJWN5RRA9TDJSWQ0Q
Run Details
Model: Gemini 3.5 Flash (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 30.0s
Cost: $0.066224
Provider: Google AI Studio
Evaluation summary
Matches word count: 0.08%
Dialogue to Total Word Ratio: 71.13%
Overall: 35.6067%
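The page does not say how the overall score is derived from the two evaluator scores. The short check below assumes it is an unweighted mean of the two. That is an assumption, not something the page documents, and the variable names are illustrative only. The displayed scores are rounded to two decimals, so the comparison allows a small tolerance.

```python
from statistics import mean

# Scores as displayed in the evaluation summary (percent, rounded to 2 decimals).
scores = {
    "Matches word count": 0.08,
    "Dialogue to Total Word Ratio": 71.13,
}
reported_overall = 35.6067  # percent, as displayed on the page

# Assumption: overall = unweighted mean of the evaluator scores.
overall = mean(list(scores.values()))

print(f"mean of displayed scores: {overall:.4f}%")
print(f"reported overall:         {reported_overall:.4f}%")

# Each input is rounded to 0.01, so the mean may differ by up to 0.005.
assert abs(overall - reported_overall) <= 0.005
```

Under that assumption the displayed numbers are consistent with each other. The near-zero word-count score is what pulls the overall result down to roughly half of the dialogue-ratio score.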
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
AI
Molly adjusted her binoculars, her brow furrowed in deep concentration as she stared through the humid windshield. Beside her in the front seat of the rusted getaway hatchback, Bill was loudly chewing on a blue raspberry ring pop, turning his entire tongue a cartoonish neon shade. "We have a three-minute window, Bill. At precisely two o'clock, the clown begins his balloon-animal demonstration, which will draw all the parents to the patio." "Why do they get a professional clown while we only grew up with Uncle Gary in a stained bedsheet?" "Focus on the mission. Once the clown has them distracted, we breach the bounce castle." "The presents aren't in the bounce castle, Molly. They're on the gift table by the guacamole bowl." "I know that. The bounce castle is our staging area. We use the inflatable walls for cover." "What about the watchdog patrols?" "You mean Barnaby? He is a seven-month-old golden retriever who is currently wearing a paper party hat." "He looks shifty. I don't trust him." "We neutralize his canine senses with this." She reached into her leather jacket and held up a premium, grass-fed beef jerky stick with dramatic flair, letting the afternoon sunlight glint off the plastic packaging like a freshly polished blade. Bill's eyes widened in genuine awe. "Genius. What about the high-tech security laser grid?" "Bill, those are streamers. Red crepe paper streamers." "Right, creep under them. Got it. Now, what's our exit strategy?" "We bag the loot in these high-capacity, low-profile tactical sacks." "Those are pillowcases with dinosaurs printed on them." "They blend into the environment. We walk out the side gate, stash the haul in the trunk, and blend into the suburbs." "Wait, can I wear the special disguise?" "The juice-box-distribution-engineer outfit? Yes, but keep your head down." "Do I get to wear the mustaches?" 
"It's a fake mustache at an eight-year-old’s birthday party, Bill, but yes, wear it if it makes you feel like an international criminal mastermind." Bill adjusted his fake glasses, his chest puffing out with pride as he mentally inhabited his new undercover identity. Molly sighed and pinched the bridge of her nose. "It adds necessary gravity to the situation. My character has a backstory." "Just promise you won't eat the velvet cupcakes this time." "That was one time! And they had those little plastic rings on top! I really wanted the green spider ring!" "We are professionals. We are stealing a mountain of Lego sets, brand-new video games, and hopefully envelopes of cash from distant aunts. We do not stop for frosting." "What if the clown spots us and sounds the alarm?" "If Noodles the Clown compromises our perimeter, we deploy the smoke screen immediately." "You actually secured military-grade smoke grenades?" "I brought a block of dry ice and a bucket of warm tap water." "You really are the most brilliant mastermind in the tri-state area." "I know. Now, check your watch." Bill squinted at his plastic digital watch, which was currently flashing a picture of a cartoon dog. He adjusted his grip on a plastic juice box like it was a loaded weapon, his expression turning deadpan. "It's two o'clock." "Move out. Go, go, go."
created_at: "2026-05-19T19:20:27.286Z" (5/19/2026, 7:20:27 PM)
model: "google/gemini-3.5-flash-20260519"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1549
moderation_latency: (null)
generation_time: 29912
tokens_prompt: 77
tokens_completion: 3166
native_tokens_prompt: 67
native_tokens_completion: 7347
native_tokens_completion_images: 0
native_tokens_reasoning: 6597
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "STOP"
service_tier: "standard"
usage: 0.0662235
router: (null)
provider_responses:
  0:
    endpoint_id: "bd50061c-e1d8-469e-96b1-d322e77378f8"
    id: "-7cMatu1GqWujMcPxIKawAs"
    is_byok: false
    latency: 1549
    model_permaslug: "google/gemini-3.5-flash-20260519"
    provider_name: "Google AI Studio"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.15.0; linux; x64))"
http_referer: (null)
request_id: "req-1779218427-YnqPOGDFi6hVDxxkbwIK"
session_id: (null)
api_type: "completions"
id: "gen-1779218427-FGcCitZvpfQ1B90ttCvf"
upstream_id: "-7cMatu1GqWujMcPxIKawAs"
total_cost: 0.0662235
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Google AI Studio"
response_cache_source_id: (null)
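The token fields above can be combined into a few derived figures. The sketch below makes two assumptions the page does not confirm: that native_tokens_reasoning is counted inside native_tokens_completion, and that generation_time is in milliseconds (which would match the listed Duration of 30.0s). The derived variable names are illustrative.

```python
# Values copied from the run metadata.
native_tokens_completion = 7347
native_tokens_reasoning = 6597
generation_time_ms = 29912  # assumed to be milliseconds

# Assumption: reasoning tokens are included in the native completion count,
# so the remainder is the visible (non-reasoning) output.
visible_tokens = native_tokens_completion - native_tokens_reasoning
reasoning_share = native_tokens_reasoning / native_tokens_completion
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(f"visible output tokens: {visible_tokens}")
print(f"reasoning share:       {reasoning_share:.1%}")
print(f"throughput:            {tokens_per_second:.1f} tokens/s")
```

If those assumptions hold, most of the completion budget for this run went to reasoning rather than to the scene itself.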
Evaluation details
Evaluator: Matches word count
Result: 0.08%
Details: n/a
Meta Data: neededClean = false, words = 529
Evaluator: Dialogue to Total Word Ratio
Result: 71.13%
Details: Ratio: 72.61%, Deviation: 2.61%
Meta Data: neededClean = false, wordsTotal = 544, wordsDialogue = 395
Overall: 35.6067%
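The reported ratio and deviation follow directly from the wordsDialogue and wordsTotal metadata and the 70% target in the scenario. The sketch below reproduces that arithmetic and adds a hypothetical dialogue_ratio helper showing one way such counts could be taken from text. The helper is an illustration only: it handles straight double quotes and whitespace-separated words, and the evaluator's actual counting rules are not shown on the page. How the 71.13% score is derived from the deviation is also not documented here, so it is not reproduced.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Hypothetical helper: return (dialogue_words, total_words, ratio).

    Counts whitespace-separated words, treating anything between straight
    double quotes as dialogue. Not the benchmark's actual evaluator.
    """
    total_words = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


# Reproduce the reported figures from the evaluator metadata.
words_dialogue, words_total = 395, 544
target = 0.70  # from the scenario: 70% dialogue

ratio = words_dialogue / words_total
deviation = abs(ratio - target)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 72.61%, Deviation: 2.61%"

# Small usage example for the helper.
sample = 'Molly sighed. "Check your watch," she said. "Move out."'
print(dialogue_ratio(sample))
```

Note that the two evaluators report different totals for the same response (words = 529 versus wordsTotal = 544), which suggests they tokenize the text differently.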