NC Bench
Run: 2026-05-20-01KS263N83T6K9863GVSJEGC2F
Run Details
Model: Gemini 3.5 Flash (Reasoning, Minimal)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 5.2s
Cost: $0.006428
Provider: Google AI Studio
Evaluation summary
Matches word count: 97.63%
Dialogue to Total Word Ratio: 0.00%
Overall: 48.8138%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly adjusted the strap of her oversized, neon-pink clown shoe and frowned at the blueprints spread across the hood of their dented sedan. The map of the local bouncy castle emporium was covered in crayon and grease stains.
"We strike at three o'clock," Molly said, tapping a red circle labeled 'The Loot Zone'. "That is peak cake-induced delirium. The targets will be entirely distracted by sugar crash tantrums."
Bill, currently struggling to squeeze his broad shoulders into a sequined waistcoat three sizes too small, paused to catch his breath. "Are you sure about this, Molly? It feels low, even for us. Stealing from seven-year-olds? It’s not exactly Ocean’s Eleven."
"Ocean’s Eleven didn't have to pay rent, Bill," Molly snapped, pulling a rainbow-colored wig over her dark curls. "Besides, little Timmy’s parents are venture capitalists who bought him a miniature electric Porsche and three iPads. He won't notice a few missing boxes."
"I just don't want to get tackled by a clown again," Bill muttered, rubbing his ribs. "That corporate gig in June still gives me nightmares."
"That clown was a trained security guard, and you know it," Molly retorted. She reached into her duffel bag and pulled out a pair of custom-made, silent-running squeaky shoes. "Here. Put these on. If you squeak, the jig is up."
Bill took the shoes and stared at them with deep skepticism. "Silent squeaky shoes. How does that even work structurally?"
"I disabled the internal reeds with hot glue," Molly whispered, leaning in. "They are completely stealth. Now, what is the distraction plan?"
"I’m the balloon animal guy," Bill said, sighing. "I've been practicing. I can do a dog, a slightly longer dog, and something that looks like an aggressive tumor."
"Perfect. You draw the swarm of sticky-fingered toddlers to the patio. Once they are hypnotized by your terrible latex art, I slip into the parlor." Molly tapped the blueprint again. "The presents are always piled on the dining table. I bag the goods, toss them out the parlor window to you, and we vanish into the suburban afternoon."
"What about the mom?" Bill asked, adjusting his giant red nose. "Brenda. I heard she’s a menace on the neighborhood watch app."
"Brenda will be incapacitated by artisanal mimosas," Molly scoffed, putting on pairs of yellow satin gloves. "I’ve studied the target's habits. By three-fifteen, she will be crying in the pantry about her sourdough starter."
"And the dad?"
"He’ll be asleep on the recliner, pretending to watch the game so he doesn't have to talk to other parents." Molly checked her watch. "It’s showtime. Remember, Bill, if any kid asks, you’re Uncle Bumbles."
"Uncle Bumbles," Bill repeated, his voice laced with regret. "I really need to get a new career counselor."
They stepped out of the car, the scent of juice boxes and impending chaos thick in the air. Proceeding toward the backyard gate, they carried their bags of tricks, ready to pull off the ultimate juvenile heist.
created_at: "2026-05-20T07:55:22.762Z" (5/20/2026, 7:55:22 AM)
model: "google/gemini-3.5-flash-20260519"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 898
moderation_latency: (null)
generation_time: 5101
tokens_prompt: 77
tokens_completion: 778
native_tokens_prompt: 67
native_tokens_completion: 703
native_tokens_completion_images: 0
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "STOP"
service_tier: "standard"
usage: 0.0064275
router: (null)
provider_responses:
  0:
    endpoint_id: "bd50061c-e1d8-469e-96b1-d322e77378f8"
    id: "6mgNao-aN8uW-8YPxdGD6Qs"
    is_byok: false
    latency: 898
    model_permaslug: "google/gemini-3.5-flash-20260519"
    provider_name: "Google AI Studio"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.15.0; linux; x64))"
http_referer: (null)
request_id: "req-1779263722-IJJiuibhygDqGr3PjuD2"
session_id: (null)
api_type: "completions"
id: "gen-1779263722-oTs4kHFuKYNBkw4WReXs"
upstream_id: "6mgNao-aN8uW-8YPxdGD6Qs"
total_cost: 0.0064275
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Google AI Studio"
response_cache_source_id: (null)
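As a rough cross-check of the metadata above, the sketch below recomputes the cost and a throughput figure from the native token counts. The per-million-token rates (1.50 input, 9.00 output) are hypothetical: they are not stated anywhere on this page and were chosen only because they reproduce the reported total_cost of 0.0064275. The sketch also assumes that generation_time is in milliseconds, which the page does not confirm. The function names are illustrative.

```python
# Hypothetical sanity check of the run metadata; rates and units are assumptions.

def estimate_cost(prompt_tokens: int, completion_tokens: int,
                  input_rate_per_million: float,
                  output_rate_per_million: float) -> float:
    """Cost in USD given token counts and per-million-token rates."""
    total = (prompt_tokens * input_rate_per_million
             + completion_tokens * output_rate_per_million)
    return total / 1_000_000


def tokens_per_second(completion_tokens: int, generation_time_ms: int) -> float:
    """Completion throughput, assuming generation_time is in milliseconds."""
    if generation_time_ms <= 0:
        raise ValueError("generation_time_ms must be positive")
    return completion_tokens / (generation_time_ms / 1000)


# Values copied from the metadata: native_tokens_prompt, native_tokens_completion.
# The two rates are ASSUMED, not taken from the source.
cost = estimate_cost(67, 703, input_rate_per_million=1.50,
                     output_rate_per_million=9.00)
throughput = tokens_per_second(703, 5101)

print(f"estimated cost: ${cost:.7f}")
print(f"throughput: {throughput:.1f} tokens/s")
```

If the assumed rates are wrong, the match with total_cost is a coincidence, so treat the result as a plausibility check rather than a statement of the provider's pricing.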
Evaluation details
Result: 97.63%
Evaluator: Matches word count
Details: n/a
Meta Data: neededClean: false, words: 493
Result: 0.00%
Evaluator: Dialogue to Total Word Ratio
Details: Ratio: 62.15%, Deviation: 32.15%
Meta Data: neededClean: false, wordsTotal: 502, wordsDialogue: 312
Overall: 48.8138%
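The reported ratio and deviation can be reproduced from the evaluator metadata with simple arithmetic, sketched below. The 30% target comes from the scenario. Treating the overall figure as the unweighted mean of the two evaluator scores is an inference from the numbers, not something the page states: 48.8138% is half of roughly 97.6276%, which rounds to the displayed 97.63%. How the evaluator turns a 32.15% deviation into a 0.00% score is not shown, so the sketch does not model it. The function names are illustrative.

```python
# Minimal sketch of the arithmetic behind the evaluation details.
# Assumption: the overall score is the plain mean of the evaluator scores.

def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Fraction of all words that fall inside dialogue."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return words_dialogue / words_total


def deviation(ratio: float, target: float) -> float:
    """Absolute distance between the measured ratio and the target ratio."""
    return abs(ratio - target)


def overall(scores: list[float]) -> float:
    """Unweighted mean of the evaluator scores (assumed aggregation)."""
    return sum(scores) / len(scores)


ratio = dialogue_ratio(312, 502)       # wordsDialogue, wordsTotal
dev = deviation(ratio, target=0.30)    # scenario asks for 30% dialogue
mean_score = overall([97.6276, 0.0])   # unrounded word-count score is inferred

print(f"ratio: {ratio:.2%}")           # ratio: 62.15%
print(f"deviation: {dev:.2%}")         # deviation: 32.15%
print(f"overall: {mean_score:.4f}%")   # overall: 48.8138%
```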