NC Bench
Run: 2026-05-19-01KS0TZFGY0SXF729T9MX5KEAZ
Run Details
Model: Gemini 3.5 Flash (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 59.4s
Cost: $0.144119
Provider: Google AI Studio
Evaluation summary
Matches word count: 1.04%
Dialogue to Total Word Ratio: 9.87%
Overall: 5.4507%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly taped a crayon drawn blueprint to the dusty van window, her eyes narrowing as she pointed a stubby pencil at a crudely drawn bouncy castle surrounded by stick figures. "Look, Bill, we have to get in and out before they cut the cake. Once sugar rush sets in, it is pure anarchy." "Are we talking the eight year olds or the parents?" Bill asked, adjusting his squeaky red clown nose. "Both," Molly said, tapping the glass. "The perimeter is guarded by a golden retriever and three moms holding expensive chardonnays. Our target is the trestle table by the trampoline." "I still do not get why we cannot just dress as magicians," Bill said, gesturing to the back. "I bought a rabbit and everything." "The rabbit is a liability, Bill," Molly sighed. "We are going with the classic balloon artist route. Nobody questions balloon artists. They are invisible." "But I only know how to make a snake," Bill protested. "Then you make sixty snakes, Bill. Now, the heavy artillery is in the wrapped boxes. Specifically, the giant one with the shiny silver bow." "Do you think it is a trampoline?" Bill asked, his eyes wide with childish hope. "Or a puppy? I hope it is a puppy." "It is a vintage Lego kingdom set, Bill. It is worth four hundred bucks on the black market. We grab that, the Nintendo Switch boxes, and any cash envelopes from grandmas." "Grandma cash is pure profit," Bill agreed, nodding solemnly as he checked his gear. "What about the bouncy castle?" "What about it?" Molly asked. "Can we jump in it? Just for five minutes?" "We are professional thieves, Bill, not tourists," Molly snapped, glaring at him through her oversized pink sunglasses. "You jump in the bouncy castle, you pop the seams, you compromise the extraction point. Besides, we have a tight schedule." "It is just that my inner child needs this, Molly. It has slide attachments." "Your inner child needs to focus on the pony," Molly countered. "There is a pony?" "Yes, and if it sees us, it will neigh. 
You must neutralize the pony." "With what? A tranquilizer dart?" "Give it an apple, Bill. I put three apples in your tactical satchel." "These are organic," Bill noted, peering into the canvas bag. "Only the best for our equestrian nemesis," Molly said, checking her watch. "Now, timing. At exactly two fifteen, the pinata is scheduled to hang. When the kids are blindfolded and wielding wooden bats, that is our diversion." "Because of the chaos?" "Because blindfolded children with bludgeons are a natural hazard. Everyone will be looking at the kid swinging wild. We slip behind the juice box station, bag the loot, and exit through the lilac bushes." "What if someone asks for a balloon sword?" Bill asked. "You make a snake and call it a sleeping sword," Molly said, sliding the van door open. "Are you ready?" "I am ready," Bill said, adjusting the straps on his overalls and pulling his neon green wig over his ears. "For honor, for glory, and for the legos." "Just do not touch the cake, Bill." "I absolutely cannot promise that, Molly."
created_at
"2026-05-19T19:21:37.059Z"
5/19/2026, 7:21:37 PM
model
"google/gemini-3.5-flash-20260519"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1596
moderation_latency
(null)
generation_time
59288
tokens_prompt
77
tokens_completion
4759
native_tokens_prompt
67
native_tokens_completion
16002
native_tokens_completion_images
0
native_tokens_reasoning
15282
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"STOP"
service_tier
"standard"
usage
0.1441185
router
(null)
provider_responses
0
endpoint_id
"bd50061c-e1d8-469e-96b1-d322e77378f8"
id
"QbgMarzKDrWtz7IPxI7LKQ"
is_byok
false
latency
1596
model_permaslug
"google/gemini-3.5-flash-20260519"
provider_name
"Google AI Studio"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.15.0; linux; x64))"
http_referer
(null)
request_id
"req-1779218497-SvgvPiYrqFWeorz6tnaE"
session_id
(null)
api_type
"completions"
id
"gen-1779218497-V87pYEZQ7LOCmXOjILKH"
upstream_id
"QbgMarzKDrWtz7IPxI7LKQ"
total_cost
0.1441185
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Google AI Studio"
response_cache_source_id
(null)
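The token and cost fields above can be cross-checked with a little arithmetic. The Python sketch below uses only values copied from this run's metadata. The two per-million-token rates are not stated anywhere on this page; they are an assumption, chosen because they reproduce the reported usage figure, and should not be read as the provider's published pricing.

```python
# Values copied from the run metadata above.
native_tokens_prompt = 67
native_tokens_completion = 16002
native_tokens_reasoning = 15282
reported_usage = 0.1441185  # USD, from the "usage" / "total_cost" fields

# Completion tokens that were not counted as reasoning tokens.
non_reasoning_tokens = native_tokens_completion - native_tokens_reasoning
reasoning_share = 100.0 * native_tokens_reasoning / native_tokens_completion

# ASSUMPTION: hypothetical rates in USD per million native tokens.
# They are inferred from the numbers above, not taken from the source.
INPUT_RATE = 1.50
OUTPUT_RATE = 9.00

estimated_cost = (
    native_tokens_prompt * INPUT_RATE
    + native_tokens_completion * OUTPUT_RATE
) / 1_000_000

print(f"non-reasoning completion tokens: {non_reasoning_tokens}")
print(f"reasoning share of completion: {reasoning_share:.1f}%")
print(f"estimated cost: ${estimated_cost:.7f} (reported ${reported_usage:.7f})")
```

On these figures, reasoning tokens make up roughly 95.5% of the native completion tokens, which leaves 720 tokens outside the reasoning count.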
Evaluation details
Matches word count
Result: 1.04%
Details: n/a
Meta Data: neededClean false, words 526
Dialogue to Total Word Ratio
Result: 9.87%
Details: Ratio: 76.81%, Deviation: 6.81%
Meta Data: neededClean false, wordsTotal 526, wordsDialogue 404
Overall: 5.4507%
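The ratio and deviation reported for the dialogue evaluator follow directly from the two word counts in its metadata. The Python sketch below reproduces them; the function name `dialogue_ratio` is hypothetical, and the 70% target is taken from the scenario text. How the evaluator converts a deviation into its 9.87% score is not described on this page, so that step is not reproduced here.

```python
def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Return the share of dialogue words as a percentage of all words."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return 100.0 * words_dialogue / words_total


TARGET_PERCENT = 70.0  # from the scenario: "Write 500 words with 70% dialogue"

# Word counts copied from the evaluator metadata above.
ratio = dialogue_ratio(words_dialogue=404, words_total=526)
deviation = abs(ratio - TARGET_PERCENT)

print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
# prints "Ratio: 76.81%, Deviation: 6.81%"
```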