NC Bench
Run: 2026-05-19-01KS0V68JY6SHHCHZ8QBQ6Q9B6
Run Details
Model: Gemini 3.5 Flash (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 1m 28s
Cost: $0.212672
Provider: Google AI Studio
Evaluation summary
Matches word count: 99.98%
Dialogue to Total Word Ratio: 94.38%
Overall: 97.1813%
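The overall figure is close to the plain average of the two evaluator scores. The page does not state how the overall score is computed, so the sketch below only checks that an unweighted mean (an assumption) is consistent with the displayed numbers. The small gap to 97.1813% would be explained by the per-evaluator scores being rounded for display.

```python
# Evaluator scores as displayed in the summary (already rounded to 2 decimals).
scores = {
    "Matches word count": 99.98,
    "Dialogue to Total Word Ratio": 94.38,
}

# Assumption: the overall score is the unweighted mean of the evaluator scores.
overall = sum(scores.values()) / len(scores)

reported_overall = 97.1813  # value shown on the page
difference = abs(overall - reported_overall)

print(f"mean of displayed scores: {overall:.2f}%")
print(f"gap to reported overall:  {difference:.4f} percentage points")
```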
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly slowly adjusted the plastic binoculars, her knuckles turning white. Outside the fogged windows of their dented transit van parked near the curb, the target loomed: a suburban split level draped in rainbow streamers and guarded by three helium minions. The enemy territory was crawling with hostile pint sized forces, hopped up on ice blue frosting and pure adrenaline. On the dashboard lay a blueprint of the backyard, rendered entirely in glitter gel pen on a greasy napkin, depicting the strategic placement of the bounce house and the main target, the gift table. "We have a very tight three minute window when they cut the cake," she whispered, her voice tight with professional gravity. "Are we absolutely sure about the cake cutting timing, Molly?" Bill asked. Bill was currently trying to squeeze his six foot four frame into a neon yellow safety vest that clearly belonged to a much smaller municipal worker. He was chewing on a plastic whistle, his brows furrowed in deep troubled thought. "Jimmy's mom is a perfectionist. She will slice that red velvet with surgical precision at exactly two forty. That is when the target area is completely unguarded," she hissed, pointing at her watch. "What about the clown? He looks incredibly shifty, like he has definitely seen some dark things," Bill muttered, squinting through a gap in the dirty window curtains. She snatched the whistle from his mouth. The clown was indeed a wildcard, currently twisting a long red balloon into what vaguely resembled a mutated dachshund. Molly consulted her checklist, crossing out "poodle distraction" and writing "balloon pop" in its place. She tapped the diagram of the patio. "The clown is a contractor. If he interferes, deploy the crackers. It will induce a toddler feeding frenzy," she explained coolly, tapping the paper. "And what about the main, gloriously wrapped dinosaur payload?" Bill whispered, his eyes wide. Molly tapped a photograph she had taken from local social media. 
It showed a pile of brightly wrapped boxes, dominated by a massive, suspiciously heavy looking box wrapped in shiny dinosaur paper. Rumor on the playground was that it contained the holy grail: a limited edition motorized Go Kart. "We load the entire haul into heavy duty trash bags and quickly wheel them out the side gate," Molly said, mimicking a sweeping motion. "But what if the legendary birthday boy sees us? He has eyes like a hawk, Molly. He caught me spying," Bill asked, shivering with genuine terror. Molly pulled a small, silver canister from her pocket and set it down between them with a dramatic clink. "Standard issue bubble solution. One blast of this baby and they will be too busy popping soapy spheres to notice us," she declared triumphantly. Bill nodded slowly, flexing his arms, finally managing to zip the safety vest over his broad chest with a sharp, victorious gasp. He adjusted his blue cardboard party hat, securing the tight elastic band under his bristly chin. He took a deep breath, ready for the sugary chaos.
created_at
"2026-05-19T19:25:19.331Z"
5/19/2026, 7:25:19 PM
model
"google/gemini-3.5-flash-20260519"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1601
moderation_latency
(null)
generation_time
87392
tokens_prompt
77
tokens_completion
6707
native_tokens_prompt
67
native_tokens_completion
23619
native_tokens_completion_images
0
native_tokens_reasoning
22996
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"STOP"
service_tier
"standard"
usage
0.2126715
router
(null)
provider_responses
0
endpoint_id
"bd50061c-e1d8-469e-96b1-d322e77378f8"
id
"H7kMaqKRHbbQjMcP-aHVKA"
is_byok
false
latency
1601
model_permaslug
"google/gemini-3.5-flash-20260519"
provider_name
"Google AI Studio"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.15.0; linux; x64))"
http_referer
(null)
request_id
"req-1779218719-mnLbu8S8ym5Gw85BlW3y"
session_id
(null)
api_type
"completions"
id
"gen-1779218719-A4Ng3T8M8ovZslTAIUNi"
upstream_id
"H7kMaqKRHbbQjMcP-aHVKA"
total_cost
0.2126715
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Google AI Studio"
response_cache_source_id
(null)
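The token counts in the metadata above suggest that most of the completion was spent on reasoning rather than on the visible scene. A minimal sketch of that breakdown follows. It assumes that native_tokens_completion includes the reasoning tokens and that generation_time is in milliseconds; neither is stated on the page.

```python
# Values copied from the run metadata above.
native_tokens_completion = 23619
native_tokens_reasoning = 22996
generation_time_ms = 87392  # assumption: the unit is milliseconds

# Assumption: the completion count includes the reasoning tokens,
# so the remainder is the visible response.
visible_tokens = native_tokens_completion - native_tokens_reasoning
reasoning_share = native_tokens_reasoning / native_tokens_completion

# Rough throughput over the whole generation, reasoning included.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(f"visible tokens:    {visible_tokens}")
print(f"reasoning share:   {reasoning_share:.1%}")
print(f"tokens per second: {tokens_per_second:.1f}")
```

Under these assumptions the visible response accounts for 623 tokens, which is plausible for a scene of about 500 words.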
Evaluation details
Matches word count
Result: 99.98%
Details: n/a
Meta Data: neededClean = false, words = 502
Dialogue to Total Word Ratio
Result: 94.38%
Details: Ratio: 31.08%, Deviation: 1.08%
Meta Data: neededClean = false, wordsTotal = 502, wordsDialogue = 156
Overall: 97.1813%
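The reported ratio and deviation follow directly from the word counts in the meta data. The sketch below reproduces that arithmetic and shows one way dialogue words could be counted. The quote-matching approach, the whitespace word split, and the function names are assumptions for illustration, not the benchmark's actual evaluator. How the 1.08% deviation maps to the 94.38% score is not stated on the page, so it is not modeled here.

```python
import re


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (hypothetical word definition)."""
    return len(text.split())


def dialogue_stats(text: str) -> dict:
    """Count words inside straight double quotes versus all words.

    Assumption: dialogue is any text between a pair of '"' characters.
    """
    quoted_spans = re.findall(r'"([^"]*)"', text)
    words_dialogue = sum(count_words(span) for span in quoted_spans)
    return {"wordsTotal": count_words(text), "wordsDialogue": words_dialogue}


def ratio_and_deviation(words_dialogue: int, words_total: int, target: float = 0.30):
    """Return (ratio %, absolute deviation from target in percentage points)."""
    ratio = words_dialogue / words_total
    return round(ratio * 100, 2), round(abs(ratio - target) * 100, 2)


# Arithmetic check against the counts reported in the meta data.
reported = ratio_and_deviation(words_dialogue=156, words_total=502)
print(reported)

# Small illustrative sample (not taken from the model response).
sample = 'Molly tapped the map. "We go at two forty," she said. "Bring the crackers."'
sample_stats = dialogue_stats(sample)
print(sample_stats)
```

With the reported counts, 156 of 502 words gives a ratio of 31.08% and a deviation of 1.08 percentage points from the 30% target, matching the details shown.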