NC Bench
Run: 2026-05-20-01KS260NTFPW0TTX570NSHDWZM
Run Details
Model: Gemini 3.5 Flash (Reasoning, Minimal)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 5.2s
Cost: $0.006067
Provider: Google AI Studio
Evaluation summary
Score | Evaluator
99.92% | Matches word count
0.00% | Dialogue to Total Word Ratio
49.9595% | Overall
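The overall score shown here is consistent with an unweighted mean of the two evaluator scores. That reading is an inference from the displayed numbers, not documented NC Bench behavior, and the function name `overall_score` below is hypothetical. A minimal sketch under that assumption:

```python
from statistics import mean

# Evaluator scores as displayed on this page (percentages, already rounded).
evaluator_scores = {
    "Matches word count": 99.92,
    "Dialogue to Total Word Ratio": 0.00,
}


def overall_score(scores: dict[str, float]) -> float:
    """Assumed aggregation: plain average of all evaluator scores."""
    return mean(scores.values())


overall = overall_score(evaluator_scores)
print(f"{overall:.2f}%")
```

With the rounded inputs this gives roughly 49.96%. The page reports 49.9595%, which suggests the word-count score is slightly below 99.92% before rounding.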
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
We are not robbing a bank, Molly, we are robbing a seven-year-old. It is a high-yield asset seizure, Bill, targeting a highly concentrated vault of unguarded Lego sets. They are in a backyard, Molly. Under a marquee. Next to a bouncy castle. Which provides perfect acoustic shielding. Do you have any idea how loud a bouncy castle compressor is? You could detonate a stick of dynamite in the sandbox and nobody would hear it over the screaming of twenty children high on blue raspberry syrup. I still think the clown disguise is a mistake. I look like a depressed watermelon. You look like a professional entertainer, which is our ticket past the perimeter fence. Did you pack the distraction? The balloon animals? Yes. I tried stretching them earlier. They smell like chalk and despair. Not those. The glitter bombs. Molly, you cannot detonate a glitter bomb at a child’s birthday. It gets in the eyes. It is a lifetime hazard. I am still finding sparkles in my ears from that jewelry heist in Antwerp. Exactly. It is a highly effective tactical screen. While the parents are busy rinsing micromylar hexagons out of little Timmy’s tear ducts, we make our move on the gift table. His name is Barnaby. Whatever. We grab the pile and run. And how, pray tell, do we transport forty pounds of brightly wrapped cardboard across a suburban lawn without attracting attention? We use the bouncy castle. You want to bounce the loot out? We deflate it, Bill. We pull the plug, wait for the kids to collapse into the sinking vinyl like prehistoric beasts in a tar pit, throw the presents into the folds, and carriage the entire unit to the van. It is genius. It is a tragedy. What if there is a pony? A pony? The invitation said there might be a pony. If there is a pony, I am not leaving without it. We do not have room in the van for a pony, Bill. The van is full of your stupid clown shoes. They are orthotic, Molly. Keeping a crowd of sugar-maddened toddlers at bay requires proper arch support. 
Just focus. The party starts in ten minutes. I will secure the perimeter by the barbecue. If the father tries to offer me a burnt hot dog, I will engage him in a conversation about lawn mower maintenance. That should buy you at least thirty minutes. Lawn mowers? He is a hedge-fund manager, Molly. He does not mow his own lawn. Fine, then talk to him about tax shelters. Just keep him away from the gift table. What if the kids attack me? They have sticky fingers. I hate sticky fingers. Use the balloon animals as a barrier. If they get too close, squeak them aggressively. Children fear high-frequency rubber friction. I do too, frankly. Just put the nose on, Bill. It is showtime. My dignity is dead. Dignity does not buy Lego Star Wars Millennium Falcons, Bill. Now squeak your nose and let us go.
created_at: "2026-05-20T07:53:45.044Z" (5/20/2026, 7:53:45 AM)
model: "google/gemini-3.5-flash-20260519"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 850
moderation_latency: (null)
generation_time: 5138
tokens_prompt: 77
tokens_completion: 752
native_tokens_prompt: 67
native_tokens_completion: 663
native_tokens_completion_images: 0
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "STOP"
service_tier: "standard"
usage: 0.0060675
router: (null)
provider_responses: 0
endpoint_id: "bd50061c-e1d8-469e-96b1-d322e77378f8"
id: "iWgNauSFC6mRjMcPyKLbMA"
is_byok: false
latency: 850
model_permaslug: "google/gemini-3.5-flash-20260519"
provider_name: "Google AI Studio"
status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.15.0; linux; x64))"
http_referer: (null)
request_id: "req-1779263625-B99DIS5ZwKPKzeDxpsrJ"
session_id: (null)
api_type: "completions"
id: "gen-1779263625-f4zfwi7TbRCBwYiRNxIE"
upstream_id: "iWgNauSFC6mRjMcPyKLbMA"
total_cost: 0.0060675
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Google AI Studio"
response_cache_source_id: (null)
Evaluation details
Result | Evaluator | Details | Meta Data
99.92% | Matches word count | n/a | neededClean: false; words: 503
0.00% | Dialogue to Total Word Ratio | Ratio: 0.00%, Deviation: 70.00% | neededClean: false; wordsTotal: 509; wordsDialogue: 0
49.9595% | Overall
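The dialogue evaluator reports wordsDialogue of 0 out of 509 words, a ratio of 0.00% against the 70% target. One plausible explanation is that the response contains no quotation marks, so a counter that only treats quoted spans as dialogue finds none. The sketch below illustrates that kind of check. It is an assumption about how such an evaluator could work, not NC Bench's actual implementation, and `dialogue_ratio` and `TARGET_RATIO` are hypothetical names.

```python
import re

TARGET_RATIO = 0.70  # from the scenario: 70% of words should be dialogue


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for a scene.

    Assumption: only text inside double quotation marks counts as dialogue.
    """
    # Treat curly double quotes the same as straight ones.
    normalized = text.replace("\u201c", '"').replace("\u201d", '"')
    total_words = len(normalized.split())
    # Collect every span enclosed in a pair of double quotes.
    quoted_spans = re.findall(r'"([^"]*)"', normalized)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


# A quoted line is counted as dialogue...
quoted = '"We grab the pile and run," said Molly. Bill sighed.'
# ...while the same words without quotation marks are not.
unquoted = "We grab the pile and run. Bill sighed."

for sample in (quoted, unquoted):
    words_dialogue, words_total, ratio = dialogue_ratio(sample)
    deviation = abs(TARGET_RATIO - ratio)
    print(words_dialogue, words_total, f"{ratio:.2%}", f"{deviation:.2%}")
```

Under this assumption, the unquoted sample yields zero dialogue words and a deviation equal to the full 70% target, which matches the shape of the result reported for this run.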