NC Bench
Run: 2026-05-20-01KS25R93EQCABY4BG39KW95B7
Run Details
Model: Gemini 3.5 Flash (Reasoning, Minimal)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 5.1s
Cost: $0.005834
Provider: Google AI Studio
Evaluation summary
Score | Evaluator
0.00% | Matches word count
0.00% | Dialogue to Total Word Ratio
0.0000% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The blueprints are upside down, Bill. They aren’t upside down, Molly. This is a highly sophisticated rendering of the target’s play area. That is a restaurant placemat, and those are crayon drawings of a bouncy castle. Did you steal a toddler’s menu for our tactical map? It was unsupervised. Just like the gift table will be at precisely two o'clock. We strike during the ceremonial cutting of the rainbow sprinkle cake. You want us to rob a six-year-old’s birthday party. It’s Tommy’s seventh birthday, actually. And yes, I do. Do you have any idea what the street value of twenty unopened Lego Star Wars sets is right now? It’s a goldmine. We are professional thieves, Bill. We stole a Matisse from the Louvre. We lifted the Crown Jewels of Monaco. Now you want us to fence plastic bricks? And juice boxes. Don’t forget the juice boxes. High-fructose corn syrup is practically currency on the black market. I am not wearing a ski mask to a Chuck E. Cheese. We don’t wear ski masks, Molly. That’s amateur hour. We go in undercover. I’ve secured us the ultimate disguise. Please don't say it. Clown suits. I’m going to kill you. I’m going to use my bare hands and kill you right here in this safehouse. Hear me out! The big shoes are perfect for concealing flash drives, lockpicks, and juice boxes. Plus, if we get cornered, we just throw glitter in their eyes and make a run for it. Kids love glitter. They won't be blinded; they'll just be festive and sticky. Which slows them down! It's tactical glitter. And what about the parents? You think Susan from the PTA is just going to let two sweaty clowns waltz out with a wagon full of Nerf guns? I’ve prepared for Susan. We distract the adults with the ultimate weapon. An EMP? A smoke grenade? A highly dramatic debate about the local school board layout, whispered loudly near the mimosa station. They’ll be arguing about zoning laws for hours. We’ll walk right out the front door. You are a deeply disturbed man. I'm a visionary, Molly. 
Now, did you bring the getaway bags? I bought pillowcases. They have dinosaurs on them. Perfect. Matches the theme. We blend right in. If we get caught, Bill, and my mugshot is released to the international intelligence community showing me wearing a red foam nose, I will tell MI6 everything. They’ll never catch us. We’re ghosts. Colorful, squeaky-shoed ghosts. Just promise me one thing. Anything, partner. If there’s a piñata, I get to hit it. Deal. But we split the candy seventy-thirty. My clown suit has bigger pockets.
created_at: "2026-05-20T07:49:10.042Z" (5/20/2026, 7:49:10 AM)
model: "google/gemini-3.5-flash-20260519"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 822
moderation_latency: (null)
generation_time: 4796
tokens_prompt: 77
tokens_completion: 697
native_tokens_prompt: 67
native_tokens_completion: 637
native_tokens_completion_images: 0
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "STOP"
service_tier: "standard"
usage: 0.0058335
router: (null)
provider_responses:
  0:
    endpoint_id: "bd50061c-e1d8-469e-96b1-d322e77378f8"
    id: "dmcNao_MD96n1MkPuPOO2Qs"
    is_byok: false
    latency: 822
    model_permaslug: "google/gemini-3.5-flash-20260519"
    provider_name: "Google AI Studio"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.15.0; linux; x64))"
http_referer: (null)
request_id: "req-1779263350-RTw0TZbRUQNHPAGUSylx"
session_id: (null)
api_type: "completions"
id: "gen-1779263350-2tgIVRgQ0ERZ2mLpm5PP"
upstream_id: "dmcNao_MD96n1MkPuPOO2Qs"
total_cost: 0.0058335
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Google AI Studio"
response_cache_source_id: (null)
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 441
0.00% | Dialogue to Total Word Ratio | Ratio: 0.00%, Deviation: 70.00% | neededClean: false, wordsTotal: 446, wordsDialogue: 0
0.0000% | Overall
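The ratio and deviation figures above can be reproduced from the reported counts. The sketch below is illustrative only: the function names are hypothetical, and the assumption that dialogue means text inside double quotation marks is ours, not something this page states about the evaluator. Under that assumption, a response written without any quotation marks would yield zero dialogue words, which is consistent with the wordsDialogue value of 0 reported here.

```python
import re

# Assumption: "dialogue" is any text enclosed in straight or curly double quotes.
# The actual NC Bench evaluator may use a different rule and a different tokenizer.
QUOTED = re.compile(r'["\u201c]([^"\u201d]*)["\u201d]')


def ratio_and_deviation(words_dialogue, words_total, target=0.70):
    """Return (ratio, deviation from target) for the given word counts."""
    ratio = words_dialogue / words_total if words_total else 0.0
    return ratio, abs(ratio - target)


def count_words(text):
    """Return (dialogue word count, total word count) using whitespace splitting."""
    words_total = len(text.split())
    words_dialogue = sum(len(span.split()) for span in QUOTED.findall(text))
    return words_dialogue, words_total


# Counts taken from the evaluation details: wordsDialogue = 0, wordsTotal = 446.
print(ratio_and_deviation(0, 446))  # (0.0, 0.7)
```

With the reported counts this gives a ratio of 0.00% and a deviation of 70.00% from the 70% target, matching the Details column. Whitespace splitting will not necessarily reproduce the evaluator's own totals (the two evaluators report 441 and 446 words for the same response), so the counting helper should be read as an approximation.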