NC Bench
Run: 2026-05-20-01KS25VX82JNN4M4Q1F66YJ51Q
Run Details
Model: Gemini 3.5 Flash (Reasoning, Minimal)
Test: Dialogue tags
Scenario: Write 200 words with 90% dialogue
Duration: 2.7s
Cost: $0.002576
Provider: Google AI Studio
Evaluation summary
Score      Evaluator
97.63%     Matches word count
0.00%      Dialogue to Total Word Ratio
48.8138%   Overall
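The overall figure looks like an unweighted mean of the two evaluator scores. That is an inference from the numbers shown here, not something the page states, and the function name `overall_score` below is hypothetical. A minimal sketch under that assumption:

```python
from statistics import mean


def overall_score(evaluator_scores):
    """Return the unweighted mean of per-evaluator scores, in percent.

    Assumption: the benchmark averages evaluators with equal weight.
    """
    return mean(evaluator_scores)


# Scores as displayed in the summary, already rounded to two decimals.
displayed_scores = [97.63, 0.00]
print(f"{overall_score(displayed_scores):.3f}%")
```

Using the displayed (rounded) scores gives a value within display rounding of the reported 48.8138%, which suggests the page averages the unrounded word-count score.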
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Are we seriously disguising ourselves as barnyard animals, Bill? Yes, Molly, the golden retriever and the pig are the only ways we get past the bouncy castle perimeter. But the tail is itchy, and I smell like damp polyester. Focus. The target is the treasure pile by the balloon arch. Is it mostly Lego? Tell me it is high-end Lego. Mostly Lego, some iPads, and a rumored vintage comic book from the rich uncle. And the getaway vehicle? My Honda Civic is parked behind the sandbox. What about the perimeter defense? Toddlers, Molly. Dozens of highly chaotic, sticky toddlers. Do we have countermeasures? I packed juice boxes and fruit snacks. If they swarm, we dump the contraband and run. Genius. What about the giant inflatable slide? We use it as our primary insertion point. Which one of us has to wear the clown nose? You do, Molly. It distracts from your shifty eyes. My eyes are not shifty, they are calculating. Just put the nose on. Fine, but if I get cake on this fur, you are paying the dry-cleaning bill. Deal. Now, squeak your nose and let us go rob an eight-year-old.
created_at: "2026-05-20T07:51:08.815Z" (5/20/2026, 7:51:08 AM)
model: "google/gemini-3.5-flash-20260519"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 751
moderation_latency: (null)
generation_time: 2290
tokens_prompt: 77
tokens_completion: 305
native_tokens_prompt: 67
native_tokens_completion: 275
native_tokens_completion_images: 0
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "STOP"
service_tier: "standard"
usage: 0.0025755
router: (null)
provider_responses
0
endpoint_id: "bd50061c-e1d8-469e-96b1-d322e77378f8"
id: "7WcNao2UD-OkjrEP88aF6Qc"
is_byok: false
latency: 751
model_permaslug: "google/gemini-3.5-flash-20260519"
provider_name: "Google AI Studio"
status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.15.0; linux; x64))"
http_referer: (null)
request_id: "req-1779263469-uSpHNLwNVFuqVgN1IiJx"
session_id: (null)
api_type: "completions"
id: "gen-1779263469-t5b0BJ3xeIYto5uxKuF4"
upstream_id: "7WcNao2UD-OkjrEP88aF6Qc"
total_cost: 0.0025755
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Google AI Studio"
response_cache_source_id: (null)
Evaluation details
Result     Evaluator                      Details                           Meta Data
97.63%     Matches word count             n/a                               neededClean: false, words: 193
0.00%      Dialogue to Total Word Ratio   Ratio: 0.00%, Deviation: 90.00%   neededClean: false, wordsTotal: 197, wordsDialogue: 0
48.8138%   Overall
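The Dialogue to Total Word Ratio result above (wordsDialogue 0 out of wordsTotal 197) is what a quote-based counter would produce for this response, since the model wrote the exchange without any quotation marks. The evaluator's actual implementation is not shown on this page, so the following is only a sketch under that assumption. The function name `dialogue_ratio` and the choice of double quotes as dialogue markers are hypothetical, and the deviation is taken as the absolute distance from the 90% target, which matches the reported "Ratio: 0.00%, Deviation: 90.00%".

```python
import re

# Assumed dialogue markers: text between straight or curly double quotes.
QUOTED = re.compile(r'"([^"]*)"|“([^”]*)”')


def dialogue_ratio(text, target=0.90):
    """Return (ratio, deviation) for quoted words versus all words.

    ratio     = words inside double quotes / total whitespace-separated words
    deviation = absolute distance of ratio from the target share
    """
    total = len(text.split())
    spoken = sum(len((a or b).split()) for a, b in QUOTED.findall(text))
    ratio = spoken / total if total else 0.0
    return ratio, abs(target - ratio)


# An unquoted exchange, like the model response in this run.
unquoted = "Are we seriously disguising ourselves as barnyard animals, Bill? Yes, Molly."
print(dialogue_ratio(unquoted))

# The same kind of line with quotation marks and a dialogue tag.
quoted = '"Just put the nose on," said Bill.'
print(dialogue_ratio(quoted))
```

Under this reading, the response scores zero because its dialogue is unmarked, not because it lacks dialogue.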