NC Bench
Run: 2026-05-10-01KR92WP7X33WMCTFE723QXHQW
Run Details
Model: Gemini 3.1 Flash Lite (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 3.5s
Cost: $0.000977
Provider: Google AI Studio
Evaluation summary
| Score | Evaluator |
|---|---|
| 75.16% | Matches word count |
| 0.00% | Dialogue to Total Word Ratio |
| 37.5778% | Overall |
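The overall figure is consistent with an unweighted mean of the two evaluator scores. The sketch below assumes that aggregation rule; the page does not state it, and `overall_score` is a hypothetical name used only for illustration.

```python
def overall_score(scores: list[float]) -> float:
    """Unweighted mean of evaluator scores, in percent (assumed aggregation rule)."""
    if not scores:
        raise ValueError("at least one evaluator score is required")
    return sum(scores) / len(scores)


# Scores as displayed on the page: word count, then dialogue ratio.
print(round(overall_score([75.16, 0.00]), 2))
```

With the rounded inputs this gives 37.58, which agrees with the displayed 37.5778% to two decimal places. The extra digits on the page suggest the word-count score is stored at higher precision than it is shown.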
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly adjusted the neon headband shaped like a unicorn horn and glared at the pristine suburban driveway. Bill stood beside her, his tuxedo jacket straining over his stomach as he clutched a bag of glitter cannons. Are you sure there is not a buffet involved, Molly? Focus, Bill. We are here for the loot, not the lukewarm chicken nuggets. The loot is technically toys for six-year-olds. It feels like we are losing our edge. It is a birthday heist, Bill. The kids are currently distracted by a magician named Professor Pumpernickel. That is our window. Professor Pumpernickel is a professional clown, Molly. Do you think he carries a backup weapon concealed in his giant shoes? If he pulls a gat out of a balloon animal, I will retire immediately. Now, grab the duffel bag. We need to clear the gift table before the cake cutting. What if they catch us? Do we have an exit strategy that does not involve running from a pack of juice-box-fueled toddlers? I have smoke bombs. Glitter smoke bombs? Yes, they are festive. It keeps the aesthetic consistent. I feel like a lunatic. I am literally dressed like a backup dancer for a My Little Pony convention. You look like a threat to the suburban establishment. That is the point. Are the masks ready? I have a monkey mask. You have the bunny ears. Perfect. When we enter, you create a diversion by pretending to be the entertainment. Perform a card trick or something. I cannot do card tricks. Then tell them a story about a dragon. Just keep them away from the table. What do I do when I run out of dragon plot points? Start asking them existential questions about their homework load. That usually quiets them. That is dark, even for you. It is efficient. Do you see the primary target? The pile of wrapped boxes under the gazebo? Yes. It looks like a mountain of Lego and desperation. That is our gold mine, Bill. Grab the bag. One more question before we commit to this absurdity. Make it fast. Do we actually take the cake? 
It smells like funfetti. We are thieves, Bill, not savages. Leave the cake. Take the presents and move fast. On my mark. My bunny ears are pinching my head. Suffer for the art of the steal. Three. Two. One. Go. They sprinted toward the gazebo, Molly checking her tactical watch while Bill tripped over a dropped juice box. The music from the magician’s tent blared, masking their footsteps as they reached the gift table. Molly swept a mountain of boxes into the bag with ruthless precision. Get the one that looks like a bike, Bill. It is a box, Molly. They all look like boxes. The big rectangular one. Move your feet. I hear a pinata being attacked. We are really doing this, aren't we? We are legends, Bill. Now jump the fence.
created_at: "2026-05-10T13:58:07.877Z" (5/10/2026, 1:58:07 PM)
model: "google/gemini-3.1-flash-lite-20260507"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 539
moderation_latency: (null)
generation_time: 3340
tokens_prompt: 77
tokens_completion: 737
native_tokens_prompt: 67
native_tokens_completion: 640
native_tokens_completion_images: 0
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "STOP"
service_tier: "standard"
usage: 0.00097675
router: (null)
provider_responses:
  0:
    endpoint_id: "237af107-d2e9-46c7-b146-c74d75980dee"
    id: "8I4Aavu-A6yez7IPyLygoQg"
    is_byok: false
    latency: 539
    model_permaslug: "google/gemini-3.1-flash-lite-20260507"
    provider_name: "Google AI Studio"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1778421487-uZLTn71fcSVtIK4yL3Rd"
session_id: (null)
api_type: "completions"
id: "gen-1778421487-HIjcJ4CvRfbk8AFwbTfX"
upstream_id: "8I4Aavu-A6yez7IPyLygoQg"
total_cost: 0.00097675
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Google AI Studio"
response_cache_source_id: (null)
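The recorded total_cost can be cross-checked against the native token counts. The sketch below is only an arithmetic check: the per-million-token rates are assumptions chosen because they reproduce the recorded total exactly, and the page itself does not state any rates. `cost_usd` is a hypothetical helper.

```python
def cost_usd(prompt_tokens: int, completion_tokens: int,
             usd_per_m_prompt: float, usd_per_m_completion: float) -> float:
    """Cost in USD from token counts and per-million-token rates."""
    micro_usd = (prompt_tokens * usd_per_m_prompt
                 + completion_tokens * usd_per_m_completion)
    return micro_usd / 1_000_000


# Token counts come from native_tokens_prompt and native_tokens_completion.
# ASSUMED rates (not shown on the page): $0.25 per million prompt tokens
# and $1.50 per million completion tokens.
estimate = cost_usd(67, 640, usd_per_m_prompt=0.25, usd_per_m_completion=1.50)
print(f"{estimate:.8f}")
```

Under those assumed rates the estimate equals the recorded usage and total_cost of 0.00097675, which the Run Details round to $0.000977.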
Evaluation details
| Result | Evaluator | Details | Meta Data |
|---|---|---|---|
| 75.16% | Matches word count | n/a | neededClean: false; words: 487 |
| 0.00% | Dialogue to Total Word Ratio | Ratio: 0.00%, Deviation: 70.00% | neededClean: false; wordsTotal: 491; wordsDialogue: 0 |
| 37.5778% | Overall | | |
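A wordsDialogue value of 0 is what an evaluator would report if it counted only words enclosed in double quotation marks, since the model response contains none. The sketch below illustrates that idea under stated assumptions: the quote-matching rule, the whitespace word split, and the names `dialogue_ratio` and `deviation` are all hypothetical, and the benchmark's actual evaluator is not shown on this page.

```python
import re

# Matches text inside straight ("...") or curly (“...”) double quotes.
# Single quotes are ignored on purpose so apostrophes are not misread.
QUOTED = re.compile(r'"([^"]*)"|“([^”]*)”')


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (assumed word definition)."""
    return len(text.split())


def dialogue_ratio(text: str) -> float:
    """Share of words that sit inside double quotation marks."""
    total = count_words(text)
    if total == 0:
        return 0.0
    spoken = sum(count_words(straight or curly)
                 for straight, curly in QUOTED.findall(text))
    return spoken / total


def deviation(text: str, target: float = 0.70) -> float:
    """Absolute distance between the measured ratio and the target ratio."""
    return abs(target - dialogue_ratio(text))


unquoted = "Focus, Bill. We are here for the loot."
quoted = '"Focus, Bill," said Molly.'
print(dialogue_ratio(unquoted), deviation(unquoted))
print(dialogue_ratio(quoted))
```

For unquoted text the ratio is 0 and the deviation equals the full 70% target, matching the "Ratio: 0.00%, Deviation: 70.00%" detail above. A quote-based counter is a design choice that penalizes responses written as untagged, unquoted speech, even when most of the words are spoken lines.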