NC Bench
Run: 2026-05-10-01KR924PPQ396EW63HTQCD349Y
Run Details
Model: Gemini 3.1 Flash Lite (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 4.2s
Cost: $0.000956
Provider: Google AI Studio
Evaluation summary
Score | Evaluator
97.63% | Matches word count
0.00% | Dialogue to Total Word Ratio
48.8138% | Overall
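The Overall figure looks like an unweighted mean of the two evaluator scores. This is an assumption based only on the numbers shown here, not on a documented formula. A minimal sketch, where `overall_score` is a hypothetical helper name:

```python
def overall_score(scores: list[float]) -> float:
    """Unweighted mean of evaluator scores, in percent.

    Assumption: the page does not state how Overall is derived;
    a plain average is simply consistent with the displayed values.
    """
    if not scores:
        raise ValueError("at least one evaluator score is required")
    return sum(scores) / len(scores)


# Displayed (rounded) evaluator scores from this run.
displayed = [97.63, 0.00]
mean_of_displayed = overall_score(displayed)
```

With the rounded scores the mean is 48.815%, slightly above the reported 48.8138%. The gap would be explained if the word-count score is stored with more precision than the two decimals shown (about 97.6276%), but that is also an inference.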
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly adjusted her neon green tutu, the scratchy tulle rasping against her thighs. She checked her watch, tapping the face impatiently as Bill struggled to pull a pair of oversized, glittery bunny ears over his balaclava. Are you absolutely certain the target is vulnerable, Bill? Look, the invite said toddlers only, and the parents are distracted by a clown who is clearly struggling with a drinking problem. We have a clear window during the balloon animal debacle. Explain the extraction again. I am not lugging a mountain of Legos through a side hedge. It is a simple play, Molly. We deploy the decoy cupcakes laced with excessive sugar. The kids hit a hyper-speed wall, the parents panic, and we slip out the back gate with the plush haul under the cover of a bouncy house collapse. And the presents are high-value? I did not shave my head and learn advanced lock-picking to walk away with artisanal wooden blocks and organic sourdough starters. There is a limited edition space rover in the pile. I tracked the gift registry. It is worth four figures on the black market. Bill, it is a children's party. Why are we stealing the gifts of a toddler? This feels significantly less glamorous than the Louvre job. The Louvre did not have a ball pit, Molly. Think of the tactical cover that provides. You can sink into a sea of primary-colored plastic and vanish. I feel like a lunatic. I am currently wearing leggings printed with cartoon unicorns. You look like a professional, assuming the professional is having a midlife crisis at a Chuck E. Cheese. Are you ready? Wait, why is the clown looking at us? He is not looking at us, he is contemplating his life choices. Focus on the objective. The loot is located behind the throne where the birthday boy sits. I see it. That stack of shimmering bags is blinding. Are you sure we should not just leave the small plushies and focus on the electronics? Greed is our brand, Molly. We take everything that is not nailed down, including the goody bags. 
The goody bags? That is beneath us. Those bags contain premium sticker sheets. People pay good money for high-traffic adhesive art. Fine. But if I get hit with a stray piece of confetti, I am ending the operation immediately. Keep your head low. The clown is finishing his act. Remember, when the bouncy house deflates, we move. I am more afraid of the toddlers than the security team at the museum. They have sticky hands and no sense of personal boundaries. That is why we have the distraction cupcakes. They will be too absorbed in a glucose coma to notice us. You are a cold, calculated genius, Bill. I know. Now, go. The cake is being brought out. It is go time. Molly nodded, took a deep breath, and sprinted toward the bounce house, her tutu fluttering like a tragic, neon flag.
created_at
"2026-05-10T13:45:01.919Z"
5/10/2026, 1:45:01 PM
model
"google/gemini-3.1-flash-lite-20260507"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
620
moderation_latency
(null)
generation_time
4102
tokens_prompt
77
tokens_completion
737
native_tokens_prompt
67
native_tokens_completion
626
native_tokens_completion_images
0
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"STOP"
service_tier
"standard"
usage
0.00095575
router
(null)
provider_responses
0
endpoint_id
"237af107-d2e9-46c7-b146-c74d75980dee"
id
"3osAaq3WBoyHz7IPvJSMmAM"
is_byok
false
latency
620
model_permaslug
"google/gemini-3.1-flash-lite-20260507"
provider_name
"Google AI Studio"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1778420701-2s1vLhLd6XpFf2wncljW"
session_id
(null)
api_type
"completions"
id
"gen-1778420701-yLETl4HIlq8YoE7Y69qb"
upstream_id
"3osAaq3WBoyHz7IPvJSMmAM"
total_cost
0.00095575
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Google AI Studio"
response_cache_source_id
(null)
Evaluation details
Result | Evaluator | Details | Meta Data
97.63% | Matches word count | n/a | neededClean: false, words: 493
0.00% | Dialogue to Total Word Ratio | Ratio: 0.00%, Deviation: 70.00% | neededClean: false, wordsTotal: 498, wordsDialogue: 0
48.8138% | Overall
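The model response above contains no quotation marks, which is consistent with wordsDialogue being 0 if the evaluator counts only quoted text as dialogue. The evaluator's actual rules are not shown on this page, so the following is a sketch under that assumption. The function names, the whitespace word split, and the quote handling are all hypothetical:

```python
import re

# Assumption: dialogue is text between straight ("...") or curly (\u201c...\u201d)
# double quotes. The benchmark's real evaluator may use different rules.
QUOTED = re.compile(r'"([^"]*)"|\u201c([^\u201d]*)\u201d')


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for a passage."""
    total = len(text.split())
    dialogue = sum(
        len((m.group(1) or m.group(2) or "").split())
        for m in QUOTED.finditer(text)
    )
    ratio = dialogue / total if total else 0.0
    return dialogue, total, ratio


def deviation(ratio: float, target: float = 0.70) -> float:
    """Absolute distance between the measured ratio and the target ratio."""
    return abs(target - ratio)


# A passage with unquoted speech yields zero dialogue words under this rule.
unquoted = "Are you sure, Bill? It is a simple play, Molly."
d_words, t_words, r = dialogue_ratio(unquoted)
gap = deviation(r)
```

Under this rule a ratio of 0.00 against the 70% target gives a deviation of 0.70, matching the "Ratio: 0.00%, Deviation: 70.00%" detail. The mapping from deviation to the 0.00% score is not shown here and is left out of the sketch.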