NC Bench
Run: 2026-03-03-01KJV02AS4CQ8GV8SA2J6S1QSY
Run Details
Model: Gemini 3.1 Flash Lite (Preview)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 3.5s
Cost: $0.000906
Provider: Google AI Studio
Evaluation summary
Score | Evaluator | Details Link
0.00% | Matches word count | (details)
0.00% | Dialogue to Total Word Ratio | (details)
0.0004% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly tapped her blueprint of the suburban cul-de-sac with a manicured nail. She adjusted her silk scarf, her eyes flashing with a peculiar kind of mercenary greed. Bill, meanwhile, was carefully calibrating a device that looked suspiciously like a modified toaster. Do you have the inflatable bouncy castle neutralization kit or are you just holding that antique appliance to look mysterious, Bill? It is technically a high-frequency acoustic emitter, Molly. It will induce a localized nap phase in any toddler within a ten-foot radius. Do not be condescending about the toaster. Excellent. Now, the intel says the primary stash of birthday loot is located behind the dragon-themed cake. We go in through the garage, neutralize the clown, and secure the haul before they finish the musical chairs. Why are we targeting a five-year-olds party again? The manifest said there were diamond cufflinks at the gala next door. The gala has security guards with dogs, Bill. This party has a depressed teenager in a foam dragon suit named Kyle. Kyle is not my greatest tactical rival. Plus, have you seen the size of that pile of wrapped gifts? It is statistically likely that there is at least one high-end gaming console tucked under that sparkly paper. I still feel like a monster. What if the kid cries? Then you aim the toaster at the kid, Bill. It is not rocket science, it is nap-science. Do you have the camouflage? I have two party hats that say Happy Birthday and a tray of lukewarm mini-quiches. Perfect. We pose as the entertainment. I handle the clown, you handle the presents, and we exit via the hedge maze. The maze is plastic, Molly. It stands three feet high. Then we sprint gracefully, Bill. Do you always have to be so literal? I just want to ensure we do not end up in municipal jail because you insisted on a dramatic exit through a toddler-sized garden toy. It is about the spectacle. 
If we are going to be petty thieves stealing plastic toys and potentially glitter-covered gift cards, we should at least maintain an air of sophistication. Now, tuck that toaster into your cummerbund and prepare your best frantic parent impersonation. I do not have a frantic parent impersonation. Then look like you have lost a child. It is the universal language of suburban panic. I feel like this is beneath our skill set. We broke into the Louvre last year. And we walked away with a souvenir postcard instead of the Mona Lisa because you got distracted by the gift shop. At least today, the presents are already wrapped. It is a logistical triumph. Fine. Let us liberate those gifts, Molly. For the cause. For the pure, unadulterated petty larceny, Bill. Now, check your tie. We have a birthday to ruin.
created_at: "2026-03-03T23:20:55.856Z" (3/3/2026, 11:20:55 PM)
model: "google/gemini-3.1-flash-lite-preview-20260303"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 515
moderation_latency: (null)
generation_time: 3260
tokens_prompt: 77
tokens_completion: 711
native_tokens_prompt: 67
native_tokens_completion: 593
native_tokens_completion_images: 0
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "STOP"
usage: 0.00090625
router: (null)
provider_responses:
0:
endpoint_id: "e004ce39-eb0c-4474-93d5-55072e725970"
is_byok: false
latency: 515
model_permaslug: "google/gemini-3.1-flash-lite-preview-20260303"
provider_name: "Google AI Studio"
status: 200
api_type: "completions"
id: "gen-1772580055-IOdFJ3PCZ7bRw1zTollx"
upstream_id: (null)
total_cost: 0.00090625
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Google AI Studio"
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 467
0.00% | Dialogue to Total Word Ratio | Ratio: 0.00%, Deviation: 70.00% | neededClean: false, wordsTotal: 479, wordsDialogue: 0
0.0004%
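The evaluator's implementation is not shown on this page, so the following is only a minimal sketch of how a dialogue-to-total word ratio could be computed. It assumes that dialogue means text enclosed in double quotation marks and that words are whitespace-separated tokens; the names `dialogue_ratio`, `deviation_from_target`, and `DIALOGUE_PATTERN` are hypothetical. Under that assumption, a response with no quotation marks, like the one above, would score zero dialogue words, which is consistent with the reported `wordsDialogue` of 0 and deviation of 70.00%.

```python
import re

# Assumption: dialogue is text inside straight ("...") or curly (“...”) double quotes.
DIALOGUE_PATTERN = re.compile(r'"([^"]*)"|“([^”]*)”')


def count_words(text: str) -> int:
    """Count whitespace-separated tokens."""
    return len(text.split())


def dialogue_ratio(text: str) -> float:
    """Return the fraction of words that fall inside quotation marks."""
    total = count_words(text)
    if total == 0:
        return 0.0
    # findall yields (straight, curly) tuples; exactly one side is non-empty per match.
    spoken = sum(
        count_words(straight or curly)
        for straight, curly in DIALOGUE_PATTERN.findall(text)
    )
    return spoken / total


def deviation_from_target(text: str, target: float = 0.70) -> float:
    """Absolute distance between the measured ratio and the requested ratio."""
    return abs(target - dialogue_ratio(text))


# Unquoted speech, as in the response above, is not detected as dialogue.
unquoted = "Do you have the camouflage? I have two party hats."
quoted = '"Do you have the camouflage?" Molly asked.'

print(dialogue_ratio(unquoted))         # 0.0
print(deviation_from_target(unquoted))  # 0.7
print(round(dialogue_ratio(quoted), 3)) # 0.714
```

A quote-based detector like this cannot credit dialogue written without quotation marks, however much of the scene is spoken. The two evaluators above also report different word totals (467 and 479), so the real word-counting rules probably differ from the simple whitespace split used here.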