NC Bench
Run: 2026-05-10-01KR8VG4E52M1QF4M64R73YZT1
Run Details
Model: Gemini 3.1 Flash Lite (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 4.3s
Cost: $0.000936
Provider: Google AI Studio
Evaluation summary
Score | Evaluator
95.99% | Matches word count
0.00% | Dialogue to Total Word Ratio
47.9934% | Overall
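The Overall figure appears consistent with an unweighted mean of the two evaluator scores. The page does not state the aggregation formula, so the following is only a minimal sketch under that assumption; `overall_score` is a hypothetical helper, not part of NC Bench.

```python
def overall_score(scores):
    """Unweighted mean of per-evaluator scores (assumed aggregation)."""
    if not scores:
        raise ValueError("at least one score is required")
    return sum(scores) / len(scores)


# Scores as displayed on this page (already rounded to two decimals).
word_count_score = 95.99
dialogue_ratio_score = 0.00

overall = overall_score([word_count_score, dialogue_ratio_score])
print(f"{overall:.4f}%")
```

Because the displayed 95.99% is itself rounded, the result can differ from the reported 47.9934% in the third decimal place.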
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The sprawling suburban backyard was decorated in a frantic explosion of pastel streamers and deflated helium balloons that drooped like sad, colorful jellyfish. Molly crouched behind a giant fiberglass cupcake, her tactical gear contrasting sharply with the discarded juice boxes littering the grass. She adjusted her goggles and nudged Bill, who was currently wrestling with a bag of confetti that had snagged on his belt loop. The target sat under a white canopy tent: a small mountain of foil-wrapped boxes, bows, and high-tech plastic gadgets that promised a lucrative afternoon of illicit acquisition. Bill wiped sweat from his brow and whispered, Listen, are you certain nobody checks the gift table during the cake cutting ceremony? Molly peered around the cupcake. They are entirely distracted by the clown, who is currently failing to juggle rubber chickens. That is our window of opportunity, Bill. Bill frowned at the scene. I really hate clowns. They smell like wet basement and existential dread. Molly rolled her eyes and checked her watch. Focus. Once the song starts, we move in, swap the stack for the decoy boxes filled with bricks, and retreat through the hedge. But what about the kids? Bill asked, his voice rising in panic. We are talking about thirty sugar-fueled toddlers. They have the sensory input of a pack of wolves and the coordination of drunken seals. They will swarm us. We are professional thieves, Molly hissed, gripping her harness. A group of four-year-olds is not a tactical threat. Keep your eyes on the pile, not the toddlers. Bill adjusted his mask and sighed. I just feel like there might be collateral damage involving sticky fingers and glitter. There will be no glitter, Molly snapped. You prepared the decoys exactly as I instructed? Bill nodded slowly. Yes, but I might have accidentally packed a few leftover party favors in box three. If they open it early, they will be very confused by the abundance of tiny plastic whistles. 
That is an unnecessary variable, she groaned, but we are committed now. Remember, grab the gift cards first. Those are untraceable and light enough to carry without slowing our exit speed. Don't worry about me, Bill muttered while checking his boots. I am a master of speed and silence. Molly watched the clown tumble into a fountain, sending a spray of lukewarm water over the front row of pint-sized spectators. The diversion had begun. She stood up, her shadow falling across the grass like a blade. Move now, she commanded. Bill scuttled out from behind the cupcake, his footsteps muffled by the rhythmic thumping of pop music. They navigated the obstacle course of abandoned plastic slides and tricycles with the grace of seasoned burglars. As they reached the canopy, the mountain of presents glowed under the afternoon sun like a pirate hoard. Molly grabbed the largest box, her heart hammering against her ribs, while Bill skillfully mirrored her actions, swapping their decoys with trembling, eager hands.
created_at: "2026-05-10T11:48:56.396Z" (5/10/2026, 11:48:56 AM)
model: "google/gemini-3.1-flash-lite-20260507"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 629
moderation_latency: (null)
generation_time: 4151
tokens_prompt: 77
tokens_completion: 771
native_tokens_prompt: 67
native_tokens_completion: 613
native_tokens_completion_images: 0
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "STOP"
service_tier: "standard"
usage: 0.00093625
router: (null)
provider_responses:
  0:
    endpoint_id: "237af107-d2e9-46c7-b146-c74d75980dee"
    id: "qHAAaqaWJJ-cz7IPqefF2Ao"
    is_byok: false
    latency: 629
    model_permaslug: "google/gemini-3.1-flash-lite-20260507"
    provider_name: "Google AI Studio"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1778413736-QRv3xjCwdtZ1kRzjXK3B"
session_id: (null)
api_type: "completions"
id: "gen-1778413736-vDBephowvDybkjGFK5YR"
upstream_id: "qHAAaqaWJJ-cz7IPqefF2Ao"
total_cost: 0.00093625
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Google AI Studio"
response_cache_source_id: (null)
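The reported total_cost can be reproduced from the native token counts if one assumes simple per-token pricing. The prices below (0.25 and 1.50 USD per million tokens) are assumptions that do not appear anywhere on this page; they are used only because they happen to yield the reported figure. `estimate_cost` is a hypothetical helper.

```python
# ASSUMED prices in USD per million tokens; not stated on this page.
ASSUMED_INPUT_PRICE_PER_M = 0.25
ASSUMED_OUTPUT_PRICE_PER_M = 1.50


def estimate_cost(prompt_tokens, completion_tokens,
                  input_price_per_m=ASSUMED_INPUT_PRICE_PER_M,
                  output_price_per_m=ASSUMED_OUTPUT_PRICE_PER_M):
    """Estimate request cost in USD from token counts and per-million prices."""
    total = prompt_tokens * input_price_per_m + completion_tokens * output_price_per_m
    return total / 1_000_000


# native_tokens_prompt and native_tokens_completion from the metadata above.
cost = estimate_cost(prompt_tokens=67, completion_tokens=613)
print(f"{cost:.8f}")
```

If the provider's actual rates differ, or if reasoning or cached tokens are billed separately, the estimate would need to be adjusted accordingly.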
Evaluation details
Result | Evaluator | Details | Meta Data
95.99% | Matches word count | n/a | neededClean: false, words: 492
0.00% | Dialogue to Total Word Ratio | Ratio: 0.00%, Deviation: 30.00% | neededClean: false, wordsTotal: 498, wordsDialogue: 0
47.9934% | Overall
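The dialogue evaluator reports wordsDialogue as 0 out of 498 words, which gives a ratio of 0.00% and a deviation of 30.00 points from the 30% target. The page does not show how dialogue words are detected. The sketch below assumes they are counted as words inside straight or curly double quotation marks, which would explain a zero count for a response that contains no quotation marks. `dialogue_stats` is a hypothetical helper, not the benchmark's implementation.

```python
import re

# Assumption: dialogue is any text enclosed in straight or curly double quotes.
_QUOTED = re.compile(r'"([^"]*)"|“([^”]*)”')


def dialogue_stats(text, target_pct=30.0):
    """Return total words, quoted words, dialogue ratio and deviation from target."""
    words_total = len(text.split())
    words_dialogue = sum(
        len((straight or curly).split())
        for straight, curly in _QUOTED.findall(text)
    )
    ratio_pct = 100.0 * words_dialogue / words_total if words_total else 0.0
    return {
        "wordsTotal": words_total,
        "wordsDialogue": words_dialogue,
        "ratio_pct": ratio_pct,
        "deviation_pct": abs(ratio_pct - target_pct),
    }


# Dialogue written without quotation marks is not detected under this assumption.
unquoted = dialogue_stats("Bill whispered, Listen, are you certain nobody checks the gift table?")

# The same style of line with quotation marks is counted.
quoted = dialogue_stats('She said, "Move now," and ran.')
```

How the evaluator maps the deviation onto its final score is not shown on the page, so that step is left out here.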