NC Bench
Run: 2026-02-25-01KJ94KM6ABC1CCS5TDXHC8A37
Run Details
Model: Gemini 2.5 Flash Lite (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 17.6s
Cost: $0.001811
Provider: Google AI Studio
Evaluation summary
Score      Evaluator                      Details Link
95.99%     Matches word count             (details)
0.00%      Dialogue to Total Word Ratio   (details)
47.9936%   Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The stale interior of a beat-up van smelled faintly of old chips and desperation. Molly, hunched over a highly detailed, coffee-stained floor plan of a suburban semi, jabbed a finger at a crudely drawn star. Bill, windows fogged from his breathing, pressed his nose against the glass, a wide, almost childlike grin plastered across his face. Molly "Alright, Bill. Phase one: Reconnaissance confirmed. Target location designation: 'The Land of Unsupervised Sugar'." Bill "Look at all those colours, Molly! And the balloons! Do you think we could pocket a few confetti cannons? Just for a little celebratory blast on the way out?" Molly "Negative, Bill. We are not engaging with peripheral party detritus. Our objective is singular, high-value asset acquisition. We are here to liberate the designated 'Gifts of Joy', currently stockpiled, presumably, in the primary revenue generation chamber." Bill "The living room? So we're talking bikes? Video games? Those weirdly expensive robot dogs?" Molly "If they are present and accounted for, yes. Assume a broad spectrum of desirability. And access has been assessed. The primary guardians, known colloquially as 'Parents', will be occupied during the critical phases. Observe their current deployment: one is wrestling with a recalcitrant piñata, the other appears to be negotiating terms over a spilled juice box." Bill "They look… stressed. Maybe we should leave them some biscuits?" Molly "No. This is not a humanitarian mission. This is a precision operation. Your role, Bill, is critical. You are the Diversionary Unit. Code name: 'Uncle Barry'." Bill "Uncle Barry! I like that. Does Uncle Barry sing show tunes badly?" Molly "If it facilitates maximum distraction, yes. Your objective is to engage the guardian units with inanely lengthy anecdotes and questions about their distant relatives. Inquire about the specific gravity of the trifle. Ask if Barnaby remembers that time with the emu. Anything to create a sensory overload." Bill "Got it. 
I’ll really lean into the 'eccentric relative' vibe. I can even pretend to have misplaced my car keys in the shrubbery. That should buy us *at least* seven minutes. Maybe eight if I mention the garden gnome's questionable political leanings." Molly "Excellent. While you are weaving your tapestry of temporal displacement, I will effect entry via the rear patio doors. They appear to be… creatively secured. Child-proof, obviously, but not designed to withstand expert application of butter knife technique." Bill "Ooh, little locks! Like a puzzle. So then we grab the loot and what? Slide down the banister wearing silly hats?" Molly "Negative, Bill. We use the provided duffel bags, discreetly placed behind the compost bins. Extraction point is the alleyway behind the chip shop. And remember the contingency phrase if we encounter unexpected resistance." Bill "Right! 'The flamingo has flown'? 'The custard is lumpy'?" Molly "Waffles, Bill. The code is 'Waffles'. Loud and clear. No improvisations. Now, put on your best 'slightly over-enthusiastic guest' face. We blend." Bill "Blend! Like the raspberries in the… trifle! Speaking of which, do you think they'll have jelly? And are we absolutely sure these aren't presents from the Queen?"
created_at: "2026-02-25T00:53:57.071Z" (2/25/2026, 12:53:57 AM)
model: "google/gemini-2.5-flash-lite"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1181
moderation_latency: (null)
generation_time: 17617
tokens_prompt: 77
tokens_completion: 3989
native_tokens_prompt: 67
native_tokens_completion: 4510
native_tokens_completion_images: 0
native_tokens_reasoning: 3784
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "STOP"
usage: 0.0018107
router: (null)
provider_responses:
  0:
    endpoint_id: "ce839073-aa24-4f29-8358-15b319bd05ec"
    is_byok: false
    latency: 1181
    model_permaslug: "google/gemini-2.5-flash-lite"
    provider_name: "Google AI Studio"
    status: 200
api_type: "completions"
id: "gen-1771980837-KspGFVEhJ5GYd9PEaErV"
upstream_id: (null)
total_cost: 0.0018107
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Google AI Studio"
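The token and timing fields above can be combined into a few derived figures. The following is a minimal sketch, not part of the benchmark itself. It assumes that native_tokens_reasoning is counted inside native_tokens_completion and that generation_time is in milliseconds (which is consistent with the 17.6s duration shown in Run Details). The variable names are illustrative only.

```python
# Values copied from the run metadata above.
native_tokens_completion = 4510
native_tokens_reasoning = 3784
generation_time_ms = 17617  # assumed to be milliseconds

# Assumption: reasoning tokens are a subset of the completion tokens,
# so the remainder is the visible (non-reasoning) output.
visible_tokens = native_tokens_completion - native_tokens_reasoning
reasoning_share = native_tokens_reasoning / native_tokens_completion

# Rough throughput over the whole generation, reasoning included.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(f"Visible output tokens: {visible_tokens}")
print(f"Reasoning share of completion: {reasoning_share:.1%}")
print(f"Throughput: {tokens_per_second:.0f} tokens/s")
```

Under these assumptions, most of the completion budget for this run went to reasoning rather than to the scene text itself.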
Evaluation details
Result     Evaluator                      Details                            Meta Data
95.99%     Matches word count             n/a                                neededClean: false, words: 508
0.00%      Dialogue to Total Word Ratio   Ratio: 85.58%, Deviation: 15.58%   neededClean: false, wordsTotal: 513, wordsDialogue: 439
47.9936%   Overall
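The ratio and deviation reported by the dialogue evaluator can be reproduced from the word counts in its Meta Data. The sketch below is a hedged reconstruction of the arithmetic only. How the evaluator tokenizes words or maps a deviation to a score is not shown on this page, and treating the overall score as the unweighted mean of the two evaluator scores is an assumption that happens to match the displayed numbers.

```python
# Figures copied from the evaluation details above.
words_dialogue = 439
words_total = 513
target_ratio = 0.70  # from the scenario: "70% dialogue"

ratio = words_dialogue / words_total
deviation = abs(ratio - target_ratio)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# Assumption: overall = unweighted mean of the evaluator scores.
# The inputs here are the rounded values displayed on the page, so the
# result only approximates the reported 47.9936%.
word_count_score = 95.99
ratio_score = 0.00
overall = (word_count_score + ratio_score) / 2
print(f"Overall (from rounded inputs): {overall:.2f}%")
```

If the mean assumption holds, the reported overall of 47.9936% would imply an unrounded word-count score of about 95.9872%.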