NC Bench
Run: 2026-05-10-01KR8Z0TZPY26K0GPEMTYVR2NC
Run Details
Model: Gemini 3.1 Flash Lite (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 3.6s
Cost: $0.000966
Provider: Google AI Studio
Evaluation summary
Score | Evaluator
81.27% | Matches word count
0.00% | Dialogue to Total Word Ratio
40.6364% | Overall
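The Overall figure is consistent with an unweighted mean of the two evaluator scores. This is an inference from the numbers shown, not a documented formula, and the function name `overall_score` is hypothetical. A minimal sketch of the arithmetic:

```python
def overall_score(scores: list[float]) -> float:
    """Unweighted mean of evaluator scores (assumed aggregation, not confirmed by the page)."""
    if not scores:
        raise ValueError("at least one evaluator score is required")
    return sum(scores) / len(scores)


# Scores as displayed in the summary: word count, then dialogue ratio.
displayed = [81.27, 0.00]
print(f"{overall_score(displayed):.4f}%")
```

With the displayed (rounded) inputs this yields 40.635%, which is close to the reported 40.6364%. The small gap suggests the page averages the unrounded word-count score rather than the two-decimal value shown.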
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly tapped the blueprints laid across the sticky tavern table, her finger tracing a path through the suburban floor plan. Outside, the rain drummed against the glass, matching the nervous rhythm of Bill’s tapping foot. Are you sure about the clown, Bill? He is just a guy named Gary, Molly. He drives a rusted Honda and spends his breaks smoking menthols behind the bounce house. But what if he has a balloon animal weapon? It is a poodle, Molly. Not a firearm. We need precision. This party is for a set of triplets. Do you know the sheer volume of wrapping paper involved? I am talking industrial-grade glitter explosions. I checked the invite list. Forty-two guests. That means forty-two potential witnesses with sugar-fueled reflexes and no moral compass. We enter through the garage. We bypass the gift table while the kids are distracted by the magician. You mean the guy who pulls quarters from ears? Exactly. He is a glorified pickpocket, Bill. He will be our smokescreen. Do you have the collapsible sack? It is reinforced with duct tape and hope. Did you bring the laser pointer to distract the resident terrier? It is in my left pocket. What about the parents? They will be lurking with cameras. If they try to film us, we tell them we are the professional cleaners hired for the post-cake trauma. That feels risky. Some of those suburban dads are terrifying when they have had three glasses of cheap champagne. We play the role of stressed staff. We act like we are underpaid and resent everything. They will not look twice at us because they are too busy arguing about who has to drive the minivan home. You have thought of everything except the cake. What if they cut the cake? That is our extraction signal. When they sing, we move. The song is the cover. No one hears a squeaky floorboard during a chaotic chorus of Happy Birthday. Fine. But if I get hit in the eye by a rogue party popper, I am taking an extra portion of the loot. Deal. But stay away from the bouncy castle. 
I saw the maintenance logs, and that thing has not been sanitized since the nineties. Good point. We are stealing toys, not infectious diseases. Exactly. Now, pass me the blueprints. We need to measure the distance from the gift table to the side gate. Bill slid the paper across, his eyes darting to the tavern door. If we pull this off, the triplets are going to have a very quiet afternoon. And we are going to have enough LEGO sets to retire in luxury. To the heist, Bill. To the heist, Molly. May the glitter be sparse and the exits be clear. They downed their lukewarm coffee like it was expensive whiskey, the gravity of the mission settling into the dim room. The suburban neighborhood stood no chance against their combined incompetence.
created_at
"2026-05-10T12:50:29.502Z"
5/10/2026, 12:50:29 PM
model
"google/gemini-3.1-flash-lite-20260507"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
663
moderation_latency
(null)
generation_time
3482
tokens_prompt
77
tokens_completion
739
native_tokens_prompt
67
native_tokens_completion
633
native_tokens_completion_images
0
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"STOP"
service_tier
"standard"
usage
0.00096625
router
(null)
provider_responses
0
endpoint_id
"237af107-d2e9-46c7-b146-c74d75980dee"
id
"FX8AaqjzK6H6qtsPgJyMmQQ"
is_byok
false
latency
663
model_permaslug
"google/gemini-3.1-flash-lite-20260507"
provider_name
"Google AI Studio"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1778417429-7dD7AxzkTuicu47kaSVW"
session_id
(null)
api_type
"completions"
id
"gen-1778417429-XEjPNDC0zkwKhZX7UNdH"
upstream_id
"FX8AaqjzK6H6qtsPgJyMmQQ"
total_cost
0.00096625
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Google AI Studio"
response_cache_source_id
(null)
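The reported total_cost can be related to the native token counts with simple per-token arithmetic. The sketch below assumes per-million-token rates of $0.25 (prompt) and $1.50 (completion). These rates do not appear on the page; they are assumptions chosen because they reproduce the reported total, and `request_cost` is a hypothetical helper.

```python
def request_cost(prompt_tokens: int, completion_tokens: int,
                 usd_per_m_prompt: float, usd_per_m_completion: float) -> float:
    """Cost in USD given token counts and per-million-token rates."""
    micro_usd = (prompt_tokens * usd_per_m_prompt
                 + completion_tokens * usd_per_m_completion)
    return micro_usd / 1_000_000


# native_tokens_prompt = 67, native_tokens_completion = 633 (from the metadata above).
# The two rates are ASSUMED, not taken from the page.
cost = request_cost(67, 633, usd_per_m_prompt=0.25, usd_per_m_completion=1.50)
print(f"${cost:.8f}")
```

Under those assumed rates the result matches the total_cost of 0.00096625 listed above, which suggests billing is based on the native token counts rather than the normalized tokens_prompt and tokens_completion values.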
Evaluation details
Result | Evaluator | Details | Meta Data
81.27% | Matches word count | n/a | neededClean: false, words: 488
0.00% | Dialogue to Total Word Ratio | Ratio: 0.00%, Deviation: 70.00% | neededClean: false, wordsTotal: 493, wordsDialogue: 0
40.6364%
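The model response contains no quotation marks, which would explain a wordsDialogue count of 0 if the evaluator identifies dialogue by quoted spans. The evaluator's actual implementation is not shown on this page, so the following is only a minimal sketch under that assumption; `dialogue_ratio` and `deviation` are hypothetical names, and the 0.70 target comes from the scenario.

```python
import re

# Matches spans inside straight or curly double quotes (assumed dialogue marker).
QUOTED = re.compile(r'"([^"]*)"|“([^”]*)”')


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) using whitespace-split words."""
    total = len(text.split())
    dialogue = sum(
        len((m.group(1) or m.group(2) or "").split())
        for m in QUOTED.finditer(text)
    )
    ratio = dialogue / total if total else 0.0
    return dialogue, total, ratio


def deviation(ratio: float, target: float = 0.70) -> float:
    """Absolute distance between the measured ratio and the target ratio."""
    return abs(ratio - target)


quoted = 'Molly sighed. "Are you sure about the clown, Bill?" "He is just a guy named Gary."'
unquoted = "Are you sure about the clown, Bill? He is just a guy named Gary."
print(dialogue_ratio(quoted))
print(dialogue_ratio(unquoted), deviation(dialogue_ratio(unquoted)[2]))
```

On the unquoted sample the sketch finds zero dialogue words and a deviation equal to the full 0.70 target, mirroring the "Ratio: 0.00%, Deviation: 70.00%" result reported for this run.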