NC Bench
Run: 2026-05-10-01KR8XPN1GVEK60ZD5MWWZ5BSS
Run Details
Model: Gemini 3.1 Flash Lite
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 4.0s
Cost: $0.000899
Provider: Google AI Studio
Evaluation summary
Matches word count: 68.10%
Dialogue to Total Word Ratio: 0.00%
Overall: 34.0511%
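The overall figure appears consistent with an unweighted mean of the two evaluator scores: 68.10% and 0.00% average to 34.05%, and the extra digits in 34.0511% suggest an unrounded word-count score of about 68.1022%. The sketch below assumes that averaging rule. The benchmark's actual aggregation is not stated on this page, and `overall_score` is a hypothetical helper, not part of NC Bench.

```python
def overall_score(evaluator_scores: list[float]) -> float:
    """Unweighted mean of per-evaluator scores, in percent.

    Assumption: every evaluator carries equal weight. This is inferred
    from the numbers shown on the page, not from documentation.
    """
    if not evaluator_scores:
        raise ValueError("at least one evaluator score is required")
    return sum(evaluator_scores) / len(evaluator_scores)


# 68.1022 is the assumed unrounded "Matches word count" score;
# 0.0 is the "Dialogue to Total Word Ratio" score.
overall = overall_score([68.1022, 0.0])
print(f"{overall:.4f}%")
```

Under this assumption a single failed evaluator halves the overall result, which is why a response that nearly hits the word target still lands near 34%.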
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The humidity inside the air duct made Molly’s hair frizz into an approximation of a sentient thunderstorm, but she kept her gaze fixed on the reinforced plastic grate. Below them sat the suburban living room of the Henderson estate, currently occupied by three dozen shrieking toddlers and a clown who looked like he had not slept since the late nineties. Bill hovered behind her, his tactical vest snagging on a rivet, sounding like a zipper being dragged across a chalkboard. He tapped her shoulder, his breathing heavy and unnecessarily dramatic given their precarious position. Molly, do you think the clown is packing heat or just sheer psychological trauma? Focus, Bill, because the clown is irrelevant to the objective. The objective is a six-foot pile of neon-wrapped loot, and I am worried about the structural integrity of that glitter-covered table. If you touch the table, the alarm triggers, so stick to the overhead pulley system I spent three weeks calibrating. Three weeks to steal toys from children who cannot even read a clock yet? They are expensive toys, Bill, and that spoiled brat celebrates his birthday in a house with a heated pool, so they can afford the loss. Is there a moral threshold for larceny, or are we just picking targets based on the quality of their wrapping paper? We are picking targets based on the lack of armed security, so pipe down and pass me the rappelling harness. Here, but I still believe we should have hacked the home security system instead of crawling through a vent filled with dead moths. Hacking is for people who do not enjoy the thrill of the descent, and besides, the firewall was password protected by a toddler. That makes sense, but did you pack the snacks because I am starting to feel hypoglycemia creeping in. I packed a protein bar, and you can have half if we clear the target before the pinata ceremony starts. 
If we do not get out before the pinata, we are going to be pelted with plastic bats and candy by a mob of sugar-crazed five-year-olds. Then stop complaining about the moths and adjust the tension on the primary line so I can drop. The tension is perfect, but I hope you realize that stealing these presents might make us the villains of this particular story. Villains get the best lines and the biggest piles of loot, so quit moonlighting as a philosopher and hold the cable steady. Fine, but if I end up in jail for toy theft, I am blaming your lack of patience and this itchy tactical gear. Just drop the bag when I signal, and try not to knock over the decorative balloon arch on your way out. I make no promises, especially since that arch is blocking my path to the Lego sets. Just grab the Legos and go, Bill, the clowns are starting to look our way.
created_at: "2026-05-10T12:27:27.16Z" (5/10/2026, 12:27:27 PM)
model: "google/gemini-3.1-flash-lite-20260507"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 602
moderation_latency: (null)
generation_time: 3887
tokens_prompt: 77
tokens_completion: 721
native_tokens_prompt: 67
native_tokens_completion: 588
native_tokens_completion_images: 0
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "STOP"
service_tier: "standard"
usage: 0.00089875
router: (null)
provider_responses: 0
endpoint_id: "237af107-d2e9-46c7-b146-c74d75980dee"
id: "r3kAaqfUFYCEz7IPx46Z0Ak"
is_byok: false
latency: 602
model_permaslug: "google/gemini-3.1-flash-lite-20260507"
provider_name: "Google AI Studio"
status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1778416047-fK9Uy1S5GR6fR6BD8bgr"
session_id: (null)
api_type: "completions"
id: "gen-1778416047-mJX2nG3n3WBVWLFo4wwC"
upstream_id: "r3kAaqfUFYCEz7IPx46Z0Ak"
total_cost: 0.00089875
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Google AI Studio"
response_cache_source_id: (null)
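The reported total_cost can be cross-checked against the native token counts. The sketch below assumes per-token pricing of 0.25 USD per million prompt tokens and 1.50 USD per million completion tokens. Those rates are not stated anywhere on this page; they are assumptions chosen because they reproduce the reported 0.00089875 exactly. `estimate_cost` is a hypothetical helper.

```python
# Assumed rates in USD per one million tokens (not taken from the page).
ASSUMED_PROMPT_USD_PER_M = 0.25
ASSUMED_COMPLETION_USD_PER_M = 1.50


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate request cost in USD from native token counts."""
    micro_usd = (
        prompt_tokens * ASSUMED_PROMPT_USD_PER_M
        + completion_tokens * ASSUMED_COMPLETION_USD_PER_M
    )
    return micro_usd / 1_000_000


# native_tokens_prompt = 67, native_tokens_completion = 588
cost = estimate_cost(67, 588)
print(f"{cost:.8f}")
```

The calculation uses the native_tokens_* fields rather than tokens_prompt and tokens_completion (77 and 721), since only the native counts reproduce the reported figure under these assumed rates.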
Evaluation details
Matches word count
Result: 68.10%
Details: n/a
Meta Data: neededClean: false, words: 486
Dialogue to Total Word Ratio
Result: 0.00%
Details: Ratio: 0.00%, Deviation: 50.00%
Meta Data: neededClean: false, wordsTotal: 492, wordsDialogue: 0
Overall
Result: 34.0511%
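A wordsDialogue count of 0 is what a quote-based counter would report for this response, because the model wrote the spoken lines without any quotation marks. The sketch below illustrates that behavior. It assumes dialogue is detected as text inside straight or curly double quotes and that words are split on whitespace; the evaluator's real rules are not shown on this page, and all function names here are hypothetical.

```python
import re

# Assumption: dialogue is any text enclosed in straight ("...") or
# curly (“...”) double quotes.
QUOTED = re.compile(r'"([^"]*)"|“([^”]*)”')


def count_words(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue words, total words, dialogue / total)."""
    total = count_words(text)
    spoken = sum(count_words(a or b) for a, b in QUOTED.findall(text))
    ratio = spoken / total if total else 0.0
    return spoken, total, ratio


def deviation(ratio: float, target: float = 0.5) -> float:
    """Absolute distance from the target dialogue share."""
    return abs(ratio - target)


unquoted = "Focus, Bill, because the clown is irrelevant to the objective."
quoted = '"Focus, Bill," she said.'

print(dialogue_ratio(unquoted))
print(dialogue_ratio(quoted))
```

With no quotation marks present, the ratio is 0 and the deviation from a 50% target is the full 0.5, matching the "Ratio: 0.00%, Deviation: 50.00%" detail reported for this run.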