NC Bench
Run: 2026-02-19-01KHVPFFJKCPG26G3MWY9X5M9M
Run Details
Model: Gemini 3.1 Pro (Preview)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 3m 18s
Cost: $0.210938
Provider: Google AI Studio
Evaluation summary
Score      Evaluator
100.00%    Matches word count
100.00%    Dialogue to Total Word Ratio
100.0000%  Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Bill flattened the crumpled crayon drawn map against the warm hood of their stolen surveillance van. He tapped a sticky purple jelly smudge located precisely near the center of the paper. "This is the drop zone, Molly. The bounce house. At exactly two today, the clown will initiate his elaborate balloon animal sequence. That creates our window. All the parents will be distracted by the agonizing screeching of twisting rubber." Molly adjusted the collar of her tactical black turtleneck and peered closely at the chaotic drawing. "Are you really sure about the clown, Bill? What if he does magic tricks instead? Magic easily drops the overall distraction index." "I personally bribed the booking agency yesterday morning. He is strictly doing balloons today. No fluffy white rabbits out of hats, no shiny coins behind the ear. Now, the primary vault is sitting completely unguarded on the patio gift table. We are talking about thirty securely wrapped packages. Mostly building blocks, several action figures, maybe a nice new bicycle if we get extremely lucky." The suburban street was utterly quiet except for the distant thumping of a heavy bass line echoing from a massive inflatable castle. "How do we actually transport thirty separate boxes? We cannot simply stroll out the front gate holding a giant mountain of brightly colored shiny wrapping paper." "We are surely not strolling anywhere on foot. We are secretly using the birthday boy as our unwitting package mule. I successfully intercepted the main grand prize delivery. A genuinely massive battery powered pink motorized little jeep. We stack the fragile cargo safely into the back seat and leisurely drive it rapidly right out." Bill pointed a gloved finger toward a jagged red sketch indicating the garage side door. A solitary bead of cold sweat formed on his brow just thinking about the sheer volume of chaotic screaming toddlers. 
"You seriously want to casually drive a plastic toy truck heavily filled with stolen presents through a crowded sunny suburban backyard? What about the expensive household security cameras? What about little baby Timothy and his overly protective crazy helicopter mother?" "She is heavily medicated on cheap boxed wine by three every afternoon. We will silently slip inside during the inevitable afternoon sugar crash. The screaming children will be practically comatose from consuming excessive amounts of thick vanilla buttercream frosting. We seamlessly load the pink toy jeep, gently mash the small plastic accelerator pedal, and completely vanish into the nearby incredibly quiet neighborhood." "It is absolutely brilliant. But what if one of the hyperactive bratty kids unexpectedly wakes up early from the huge sugar coma? What if little Timothy actually catches us stealing his entire wonderful birthday haul?" He reached deep inside his heavy coat pocket and slowly pulled out a giant swirled lollipop wrapped in clear plastic. Molly checked the luminous dial on her expensive wristwatch, synchronized the second hand perfectly, and gave him a grim, resolute nod. They both stepped forward now. "I specifically brought a powerful tactical silencing device."
created_at: "2026-02-19T19:36:53.423Z" (2/19/2026, 7:36:53 PM)
model: "google/gemini-3.1-pro-preview-20260219"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 3402
moderation_latency: (null)
generation_time: 197465
tokens_prompt: 77
tokens_completion: 5752
native_tokens_prompt: 67
native_tokens_completion: 17567
native_tokens_completion_images: 0
native_tokens_reasoning: 16980
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "STOP"
usage: 0.210938
router: (null)
provider_responses (entry 0):
endpoint_id: "2abf5677-0826-45d9-8470-3d62c31d54a1"
is_byok: false
latency: 3402
model_permaslug: "google/gemini-3.1-pro-preview-20260219"
provider_name: "Google AI Studio"
status: 200
api_type: "completions"
id: "gen-1771529813-Xr8kz1jwFXOozGmDZI2O"
upstream_id: (null)
total_cost: 0.210938
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Google AI Studio"
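The token and timing fields above can be cross-checked with simple arithmetic. The sketch below is illustrative only: it assumes that generation_time is reported in milliseconds and that native_tokens_reasoning is a subset of native_tokens_completion, neither of which the page states explicitly. The variable names are hypothetical.

```python
# Values copied from the metadata above.
native_tokens_completion = 17567
native_tokens_reasoning = 16980
generation_time_ms = 197465  # assumption: the unit is milliseconds

# Tokens left over once reasoning tokens are subtracted
# (assumption: reasoning tokens are counted inside the completion total).
visible_tokens = native_tokens_completion - native_tokens_reasoning

# Share of the completion budget spent on reasoning.
reasoning_share = native_tokens_reasoning / native_tokens_completion

# Convert the generation time to minutes and seconds for comparison
# with the "Duration" shown in the run details.
minutes, seconds = divmod(round(generation_time_ms / 1000), 60)

print(f"visible tokens: {visible_tokens}")
print(f"reasoning share: {reasoning_share:.1%}")
print(f"generation time: {minutes}m {seconds}s")
```

Under these assumptions the generation time comes out within a second of the 3m 18s duration listed in the run details, and most of the native completion tokens are attributed to reasoning rather than to the visible 500-word scene.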
Evaluation details
Result     Evaluator                     Details                          Meta Data
100.00%    Matches word count            n/a                              neededClean: false, words: 500
100.00%    Dialogue to Total Word Ratio  Ratio: 70.00%, Deviation: 0.00%  neededClean: false, wordsTotal: 500, wordsDialogue: 350
100.0000%  Overall
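For readers who want to see how a dialogue-to-total word ratio like the one above could be computed, here is a minimal sketch. It is not the NC Bench evaluator: it assumes words are whitespace-separated tokens and that dialogue is any text between double quotation marks, and the function names are hypothetical. The real evaluator may tokenize or clean the text differently.

```python
import re
from typing import Tuple


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (assumed definition of a word)."""
    return len(text.split())


def dialogue_ratio(text: str) -> Tuple[int, int, float]:
    """Return (dialogue words, total words, ratio) for a prose scene."""
    # Treat curly quotes the same as straight double quotes.
    normalized = text.replace("\u201c", '"').replace("\u201d", '"')
    # Every span between a pair of double quotes counts as dialogue.
    spoken = re.findall(r'"([^"]*)"', normalized)
    words_dialogue = sum(count_words(span) for span in spoken)
    words_total = count_words(normalized)
    ratio = words_dialogue / words_total if words_total else 0.0
    return words_dialogue, words_total, ratio


# Small made-up sample, not taken from the model response.
sample = 'Bill nodded. "We go at two," he said. "Bring the jeep."'
words_dialogue, words_total, ratio = dialogue_ratio(sample)
print(words_dialogue, words_total, f"{ratio:.2%}")

# The figures reported above (350 dialogue words out of 500) give 70%.
reported_ratio = 350 / 500
```

With the reported counts, 350 / 500 equals the 70.00% ratio shown in the details column, which matches the target stated in the scenario.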