Run: 2025-11-18-01KACP6VMT6H86F7B3QZC9EG47

Run Details

Model: Gemini 2.5 Pro
Test: Dialogue tags
Scenario: Write 200 words with 50% dialogue
Duration: 20.0s
Cost: $0.024144
Provider: Google AI Studio

Evaluation summary

Score	Evaluator	Details Link
86.38%	Matches word count	(details)
99.54%	Dialogue to Total Word Ratio	(details)
92.9607%	Overall
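
The report does not say how the Overall figure is aggregated. The numbers are consistent with an unweighted mean of the two evaluator scores, so the sketch below checks that reading. The mean rule is an assumption inferred from the figures, and the displayed scores are rounded to two decimals, so only approximate agreement is expected.

```python
# Assumption: Overall is the unweighted mean of the evaluator scores.
# The report does not state the aggregation rule; this only checks that
# the displayed numbers are consistent with it.
scores = {
    "Matches word count": 86.38,
    "Dialogue to Total Word Ratio": 99.54,
}
reported_overall = 92.9607

mean_score = sum(scores.values()) / len(scores)

print(f"mean of displayed scores: {mean_score:.2f}%")
print(f"reported overall:         {reported_overall:.4f}%")

# The displayed inputs are rounded, so allow for rounding error.
assert abs(mean_score - reported_overall) < 0.01
```

The small residual (92.96 versus 92.9607) is what rounding of the two displayed inputs would produce.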

Model input

User/Human

You are a author of comedic heist books.

Please write a 200 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents.

Only include the prose of the scene without any markdown or commentary.

Model response

Molly peered through the sweet-smelling hydrangea bush, the squeals of a dozen six-year-olds assaulting her ears. An inflatable unicorn wobbled precariously. Beside her, Bill lowered his binoculars with a tactical frown.

“Okay, Phase One: I distract them. I’ll go in as Barnaby the Sad Clown. Kids hate clowns. They'll run screaming.”

“No. We are not psychologically scarring children. We're just stealing their expensive LEGO sets. There’s a difference in class, Bill.”

He sighed, a floppy clown nose drooping in his hand. The air smelled of sugar and disappointment. Molly ignored him.

“Right. Class. So the giant pizza delivery guy costume is out? I could hide the presents right in the fake pepperoni.”

“Completely out. Look. The parents are drunk on punch. The doggy door is our entry point. It's simple and quiet.”

She pointed a finger towards a small flap on the back door, guarded by a terrifyingly fluffy Pomeranian. It was wearing a tiara.

“Her name is Princess Fluffybutt. I read the invitation. She has more followers than me, Mol. That dog’s a professional.”

Bill shuddered. Molly just rolled her eyes and handed him the squeaky hamburger toy.

Evaluation details

Result	Evaluator	Details	Meta Data
86.38%	Matches word count	n/a	neededClean: false; words: 189
99.54%	Dialogue to Total Word Ratio	Ratio: 52.60%, Deviation: 2.60%	neededClean: false; wordsTotal: 192; wordsDialogue: 101
92.9607%	Overall
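
The ratio and deviation reported by the dialogue evaluator follow directly from its two word counts. The sketch below reproduces that arithmetic and includes a hypothetical `dialogue_ratio` helper showing one way such counts could be obtained. The helper's rules (whitespace-separated words, dialogue delimited by curly quotes) are assumptions; the evaluator's actual tokenization is not documented here, and the two evaluators report different totals (189 versus 192), which may reflect different counting rules.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Hypothetical counter: return (dialogue_words, total_words, ratio).

    Assumes words are whitespace-separated and dialogue is whatever sits
    between curly quotes. The real evaluator may count differently.
    """
    total = len(text.split())
    spoken = re.findall(r"“([^”]*)”", text)
    dialogue = sum(len(segment.split()) for segment in spoken)
    return dialogue, total, (dialogue / total if total else 0.0)


# Counts taken from the evaluator metadata in the report.
words_dialogue = 101
words_total = 192
target_ratio = 0.50  # the scenario asks for 50% dialogue

ratio = words_dialogue / words_total
deviation = abs(ratio - target_ratio)

print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 52.60%, Deviation: 2.60%"
```

How the 2.60% deviation maps to the 99.54% score is not shown in the report, so the sketch stops at the deviation.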

created_at	"2025-11-18T23:54:02.518569+00:00"
model	"google/gemini-2.5-pro"
app_id	182717
external_user	(null)
streamed	true
cancelled	false
latency	2265
moderation_latency	(null)
generation_time	19901
tokens_prompt	68
tokens_completion	965
native_tokens_prompt	67
native_tokens_completion	2406
native_tokens_completion_images	0
native_tokens_reasoning	2140
native_tokens_cached	0
num_media_prompt	(null)
num_input_audio_prompt	(null)
num_media_completion	0
num_search_results	(null)
origin	"https://ncbench.com/"
is_byok	false
finish_reason	"stop"
native_finish_reason	"STOP"
usage	0.02414375
api_type	"completions"
id	"gen-1763510021-ILp6KsuqdjhT986tDvKu"
upstream_id	(null)
total_cost	0.02414375
cache_discount	(null)
upstream_inference_cost	0
provider_name	"Google AI Studio"
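
The `total_cost` value can be reproduced from the native token counts. The sketch below assumes rates of $1.25 per million input tokens and $10.00 per million output tokens; these rates are not stated in the report and are used here only because they match the recorded figure exactly. It also assumes that `native_tokens_completion` (2406) already includes the 2140 reasoning tokens, which the metadata suggests but does not confirm.

```python
from decimal import Decimal

# Assumed per-token rates in USD; NOT stated in the report.
INPUT_RATE = Decimal("1.25") / Decimal(1_000_000)
OUTPUT_RATE = Decimal("10.00") / Decimal(1_000_000)

# Values copied from the run metadata.
native_tokens_prompt = 67
native_tokens_completion = 2406  # assumed to include the 2140 reasoning tokens
reported_total_cost = Decimal("0.02414375")

# Decimal arithmetic avoids binary floating-point rounding in the comparison.
computed_cost = (
    native_tokens_prompt * INPUT_RATE
    + native_tokens_completion * OUTPUT_RATE
)

print(f"computed: ${computed_cost}")
print(f"reported: ${reported_total_cost}")
assert computed_cost == reported_total_cost
```

Under these assumptions, almost all of the cost comes from completion tokens, and most of those are reasoning tokens rather than the visible 200-word response.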