Run: 2025-11-18-01KACPFK1EMZT2070X00RJETDZ

Run Details

Model: Gemini 3 Pro (Preview)
Test: Dialogue tags
Scenario: Write 200 words with 50% dialogue
Duration: 13.7s
Cost: $0.018938
Provider: Google AI Studio

Evaluation summary

Score	Evaluator
90.48%	Matches word count
0.01%	Dialogue to Total Word Ratio
45.2446%	Overall
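
The overall figure looks consistent with an unweighted mean of the two evaluator scores. This report does not state the aggregation rule, so the sketch below is an assumption; `overall_score` is a hypothetical helper, and the inputs are the rounded scores as displayed, which is why the result only approximately matches 45.2446%.

```python
def overall_score(scores):
    """Unweighted mean of evaluator scores (assumed aggregation rule)."""
    return sum(scores) / len(scores)


# Scores as displayed in the summary table, in percent.
displayed = [90.48, 0.01]
print(f"{overall_score(displayed):.4f}%")
```

The small gap from the reported value would be expected if the benchmark averages unrounded scores.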

Model input

User/Human

You are a author of comedic heist books.

Please write a 200 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents.

Only include the prose of the scene without any markdown or commentary.

Model response

Molly slapped the perimeter map onto the dashboard. It was a crayon drawing Bill had bribed a neighbor’s kid to sketch.

"The target is turning six," Molly whispered, her voice grave. "Maximum security. We’re talking musical chairs, competitive parents, and a sugar-crazed mob."

Bill squinted at the wobbly lines. "Is that a guard dog or a pony?"

"It’s a piñata, Bill. Focus. We insert through the bounce house while the clown is making balloon animals."

"The bounce house?" Bill asked, wiping grease on his tactical turtleneck. "The static electricity alone could fry our comms."

"Wear rubber soles," she snapped. "Once I neutralize the clown with a well-timed whoopee cushion, you sweep the gift table. The Lego sets are priority alpha."

Bill nodded solemnly. "What about the cake?"

"Forget the cake."

"I can’t work on an empty stomach, Mol. It’s buttercream."

Molly rubbed her temples, sensing a migraine brewing. "Fine. One slice. But if you get sprinkles on the merchandise, we leave you behind."

"Roger that," Bill said, checking the pressure on his Super Soaker. "Operation Birthday Bash is a go."

"Put the toy away, Bill. You have zero trigger discipline."

Evaluation details

Result	Evaluator	Details	Meta Data
90.48%	Matches word count	n/a	neededClean: false; words: 190
0.01%	Dialogue to Total Word Ratio	Ratio: 67.71%, Deviation: 17.71%	neededClean: false; wordsTotal: 192; wordsDialogue: 130
45.2446%	Overall
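
The ratio and deviation reported for the dialogue evaluator follow directly from its word counts: 130 / 192 is about 67.71%, which is 17.71 points above the 50% target. A minimal sketch, assuming dialogue means text between double quotes and words are whitespace-separated. The benchmark's real tokenizer is not shown here (the two evaluators even report different totals, 190 and 192), so `dialogue_stats` and `ratio_from_counts` are hypothetical helpers and will not necessarily reproduce its counts on the full response.

```python
import re


def ratio_from_counts(words_dialogue, words_total, target_ratio=0.5):
    """Return (ratio %, deviation %) rounded to two decimals."""
    ratio = words_dialogue / words_total
    return round(ratio * 100, 2), round(abs(ratio - target_ratio) * 100, 2)


def dialogue_stats(text, target_ratio=0.5):
    """Count words inside double quotes versus all words (assumed rule)."""
    # Treat curly double quotes the same as straight ones.
    normalized = text.replace("\u201c", '"').replace("\u201d", '"')
    spoken = re.findall(r'"([^"]*)"', normalized)
    words_dialogue = sum(len(chunk.split()) for chunk in spoken)
    words_total = len(normalized.split())
    ratio = words_dialogue / words_total if words_total else 0.0
    return {
        "wordsTotal": words_total,
        "wordsDialogue": words_dialogue,
        "ratio": ratio,
        "deviation": abs(ratio - target_ratio),
    }


# Counts reported by the evaluator above.
print(ratio_from_counts(130, 192))  # (67.71, 17.71)

# A short two-line sample: 5 of its 7 words are inside quotes.
sample = '"Focus, Bill," Molly said. "Forget the cake."'
print(dialogue_stats(sample)["wordsDialogue"])  # 5
```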

created_at	"2025-11-18T23:58:42.379439+00:00"
model	"google/gemini-3-pro-preview-20251117"
app_id	182717
external_user	(null)
streamed	true
cancelled	false
latency	2721
moderation_latency	(null)
generation_time	13634
tokens_prompt	68
tokens_completion	835
native_tokens_prompt	67
native_tokens_completion	1567
native_tokens_completion_images	0
native_tokens_reasoning	1282
native_tokens_cached	0
num_media_prompt	(null)
num_input_audio_prompt	(null)
num_media_completion	0
num_search_results	(null)
origin	"https://ncbench.com/"
is_byok	false
finish_reason	"stop"
native_finish_reason	"STOP"
usage	0.018938
api_type	"completions"
id	"gen-1763510307-erIBujirBZUq1eESG1kq"
upstream_id	(null)
total_cost	0.018938
cache_discount	(null)
upstream_inference_cost	0
provider_name	"Google AI Studio"
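
The `total_cost` value can be reproduced from the native token counts under assumed rates. The per-token prices below (2 USD per million prompt tokens, 12 USD per million completion tokens, with reasoning tokens billed as completion tokens) are assumptions and are not stated anywhere in this report; `estimate_cost` is a hypothetical helper. The same counts also show that only 1567 - 1282 = 285 native completion tokens were visible output, with the rest spent on reasoning.

```python
# Assumed rates in USD per one million tokens (not taken from this report).
ASSUMED_PROMPT_RATE = 2
ASSUMED_COMPLETION_RATE = 12


def estimate_cost(native_prompt, native_completion):
    """Estimate cost in USD from native token counts under the assumed rates."""
    # Sum in integer micro-dollars first to avoid float accumulation error.
    micro_usd = (native_prompt * ASSUMED_PROMPT_RATE
                 + native_completion * ASSUMED_COMPLETION_RATE)
    return round(micro_usd / 1_000_000, 6)


def visible_completion_tokens(native_completion, native_reasoning):
    """Completion tokens left after subtracting reasoning tokens."""
    return native_completion - native_reasoning


print(estimate_cost(67, 1567))                # 0.018938
print(visible_completion_tokens(1567, 1282))  # 285
```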