Run: 2026-02-20-01KHY8NH91ZCHV4YKVK6BC4RNG

Run Details

Model:: Gemma 3 12B
Test:: Dialogue tags
Scenario:: Write 200 words with 10% dialogue
Duration:: 7.0s
Cost:: $0.000036
Provider:: DeepInfra

Evaluation summary

Score	Evaluator	Details Link
14.30%	Matches word count	(details)
0.00%	Dialogue to Total Word Ratio	(details)
7.1509%	Overall

Model input

User/Human

You are a author of comedic heist books.

Please write a 200 word scene that has 10% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents.

Only include the prose of the scene without any markdown or commentary.

Model response

The community hall smelled overwhelmingly of sugar and misplaced enthusiasm. Streamers, a sticky, pink menace, draped from every available surface. Molly surveyed the scene with a professional grimace. A veritable avalanche of brightly wrapped presents sat beneath a banner proclaiming "Lily's 5th Birthday Bash!" Bill, meanwhile, was attempting to discreetly sniff a giant inflatable bouncy castle.

He’d stopped short of actually jumping in, thankfully. 

"Think they'll have security?” he mumbled, more to the bouncy castle than to her.

Molly ignored him, calculating the precise angle to snag a particularly impressive-looking box adorned with sparkly unicorns. It was clearly something special.  "Likely just Brenda from next door.  She bakes like a maniac but her security skills are questionable." 

Bill finally detached himself from the inflatable beast. "Brenda? Seriously? Armed with a spatula, maybe?" He bounced slightly, a pathetic imitation of a child's glee.

Molly rolled her eyes. "Just focus, Bill. We’re after the Lego Millennium Falcon, remember? And the pony plushie. Operation Sugar Rush is a go." She reached for a nearby balloon, testing its elasticity with a suspicious look.

Evaluation details

Result	Evaluator	Details	Meta Data
14.30%	Matches word count	n/a	neededClean: false; words: 179
0.00%	Dialogue to Total Word Ratio	Ratio: 28.89%, Deviation: 18.89%	neededClean: false; wordsTotal: 180; wordsDialogue: 52
7.1509%	Overall
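
The Ratio and Deviation figures reported for the Dialogue to Total Word Ratio evaluator can be reproduced from its metadata counts. The sketch below is only an illustration, not the evaluator's actual code: the function name `dialogue_ratio` and the constant `TARGET_PERCENT` are hypothetical, and it assumes the deviation is the absolute difference, in percentage points, between the measured ratio and the 10% target from the scenario. How the evaluator turns that deviation into a 0.00% score is not shown in this report and is not modelled here.

```python
def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Return dialogue words as a percentage of all words (hypothetical helper)."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return 100.0 * words_dialogue / words_total


# Target taken from the scenario text: "Write 200 words with 10% dialogue".
TARGET_PERCENT = 10.0

# Counts taken from the evaluator metadata: wordsDialogue = 52, wordsTotal = 180.
ratio = dialogue_ratio(52, 180)

# Assumption: deviation is the absolute gap to the target, in percentage points.
deviation = abs(ratio - TARGET_PERCENT)

print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
# prints "Ratio: 28.89%, Deviation: 18.89%"
```

Under that assumption, 52 dialogue words out of 180 gives 28.89%, which is 18.89 percentage points above the requested 10%.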