Run: 2026-02-20-01KHY9GED00CB8ZP6THGDTM4SY

Run Details

Model: Grok 4
Test: Dialogue tags
Scenario: Write 200 words with 10% dialogue
Duration: 18.6s
Cost: $0.008035
Provider: xAI

Evaluation summary

Score	Evaluator	Details Link
95.99%	Matches word count	(details)
0.00%	Dialogue to Total Word Ratio	(details)
47.9934%	Overall

Model input

User/Human

You are a author of comedic heist books.

Please write a 200 word scene that has 10% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents.

Only include the prose of the scene without any markdown or commentary.

Model response

Molly adjusted her oversized clown wig in the dim light of their dingy van, parked just outside the bustling suburban backyard. Balloons bobbed like drunken sentinels, and the air thrummed with the chaotic squeals of sugar-fueled kids. Bill, her partner in crime, fiddled with a comically large fake nose, his face painted in what he claimed was "stealth clown camouflage."

"This is foolproof," Bill whispered, peering through binoculars at the pile of glittering presents under a rainbow piñata. "We blend in as entertainers, grab the loot during the cake chaos, and vanish like ghosts with gift bags."

Molly snorted, eyeing the itinerary she'd swiped from the mailbox. "You're an idiot. The party's got a bouncy castle guarded by a pack of hyper toddlers. One wrong step, and we're buried in diapers."

Bill grinned, adjusting his polka-dot bowtie. "Trust me, Mol. I've got the getaway tricycle ready."

She rolled her eyes. "Fine, but if we get caught, you're the one explaining to the cops why we're dressed like discount Ronald McDonalds."

As they slipped out, a stray balloon popped, sending them both diving behind a hedge. The heist was on, ridiculous as ever.

Evaluation details

Result	Evaluator	Details	Meta Data
95.99%	Matches word count	n/a	neededClean: false, words: 192
0.00%	Dialogue to Total Word Ratio	Ratio: 40.21%, Deviation: 30.21%	neededClean: false, wordsTotal: 194, wordsDialogue: 78
47.9934%	Overall
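
The reported figures can be checked by hand. The sketch below is a minimal illustration and not the evaluator's actual code, which this report does not show. It assumes the ratio is wordsDialogue divided by wordsTotal, that the deviation is the absolute distance from the 10% target in the scenario, that dialogue words are the whitespace-separated words inside straight double quotes, and that the overall score is the plain mean of the two evaluator scores. The names dialogue_ratio and count_quoted_words are hypothetical.

```python
import re

TARGET_RATIO = 0.10  # the scenario asks for 10% dialogue


def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Share of all words that are dialogue (assumed definition)."""
    return words_dialogue / words_total


def count_quoted_words(text: str) -> int:
    """Count whitespace-separated words inside straight double quotes.

    This counting rule is an assumption; the report does not document it.
    """
    spans = re.findall(r'"([^"]*)"', text)
    return sum(len(span.split()) for span in spans)


# Counts taken from the Meta Data column of the report.
ratio = dialogue_ratio(78, 194)
deviation = abs(ratio - TARGET_RATIO)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# Mean of the two displayed scores (assumed aggregation rule).
mean_of_displayed = (95.99 + 0.00) / 2
print(f"Mean of displayed scores: {mean_of_displayed:.3f}%")

# One paragraph of the model response, used to try the quote-based counter.
sample = (
    'Bill grinned, adjusting his polka-dot bowtie. '
    '"Trust me, Mol. I\'ve got the getaway tricycle ready."'
)
print(count_quoted_words(sample))
```

Under these assumptions the ratio and deviation reproduce the reported 40.21% and 30.21%. The mean of the displayed scores comes to about 47.995%, slightly above the reported 47.9934%, which would be consistent with the word count score being rounded for display. Applying the quote rule to the whole response also yields 78 dialogue words, including the quoted phrase in the opening paragraph of narration. The report does not state how a 30.21% deviation maps to a 0.00% score, so that step is left out.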