Run: 2026-02-20-01KHY8SC4WZR2CYJC72PJ4XAHR

Run Details

Model: Grok 4.1 Fast
Test: Dialogue tags
Scenario: Write 200 words with 90% dialogue
Duration: 9.3s
Cost: $0.000306
Provider: xAI

Evaluation summary

Score	Evaluator
86.38%	Matches word count
78.54%	Dialogue to Total Word Ratio
82.4619%	Overall

Model input

User/Human

You are a author of comedic heist books.

Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents.

Only include the prose of the scene without any markdown or commentary.

Model response

Molly and Bill hunched over a crumpled birthday invitation in their smoky hideout, eyes gleaming.

"Bill, it's perfect! Rich kid's bash tomorrow—presents stacked like a gold mine!" Molly whispered, grinning wickedly.

"Yeah? But snot-nosed rugrats everywhere. One tantrum and we're busted," Bill grumbled, scratching his chin.

"Disguises, dummy! We go as the clown duo—me Bozo, you Ding-Dong. Juggle the gifts right out the door."

Bill snorted. "Clowns? I look like a depressed walrus in red nose. What if they want balloon animals?"

"Practice! Twist a present into a sword—poof, it's ours. Distraction: I spike the punch with sleepy-time juice. Kids nap, we load the van."

"Genius, but the birthday cake? Guarded like Fort Knox by that pony mascot."

"Pony? We bribe it with carrots laced with laxatives. Chaos ensues, we grab and dash through the bounce house exit."

Bill laughed. "Bounce house getaway? We'll look like idiots tumbling with loot!"

"Idiots who retire rich. Timing: Cake at two, heist at two-oh-five. No kids, no cops—just us and a mountain of Barbies and Xboxes!"

"Deal. World's dumbest heist—stealing from toddlers."

They clinked coffee mugs, cackling.

Evaluation details

Result	Evaluator	Details	Meta Data
86.38%	Matches word count	n/a	neededClean: false; words: 189
78.54%	Dialogue to Total Word Ratio	Ratio: 82.99%, Deviation: 7.01%	neededClean: false; wordsTotal: 194; wordsDialogue: 161
82.4619%	Overall
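
The ratio and deviation reported for the dialogue evaluator can be rechecked from the metadata counts. The sketch below is only an illustration: the variable names are hypothetical, the 90% target is taken from the scenario text, and the unweighted mean used for the overall figure is an assumption that happens to agree with the reported 82.4619% to two decimals. The report does not state how the 86.38% and 78.54% scores themselves are derived, so that step is not reproduced here.

```python
# Counts copied from the evaluation metadata in this report.
words_total = 194
words_dialogue = 161
target_ratio = 0.90  # from the scenario: "90% dialogue"

# Share of words inside dialogue, and its distance from the target.
ratio = words_dialogue / words_total
deviation = abs(target_ratio - ratio)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 82.99%, Deviation: 7.01%"

# Assumption: the overall score is an unweighted mean of the evaluator scores.
# The inputs are the rounded values shown in the report, so only two
# decimals of the result are meaningful.
evaluator_scores = [86.38, 78.54]
overall = sum(evaluator_scores) / len(evaluator_scores)
print(f"Overall: {overall:.2f}%")
# prints "Overall: 82.46%"
```

Note that the two evaluators report different word counts for the same response (189 and 194), so they presumably tokenize the text differently; the sketch uses each evaluator's own figures as given.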