Run: 2026-05-29-01KSSJSZFX504180J7JFQCSH6G

Run Details

Model:: Claude Opus 4.8 (Reasoning)
Test:: Dialogue tags
Scenario:: Write 200 words with 90% dialogue
Duration:: 9.4s
Cost:: $0.013465
Provider:: Anthropic

Evaluation summary

Score	Evaluator	Details Link
99.92%	Matches word count	(details)
93.59%	Dialogue to Total Word Ratio	(details)
96.7531%	Overall

Model input

User/Human

You are a author of comedic heist books.

Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents.

Only include the prose of the scene without any markdown or commentary.

Model response

"So we're really doing this?" Bill asked. "Robbing a kid's birthday party?"

"Keep your voice down," Molly hissed. "And yes. Think about it. Hundreds of presents, all wrapped, all unguarded."

"Unguarded? There'll be like forty children there."

"Children, Bill. We're grown adults. What are they going to do, throw juice boxes at us?"

"My cousin got concussed by a juice box once."

"Your cousin gets concussed by everything. Focus. We go in dressed as the entertainment."

"What entertainment?"

"Clowns, Bill. We go in as clowns."

"Absolutely not. I have a fear of clowns."

"You're going to *be* a clown."

"That doesn't help! That makes it worse!"

"Fine," Molly groaned. "You can be the magician."

"Magicians steal things all the time. That's perfect cover."

"See? Now you're thinking."

"And while I do card tricks, you load the presents into the van?"

"Exactly. We're in, we're out, we're rich."

"Rich off Hot Wheels and stuffed animals."

"Those stuffed animals retail for thirty dollars each, Bill."

"You've researched this."

"I've researched everything. Pin the Tail on the Donkey starts at two. That's our window."

"Why then?"

"Blindfolds, Bill. Everyone's wearing blindfolds."

Bill paused. "Molly, that's genuinely brilliant."

"I know," she grinned. "Now grab the fake mustache."

Evaluation details

Result	Evaluator	Details	Meta Data
99.92%	Matches word count	n/a	neededClean: false, words: 203
93.59%	Dialogue to Total Word Ratio	Ratio: 95.07%, Deviation: 5.07%	neededClean: false, wordsTotal: 203, wordsDialogue: 193
96.7531%	Overall
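
The ratio and deviation figures above can be checked by hand. The sketch below is a minimal illustration, not the evaluator's actual code: `count_dialogue` and `ratio_percent` are hypothetical helpers, and treating every word inside straight double quotes as dialogue is an assumption. The 90% target is taken from the scenario. The report does not state how the 99.92% and 93.59% scores are derived, so the sketch does not try to reproduce them; it only shows that the overall figure is close to their plain average.

```python
import re

TARGET_RATIO = 90.0  # percent, from the scenario "Write 200 words with 90% dialogue"


def count_dialogue(text):
    """Hypothetical counter: words inside straight double quotes count as dialogue."""
    total = len(text.split())
    spoken = sum(len(q.split()) for q in re.findall(r'"([^"]*)"', text))
    return spoken, total


def ratio_percent(words_dialogue, words_total):
    """Dialogue share of the total word count, in percent."""
    return 100.0 * words_dialogue / words_total


# Toy input: 6 quoted words out of 8 whitespace-separated words.
sample = '"Keep your voice down," Molly hissed. "And yes."'
spoken, total = count_dialogue(sample)

# Reported counts from the Meta Data column: wordsDialogue 193, wordsTotal 203.
reported_ratio = ratio_percent(193, 203)
deviation = reported_ratio - TARGET_RATIO

# Plain average of the two displayed evaluator scores (an assumption about
# how "Overall" is formed; the displayed scores are already rounded).
mean_score = (99.92 + 93.59) / 2

print(spoken, total)              # 6 8
print(round(reported_ratio, 2))   # 95.07
print(round(deviation, 2))        # 5.07
print(round(mean_score, 4))
```

The recomputed ratio and deviation match the reported 95.07% and 5.07%. The plain average of the two displayed scores lands within about 0.002 of the reported 96.7531%, which is consistent with the overall score being averaged from unrounded values, though the report does not confirm this.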