Run: 2026-05-29-01KSSJZMN4TGNNDWGAR7V3D1FM

Run Details

Model:: Claude Opus 4.8 (Reasoning)
Test:: Dialogue tags
Scenario:: Write 200 words with 50% dialogue
Duration:: 11.2s
Cost:: $0.014890
Provider:: Anthropic

Evaluation summary

Score	Evaluator	Details Link
99.92%	Matches word count	(details)
0.00%	Dialogue to Total Word Ratio	(details)
49.9595%	Overall
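
The report does not say how the Overall score is aggregated. The figures are consistent with an unweighted mean of the two evaluator scores, and the sketch below checks that reading. The function name `unweighted_mean` and the equal weighting are assumptions for illustration, not the evaluation harness's documented behavior.

```python
# Scores as displayed in the evaluation summary, in percent.
displayed_scores = {
    "Matches word count": 99.92,
    "Dialogue to Total Word Ratio": 0.00,
}


def unweighted_mean(scores):
    """Average the evaluator scores with equal weight (assumed aggregation)."""
    return sum(scores.values()) / len(scores)


overall = unweighted_mean(displayed_scores)
reported_overall = 49.9595  # value shown in the Overall row

print(f"computed: {overall:.4f}%  reported: {reported_overall:.4f}%")
```

The computed mean of the displayed scores is 49.96%, which differs from the reported 49.9595% by 0.0005 percentage points. That gap is what you would expect if the word-count score is rounded to two decimals for display (an unrounded value near 99.919% would give 49.9595%).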

Model input

User/Human

You are a author of comedic heist books.

Please write a 200 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents.

Only include the prose of the scene without any markdown or commentary.

Model response

Molly spread the crayon-drawn map across the table, smoothing its glittery edges.

"The party starts at noon," she said, tapping a sticker. "Bouncy castle here, clown there, and the present table is the soft underbelly."

Bill squinted through his binoculars at the suburban house. "There's a six-year-old guarding the gifts. Looks vicious."

"That's the birthday girl, Bill."

"She's wearing a tiara. Royalty's always trouble."

Molly sighed and snatched the binoculars away. "Focus. We go in disguised as entertainers. You're the magician, I'm the face-painter."

"I don't know any magic."

"Neither do real magicians. You wave, you mumble, you palm a present."

Bill chewed his thumbnail nervously. "What about the parents?"

"Distracted by juice boxes and existential dread. Trust me, nobody watches the loot once the piñata swings."

"And the dog?"

"There's a dog?"

"Big one. Named Sprinkles, apparently."

Molly considered this, then pulled out a sandwich bag of sausages. "Sprinkles gets bribed. Everyone gets bribed at these things."

Bill grinned, finally relaxing. "I love an inside job."

"It's a children's party, not a bank."

"Same adrenaline," he insisted, fluffing his cape. "Now, do these spandex tights make me look like I steal candy?"

"Yes," Molly said. "Perfect."

Evaluation details

Result	Evaluator	Details	Meta Data
99.92%	Matches word count	n/a	neededClean: false, words: 197
0.00%	Dialogue to Total Word Ratio	Ratio: 69.65%, Deviation: 19.65%	neededClean: false, wordsTotal: 201, wordsDialogue: 140
49.9595%	Overall
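
The Ratio and Deviation figures for the Dialogue to Total Word Ratio evaluator can be reproduced from its recorded metadata (`wordsDialogue` 140, `wordsTotal` 201) and the 50% target stated in the scenario. The following is a minimal sketch: the function names are hypothetical, and treating the deviation as the absolute difference from the target is an assumption that happens to match the reported numbers. How the evaluator maps a 19.65% deviation to a 0.00% score is not stated in the report, so it is not modeled here.

```python
TARGET_RATIO = 0.50  # from the scenario: "Write 200 words with 50% dialogue"


def dialogue_ratio(words_dialogue, words_total):
    """Fraction of all words that are dialogue."""
    return words_dialogue / words_total


def deviation_from_target(ratio, target=TARGET_RATIO):
    """Absolute distance between the observed ratio and the target (assumed definition)."""
    return abs(ratio - target)


# Values taken from the evaluator's recorded metadata.
ratio = dialogue_ratio(words_dialogue=140, words_total=201)
deviation = deviation_from_target(ratio)

print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 69.65%, Deviation: 19.65%"
```

Note that the two evaluators record different totals for the same response (197 words versus 201), so they appear to count words differently. The ratio above uses the 201 figure because that is the total this evaluator recorded.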