Run: 2026-04-24-01KQ0G7ANC9SP946BP4AMF2SM0

Run Details

Model:: Claude Opus 4.7 (Reasoning)
Test:: Dialogue tags
Scenario:: Write 200 words with 50% dialogue
Duration:: 7.7s
Cost:: $0.012490
Provider:: Amazon Bedrock

Evaluation summary

Score	Evaluator
99.92%	Matches word count
0.06%	Dialogue to Total Word Ratio
49.9897%	Overall

Model input

User/Human

You are a author of comedic heist books.

Please write a 200 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents.

Only include the prose of the scene without any markdown or commentary.

Model response

Molly unrolled the blueprint across the sticky diner table, pinning the corners with ketchup bottles. Bill squinted at the crayon scribbles, which appeared to depict a bouncy castle, a clown, and something that was either a piñata or a hanged man.

"This is the floor plan?" Bill asked.

"Drew it from memory. I cased the joint Tuesday."

"Molly, you were invited. Your nephew turns seven."

"Recon is recon," she said. "Now focus. The presents pile up here, by the cake table. We hit during 'Pin the Tail on the Donkey.' Blindfolds mean witnesses are useless."

"What about the parents?"

"Sugar-crashed and scrolling. We'll be ghosts."

Bill tapped a suspicious red blob. "And this?"

"Magician. Goes on at three. He's the wild card."

"A magician is the wild card?"

"Bill, the man makes rabbits disappear. He's a professional thief with better PR."

Bill rubbed his temples. "We're stealing from children."

"We're stealing from a demographic that can't identify us in a lineup. There's a difference."

"There really isn't."

"Bill." Molly grabbed his hand, tender and grave. "Do you know what a seven-year-old gets these days? iPads. Drones. Actual cryptocurrency. My nephew got Bitcoin."

Bill picked up the crayon.

Evaluation details

Result	Evaluator	Details	Meta Data
99.92%	Matches word count	n/a	neededClean: false; words: 197
0.06%	Dialogue to Total Word Ratio	Ratio: 66.50%, Deviation: 16.50%	neededClean: false; wordsTotal: 200; wordsDialogue: 133
49.9897%	Overall
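
The "Ratio" and "Deviation" figures reported for the Dialogue to Total Word Ratio evaluator can be reproduced from its meta data with a short sketch. This is an illustration only, not the evaluator's actual code: the function names are hypothetical, the target of 0.50 is taken from the scenario ("50% dialogue"), and the rule of counting whitespace-separated words inside straight double quotes is an assumption. How the evaluator turns the deviation into its 0.06% score is not stated in this report, so the sketch does not attempt it.

```python
import re

# Assumed target, taken from the scenario text ("50% dialogue").
TARGET_RATIO = 0.50


def count_dialogue_words(text: str) -> int:
    """Count whitespace-separated words inside straight double quotes.

    Assumed counting rule; the real evaluator may tokenize differently.
    """
    return sum(len(span.split()) for span in re.findall(r'"([^"]*)"', text))


def ratio_and_deviation(words_dialogue: int, words_total: int,
                        target: float = TARGET_RATIO) -> tuple[float, float]:
    """Return the dialogue share and its absolute distance from the target."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


# Values from the report's meta data: wordsDialogue = 133, wordsTotal = 200.
ratio, deviation = ratio_and_deviation(133, 200)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
```

With the reported counts, 133 / 200 gives a ratio of 66.50%, which sits 16.50 percentage points above the 50% target and matches the Details column.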