Run: 2026-02-24-01KJ8V95ZK69SBTDAV7PAPW4E7

Run Details

Model:: Claude Opus 4.6 (Reasoning)
Test:: Dialogue tags
Scenario:: Write 200 words with 90% dialogue
Duration:: 10.0s
Cost:: $0.010285
Provider:: Amazon Bedrock

Evaluation summary

Score	Evaluator	Details Link
75.16%	Matches word count	(details)
99.91%	Dialogue to Total Word Ratio	(details)
87.5351%	Overall

Model input

User/Human

You are a author of comedic heist books.

Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents.

Only include the prose of the scene without any markdown or commentary.

Model response

"Alright, Bill, run me through it one more time," Molly said, spreading the hand-drawn map across the dashboard.

"We go in through the back gate at fourteen hundred hours. The bouncy castle provides cover on the left flank. I neutralize the piñata as a distraction while you grab the presents off the fold-out table."

"And the parents?"

"Intel says they'll be in the kitchen doing Jell-O shots by two fifteen. We'll have a clean four-minute window."

"What about the birthday boy?"

"He's six, Molly. I think we can handle a six-year-old."

"You said that about the Hendersons' kid last Easter and you still have the bite marks."

"That was different. That kid was feral. This one does watercolors."

"Fine. What's our exit strategy?"

"Over the fence, through Mrs. Patterson's garden, into the getaway minivan. I've got car seats installed so we blend in with the neighborhood."

"Brilliant. And the loot?"

"Conservative estimate, we're looking at eight to ten wrapped presents. Probably some Legos, maybe a scooter. Could be our biggest haul since the Petersons' baby shower."

Molly folded the map and grinned. "Let's go rob some children."

Evaluation details

Result	Evaluator	Details	Meta Data
75.16%	Matches word count	n/a	neededClean: false; words: 187
99.91%	Dialogue to Total Word Ratio	Ratio: 91.71%, Deviation: 1.71%	neededClean: false; wordsTotal: 193; wordsDialogue: 177
87.5351%	Overall
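
The ratio and deviation reported for the dialogue evaluator can be re-derived from the wordsDialogue and wordsTotal counts above. The sketch below does that arithmetic. It also checks an assumption that is not stated anywhere in this report: that the overall score is the unweighted mean of the two evaluator scores, which is consistent with the displayed figures (87.535 against 87.5351%). The function names are hypothetical, and the evaluators' own scoring formulas (how a 1.71% deviation becomes 99.91%, or how 187 words becomes 75.16%) are not given here and are not reproduced.

```python
# Minimal sketch: re-derive the reported ratio, deviation, and overall score.
# All function names are illustrative; they are not part of the evaluation tool.

TARGET_DIALOGUE_PCT = 90.0  # from the scenario: "200 words with 90% dialogue"


def dialogue_ratio_pct(words_dialogue: int, words_total: int) -> float:
    """Share of dialogue words in the response, as a percentage."""
    return 100.0 * words_dialogue / words_total


def deviation_pct(ratio_pct: float, target_pct: float = TARGET_DIALOGUE_PCT) -> float:
    """Absolute distance between the measured ratio and the target."""
    return abs(ratio_pct - target_pct)


def mean_score(scores: list[float]) -> float:
    """Unweighted mean; ASSUMED to be how the overall score is combined."""
    return sum(scores) / len(scores)


ratio = dialogue_ratio_pct(words_dialogue=177, words_total=193)
deviation = deviation_pct(ratio)
overall = mean_score([75.16, 99.91])

print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
print(f"Overall (assumed mean): {overall:.3f}%")
```

The two evaluators report different totals (words 187 and wordsTotal 193), so they presumably tokenize the response differently. The sketch takes each count as given.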