Run: 2026-02-16-01KHK2HDYEK7T35B4CR07Z4SFR

Run Details

Model:: Qwen 3.5 397B A17B
Test:: Dialogue tags
Scenario:: Write 200 words with 90% dialogue
Duration:: 5m 25s
Cost:: $0.048700
Provider:: Alibaba

Evaluation summary

Score	Evaluator	Details Link
100.00%	Matches word count	(details)
99.99%	Dialogue to Total Word Ratio	(details)
99.9940%	Overall

Model input

User/Human

You are a author of comedic heist books.

Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents.

Only include the prose of the scene without any markdown or commentary.

Model response

"Wait, they're six," Bill said, peering through binoculars.
"Exactly. Zero security awareness," Molly replied, checking set.
"They scream if you touch their balloons."
"We wear clown suits. Camouflage. Perfect."
"I'm not wearing a nose, Mol. It chafes."
"Fine. Just the wig. And the shoes."
"What about the parents? They watch everything."
"Drunk on punch. Spiked apple juice. Trust me."
"This feels wrong. Literally infants."
"It's LEGO, Bill. Pure liquid asset. Untraceable bricks."
"There's a bouncy castle. Big Hazard."
"Distraction. You jump, I grab. Simple mechanics."
"I hate bouncing. My knees are shot."
"Think of the retirement fund. Need equity."
"Stolen Play-Doh isn't a pension plan, Molly."
"It's the limited edition set. Rare value."
"Why are we like this?"
"Because rent is due. Landlord is impatient."
"Plan B? If the clowns fail."
"Magic show. Saw a rabbit in a hat."
"We're stealing a rabbit? That's animal trafficking."
"No, we steal the gifts while they watch the rabbit."
"That's clever. Disgusting, but clever."
"I know. Grab the bag. Check straps."
"Do we have to sing Happy Birthday? I really deaf."
"Only if they ask. Stay silent."
"Ready?" Bill asked, zipping up duffel.
"Born ready," Molly said, pulling greasepaint. "Let's rob some toddlers."

Evaluation details

Result	Evaluator	Details	Meta Data
100.00%	Matches word count	n/a	neededClean: false; words: 200
99.99%	Dialogue to Total Word Ratio	Ratio: 91.04%, Deviation: 1.04%	neededClean: false; wordsTotal: 201; wordsDialogue: 183
99.9940%	Overall
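
The Ratio and Deviation figures reported for the Dialogue to Total Word Ratio evaluator can be reproduced from the wordsDialogue and wordsTotal counts. The sketch below is illustrative only. It assumes the deviation is the absolute difference between the measured ratio and the 90% target from the scenario. The function names and the quote-based word counter are hypothetical, not the evaluator's actual code, and the mapping from deviation to the 99.99% score is not stated in this report, so it is not modelled here.

```python
import re

# Target share of dialogue words, taken from the scenario ("90% dialogue").
TARGET_RATIO = 0.90


def dialogue_stats(words_dialogue, words_total, target=TARGET_RATIO):
    """Return (ratio, deviation) for the given word counts.

    Assumption: deviation is the absolute distance from the target ratio.
    """
    ratio = words_dialogue / words_total
    deviation = abs(ratio - target)
    return ratio, deviation


def count_words(text):
    """Hypothetical counter: whitespace-separated tokens, with dialogue
    defined as anything between straight double quotes. The evaluator's
    real tokenizer may differ (the two evaluators report 200 and 201 words).
    """
    total = len(text.split())
    spoken = sum(len(m.split()) for m in re.findall(r'"([^"]*)"', text))
    return spoken, total


# Counts copied from the report's metadata.
ratio, deviation = dialogue_stats(183, 201)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
```

With the reported counts (183 dialogue words out of 201), this gives a ratio of about 91.04% and a deviation of about 1.04%, matching the Details column.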