Run 2026-02-16-01KHK9WTJXQNBM6EZG2DHC20E7

Run Details

Model:: GPT-5
Test:: Dialogue tags
Scenario:: Write 200 words with 90% dialogue
Duration:: 1m 40s
Cost:: $0.068043
Provider:: OpenAI

Model input

User/Human

You are a author of comedic heist books.

Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents.

Only include the prose of the scene without any markdown or commentary.

Model response

AI

They huddled over a glittery party flyer; confetti drifted lazily while, somewhere nearby, a piñata squeaked with ominous cheer tonight.
Molly: "All right, Bill, objective: liberate the loot."
Bill: "I brought empathy, a kazoo, and glitter."
Molly: "Kazoo?"
Bill: "Diversion. Children hear jazz; adults hear defeat."
Molly: "Codenames. You're Sir Wrappington."
Bill: "Then you're Lady Bows-a-Lot."
Molly: "We enter as volunteer gift-organizers, clipboards blazing."
Bill: "Competence is my sneakiest costume."
Molly: "We'll stack presents on the Wagon of Plausibility."
Bill: "Does it squeak adorably?"
Molly: "Like a hamster doing opera."
Bill: "Contingency: if challenged, we audit sparkle levels."
Molly: "And shuffle papers knowingly."
Bill: "If chased, I deploy kazoo fugue."
Molly: "Shout, 'Conga line compliance check!'"
Bill: "Spell 'conga'?"
Molly: "Incorrectly. Confidence outruns spelling."
Bill: "Ethics inventory: are we monsters?"
Molly: "Borrowers. We return the joy as confetti."
Bill: "Post-heist snack protocol?"
Molly: "Cupcakes, seized under the Friendship Clause."
Bill: "Backup disguise?"
Molly: "Balloon animal inspectors—strict about snakes."
Bill: "I only make snakes."
Molly: "Excellent. Minimalist poodles."
Bill: "Extraction signal?"
Molly: "When the cake knife sings."
Bill: "Knives sing?"
Molly: "In butter, every blade is Pavarotti."
Bill: "Then onward, Lady Bows-a-Lot."
Molly: "With charm, Sir Wrappington—no tears, no traces. Now."

Evaluation details

Result	Evaluator	Details	Meta Data
95.9868%	Matches word count	n/a	neededClean: false, words: 208
1.1749%	Dialogue to Total Word Ratio	Ratio: 75.48%, Deviation: 14.52%	neededClean: false, wordsTotal: 208, wordsDialogue: 157

Overall result:: 48.5808% (consistent with the mean of the two evaluator results above)
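
As a quick check on the figures reported above, the ratio and deviation follow directly from the word counts. This is a minimal sketch, not the evaluator's actual code: the function name is hypothetical, the target is assumed to be the 90% stated in the scenario, and the deviation is assumed to be the absolute difference from that target. The report does not show how the per-evaluator results (95.9868% and 1.1749%) are scored, so the sketch does not try to reproduce them. It only checks that the final 48.5808% figure is consistent with their simple mean.

```python
def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Share of words that are dialogue (hypothetical helper, not the evaluator's API)."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return words_dialogue / words_total


# Values copied from the report's Meta Data column.
WORDS_TOTAL = 208
WORDS_DIALOGUE = 157
TARGET_RATIO = 0.90  # assumption: taken from the scenario "90% dialogue"

ratio = dialogue_ratio(WORDS_DIALOGUE, WORDS_TOTAL)
deviation = abs(TARGET_RATIO - ratio)  # assumption: absolute difference from target

print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 75.48%, Deviation: 14.52%"

# Assumption: the final figure is the unweighted mean of the two evaluator results.
evaluator_results = [95.9868, 1.1749]
overall = sum(evaluator_results) / len(evaluator_results)
print(f"Mean of evaluator results: {overall:.2f}%")
# prints "Mean of evaluator results: 48.58%"
```

The recomputed ratio and deviation match the Details column, which suggests the evaluator counted 157 of the 208 words as dialogue, well short of the 90% requested.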
