Run: 2026-04-25-01KQ2EGV0SHEQGZK9H1EM885QH

Run Details

Model:: DeepSeek V4 Flash (Reasoning)
Test:: Dialogue tags
Scenario:: Write 200 words with 90% dialogue
Duration:: 2m 40s
Cost:: $0.000120
Provider:: DeepInfra

Evaluation summary

Score	Evaluator	Details Link
51.93%	Matches word count	(details)
99.06%	Dialogue to Total Word Ratio	(details)
75.4916%	Overall

Model input

User/Human

You are a author of comedic heist books.

Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents.

Only include the prose of the scene without any markdown or commentary.

Model response

“Okay, so we waltz in, you distract the birthday kid, I grab the haul.” Molly adjusted her fake glasses. “Waltz? Molly, we’re crashing a six-year-old’s party, not a gala.” Bill tugged at his clown wig. “And you’re dressed like a librarian. Subtle.” “Librarians are invisible, Bill. No one questions a woman holding a clipboard. What’s your plan? Honk your nose and ask for a balloon animal?” “That’s genius. I’ll ask for a poodle. While the clown’s twisting, you vacuum the gift table.” “Vacuum? I’m not hauling a vacuum cleaner.” “Figure of speech! Grab the wrapping paper. Jeez. You think the parents will notice if we switch the Nintendo Switch for a box of crayons?” “They’ll notice when the kid cries. We need a decoy gift.” “Like what? A rock?” “Better. A signed photo of me.” “You don’t have a signed photo of you.” “I’ll print one. ‘To Timmy, keep dreaming – Molly the Gift Fairy.’ They’ll be so confused they’ll forget the presents.” Bill honked his nose. “I hate this plan.” “You said that about the bank job. And the museum heist. And that time we stole a prizewinning pig.” “That pig was delicious. Fine. We go in. You distract. I steal. But if I end up doing the chicken dance, you’re buying the next round.” “Deal.”

Evaluation details

Result	Evaluator	Details	Meta Data
51.93%	Matches word count	n/a	neededClean: false; words: 216
99.06%	Dialogue to Total Word Ratio	Ratio: 93.12%, Deviation: 3.12%	neededClean: false; wordsTotal: 218; wordsDialogue: 203
75.4916%	Overall
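
The ratio and deviation reported above follow from the metadata counts: 203 dialogue words out of 218 total, compared against the 90% target in the scenario. The sketch below reproduces that arithmetic and shows one possible way to count dialogue words. The `dialogue_ratio` function, its curly-quote regex, and whitespace tokenization are assumptions for illustration, not the evaluator's actual implementation. The two evaluators report different totals (216 and 218 words), so they presumably tokenize differently, and this sketch may not match either count exactly. How the 99.06% and 51.93% scores are derived from these figures is not stated in the report and is not modeled here.

```python
import re

TARGET_RATIO = 0.90  # from the scenario: "90% dialogue"


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio).

    Hypothetical counting rule: a word is any whitespace-separated token,
    and dialogue is any text enclosed in curly double quotes.
    """
    total = len(text.split())
    spoken = re.findall(r"“([^”]*)”", text)
    dialogue = sum(len(segment.split()) for segment in spoken)
    return dialogue, total, (dialogue / total if total else 0.0)


# Reproduce the reported figures from the metadata counts.
words_dialogue, words_total = 203, 218
ratio = words_dialogue / words_total
deviation = ratio - TARGET_RATIO
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 93.12%, Deviation: 3.12%"

# Small usage example on an excerpt of the model response.
sample = "“We go in.” Bill honked his nose. “I hate this plan.”"
print(dialogue_ratio(sample))
```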