Run: 2026-05-01-01KQHRKB92D1997EW37YHXC7D5

Run Details

Model: Grok 4.20
Test: Dialogue tags
Scenario: Write 200 words with 90% dialogue
Duration: 4.9s
Cost: $0.000753
Provider: xAI

Evaluation summary

Score	Evaluator	Details Link
95.99%	Matches word count	(details)
36.79%	Dialogue to Total Word Ratio	(details)
66.3873%	Overall
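
The overall figure is consistent with an unweighted mean of the two evaluator scores. The sketch below only checks that arithmetic; that the harness actually averages the scores is an assumption, and the variable names are illustrative.

```python
# Per-evaluator scores as reported in the summary table (percent, rounded to 2 decimals).
scores = {
    "Matches word count": 95.99,
    "Dialogue to Total Word Ratio": 36.79,
}
reported_overall = 66.3873  # percent, as reported

# Assumption: the overall score is the unweighted mean of the evaluator scores.
mean_score = sum(scores.values()) / len(scores)

# Each input is rounded to two decimals, so the mean can be off by up to 0.005.
matches = abs(mean_score - reported_overall) < 0.005

print(f"mean of evaluator scores: {mean_score:.2f}%")
print(f"consistent with reported overall: {matches}")
```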

Model input

User/Human

You are a author of comedic heist books.

Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents.

Only include the prose of the scene without any markdown or commentary.

Model response

“Bill, you’re joking. A kiddie party? We’re not stealing candy from babies.”

“It’s not candy, Molly—it’s presents. Twenty screaming moppets, one giant pile of wrapped glory. We’ll be in and out before the piñata hits the floor.”

“You want us to crash a bounce house in ski masks? That’s not a heist, that’s a felony with frosting.”

“No masks. We’ll wear the giant mascot costumes from the supply closet. I’m the dancing dinosaur, you’re the depressed unicorn. Kids’ll think we’re hired entertainment.”

“I refuse to be the depressed unicorn, Bill.”

“Fine, you’re the sparkly T-Rex. Look, we drop in during musical chairs, I cut the lights, you wheel the gift table out like it’s on casters. We’ll have it in the van before they finish ‘Happy Birthday.’”

“What about the parents? They’re not exactly blind.”

“Parents are glued to their phones filming little Timmy. By the time they look up we’ll be two blocks away arguing over who gets the Lego Star Destroyer.”

“I still say this is the dumbest idea you’ve ever had.”

“Dumber than the time we tried to rob the police auction?”

“…Start the van, you giant lizard.”

Evaluation details

Result	Evaluator	Details	Meta Data
95.99%	Matches word count	n/a	neededClean: false, words: 192
36.79%	Dialogue to Total Word Ratio	Ratio: 100.00%, Deviation: 10.00%	neededClean: false, wordsTotal: 192, wordsDialogue: 192
66.3873%	Overall
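
The reported ratio (192 dialogue words out of 192 total) and the 10.00% deviation from the 90% target can be illustrated with a minimal sketch. This is not the evaluator's implementation: the function name, the whitespace tokenization, and the quote-matching rule are all assumptions, and a different tokenizer may count words differently from the harness. How the deviation maps to the 36.79% score is not stated in the report and is not modeled here.

```python
import re


def dialogue_ratio(text: str, target: float = 0.90) -> dict:
    """Hypothetical helper: count words inside double quotes versus all words.

    Assumes words are whitespace-separated and dialogue is enclosed in
    curly (“ ”) or straight (") double quotes.
    """
    words_total = len(text.split())

    # Capture the contents of each curly-quoted or straight-quoted span.
    spans = re.findall(r'“([^”]*)”|"([^"]*)"', text)
    words_dialogue = sum(len((curly or straight).split()) for curly, straight in spans)

    ratio = words_dialogue / words_total if words_total else 0.0
    return {
        "wordsTotal": words_total,
        "wordsDialogue": words_dialogue,
        "ratio": ratio,
        # Absolute distance from the target share of dialogue words.
        "deviation": abs(ratio - target),
    }


# A scene made only of quoted lines has ratio 1.0, so it overshoots a 0.90 target by 0.10.
all_dialogue = "“Start the van.”\n\n“Fine, you giant lizard.”"
result = dialogue_ratio(all_dialogue)

# A line with a dialogue tag lowers the ratio, because the tag counts as narration.
tagged = "“Start the van,” said Molly. “Now.”"
tagged_result = dialogue_ratio(tagged)
```

Under these assumptions, a response that contains no narration at all, like the one in this run, lands at a 100% ratio and therefore misses a 90% target by 10 percentage points, even though every word is dialogue.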