Run: 2026-04-24-01KQ0AZMWJXCMFN6QJQXDG3ZYD

Run Details

Model:: GPT-OSS 120B
Test:: Dialogue tags
Scenario:: Write 200 words with 50% dialogue
Duration:: 6m 21s
Cost:: $0.001405
Provider:: Novita

Evaluation summary

Score	Evaluator	Details Link
100.00%	Matches word count	(details)
100.00%	Dialogue to Total Word Ratio	(details)
100.0000%	Overall

Model input

User/Human

You are a author of comedic heist books.

Please write a 200 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents.

Only include the prose of the scene without any markdown or commentary.

Model response

The sun had barely risen, but Molly and Bill were already crouched behind the inflatable castle, whispering like squirrels on a caffeine binge. "We need the big red balloon to distract the kids, while you sneak to the gift table." Molly giggled, eyeing the towering cake. "I'll use the clown nose as a distraction device, then grab the shiny toys." The plan sounded as coherent as a jellybean philosophy lecture. "Don't forget the pie —if they see it, they'll think it's free dessert and leave the room." Molly tapped the clipboard, pretending to be a party organizer. "I'll announce a balloon pop contest, draw the kids outside, you sneak in, grab everything, including the puppet show props." Bill smirked, polishing his goofball mask. "What about the kid nap corner? We could hide there, pretend to be lost clowns, then make a grand exit with the loot." They both laughed, the sound echoing like squeaky shoes on a marble floor. "Ready to waltz through balloons and steal the slime?" As sunrise painted the sky with pastel candy colors, Molly and Bill slipped through the party tent, pockets bulging, while confused children chased a runaway clown nose, oblivious to the perfect heist.

Evaluation details

Result	Evaluator	Details	Meta Data
100.00%	Matches word count	n/a	neededClean: false, words: 200
100.00%	Dialogue to Total Word Ratio	Ratio: 50.00%, Deviation: 0.00%	neededClean: false, wordsTotal: 200, wordsDialogue: 100

100.0000%
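
The word-count and dialogue-ratio figures above can be approximated with a short script. This is only a sketch under stated assumptions, not the evaluator's actual implementation. It assumes words are whitespace-separated tokens and that dialogue is any text enclosed in straight double quotes. The function names `count_words` and `dialogue_ratio` are hypothetical and do not come from the benchmark.

```python
import re


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (assumed tokenization)."""
    return len(text.split())


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (wordsTotal, wordsDialogue, ratio) for a scene.

    Assumption: dialogue is whatever sits between straight double quotes.
    Curly quotes and unclosed quotes are not handled in this sketch.
    """
    quoted_spans = re.findall(r'"([^"]*)"', text)
    words_dialogue = sum(count_words(span) for span in quoted_spans)
    words_total = count_words(text)
    ratio = words_dialogue / words_total if words_total else 0.0
    return words_total, words_dialogue, ratio


if __name__ == "__main__":
    sample = 'Bill sighed. "Take the cake." Molly nodded. "Fine."'
    total, dialogue, ratio = dialogue_ratio(sample)
    deviation = abs(ratio - 0.5)  # distance from the 50% target
    print(f"wordsTotal={total} wordsDialogue={dialogue} "
          f"ratio={ratio:.2%} deviation={deviation:.2%}")
```

A different tokenizer or quote-handling rule could yield different counts for the same text, so results from this sketch may not match the reported metadata exactly.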