Run: 2024-09-20-01J8819N4MMX2H2BW2NQV1TV73

Run Details

Model: Qwen 2.5 72B
Test: Dialogue tags
Scenario: Write 200 words with 90% dialogue
Duration: 12.4s
Cost: $0.000152
Provider: Hyperbolic

Evaluation summary

Score	Evaluator
100.00%	Matches word count
100.00%	Dialogue to Total Word Ratio
99.9994%	Overall

Model input

User/Human

You are a author of comedic heist books.

Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents.

Only include the prose of the scene without any markdown or commentary.

Model response

"Bill, I'm telling you, this is the perfect gig," Molly whispered, eyes gleaming with mischief. "A kid’s party loaded with unsupervised presents? It’s like a gift from the universe."

Bill leaned back, crossing his arms. "And what, exactly, are we supposed to do with a bunch of toys and candy, Molly?"

"Resell them, genius. There’s a market for everything, even plastic dinosaurs and lollipops."

Bill shook his head. "That's a bit low, even for us. Kids are involved, remember?"

"Bill, we’re not hurting anyone. We’re just relocating the goods. Think of it as a redistribution of wealth."

"Redistribution, huh? What’s our cover story if someone asks why we’re in a bouncy castle?"

"We’ll say we’re clowns. Professional clowns hired to entertain the kids. No one will question a guy in a fake nose."

"Right, because I look great in a polka-dotted jumpsuit."

"Bill, this is easy money. And think about it—no security cameras, no high-tech alarms. Just a bunch of sugared-up kids who won’t remember a thing after the cake."

Bill sighed, reluctantly. "Fine. But if I get arrested for stealing Toys 'R' Us rejects, I’m blaming you."

Molly grinned. "Deal. Now, let’s go find some face paint."

Evaluation details

Result	Evaluator	Details	Meta Data
100.00%	Matches word count	n/a	neededClean: false, words: 199
100.00%	Dialogue to Total Word Ratio	Ratio: 89.60%, Deviation: 0.40%	neededClean: false, wordsTotal: 202, wordsDialogue: 181
99.9994%	Overall

id	"gen-TMPXkvqQDGcw7pOythot5lvsVG1i"
total_cost	0.000152
upstream_id	"chat-375bca55b4d54a4487ddac014bb33f21"
created_at	"2024-09-20T15:37:46.424757+00:00"
model	"qwen/qwen-2.5-72b-instruct"
app_id	182717
streamed	true
cancelled	false
provider_name	"Hyperbolic"
latency	288
moderation_latency	(null)
generation_time	11932
finish_reason	"stop"
tokens_prompt	78
tokens_completion	295
native_tokens_prompt	83
native_tokens_completion	297
native_tokens_reasoning	(null)
num_media_prompt	(null)
num_media_completion	(null)
origin	"https://ncbench.com/"
usage	0.000152
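
The Dialogue to Total Word Ratio figures reported above (wordsDialogue 181, wordsTotal 202, Ratio 89.60%, Deviation 0.40%) can be reproduced with simple arithmetic. The sketch below is only an illustration, not the benchmark's actual evaluator: the function names `dialogue_ratio` and `deviation_from_target` are hypothetical, the 90% target is taken from the scenario description, and the word counter assumes whitespace-separated words with dialogue enclosed in straight double quotes. The real evaluator's tokenization and scoring rules are not shown in this report and may differ.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio).

    Assumption: words are whitespace-separated and dialogue is whatever
    sits between pairs of straight double quotes.
    """
    total = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue / total if total else 0.0
    return dialogue, total, ratio


def deviation_from_target(ratio: float, target: float = 0.90) -> float:
    """Absolute distance between the measured ratio and the target ratio."""
    return abs(ratio - target)


# Recompute the reported figures from the counts in the Meta Data column.
reported_ratio = 181 / 202
print(
    f"Ratio: {reported_ratio:.2%}, "
    f"Deviation: {deviation_from_target(reported_ratio):.2%}"
)
# prints "Ratio: 89.60%, Deviation: 0.40%"
```

Running `dialogue_ratio` on the model response itself will not necessarily return 181 and 202, because the evaluator's word-splitting rules are unknown; the function is included only to show one plausible way such counts could be derived.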