Run: 2024-09-20-01J881RQKKAPCDCYKA81J0AF0S

Run Details

Model: Qwen 2.5 72B
Test: Dialogue tags
Scenario: Write 200 words with 50% dialogue
Duration: 6.2s
Cost: $0.000140
Provider: Hyperbolic

Evaluation summary

Score	Evaluator	Details Link
51.93%	Matches word count	(details)
0.00%	Dialogue to Total Word Ratio	(details)
25.9628%	Overall

Model input

User/Human

You are a author of comedic heist books.

Please write a 200 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents.

Only include the prose of the scene without any markdown or commentary.

Model response

Molly leaned in, her eyes narrow, as she laid out the plan. "Okay, Bill, listen up. We’re hitting the kid’s party at 3 p.m. sharp. The parents are clueless and the kids are distracted. We’ll have the presents stashed in the van before you can say 'birthday cake.'"

Bill frowned, scratching his chin. "And how exactly are we getting past the piñata?"

"Piñata? Who said anything about a piñata? We’re talking about a room full of unguarded loot, Bill. Easy pickings."

"But what if a kid sees us? They’ll start crying, and then the parents—"

Molly rolled her eyes. "We’ll be in and out before they can even scream. Besides, I’ve got the perfect disguise. Check this out." She pulled out a sparkly fairy costume and a pirate hat. "I’ll be the fairy, you be the pirate. We blend in, grab the swag, and nobody’s the wiser."

Bill looked doubtful. "A fairy and a pirate? At a kid’s party? That’s the best you can come up with?"

"It’s perfect, Bill. Trust me. This is our big score. Now, let’s go get our swag on."

Evaluation details

Result	Evaluator	Details	Meta Data
51.93%	Matches word count	n/a	neededClean: false; words: 184
0.00%	Dialogue to Total Word Ratio	Ratio: 80.98%, Deviation: 30.98%	neededClean: false; wordsTotal: 184; wordsDialogue: 149
25.9628%	Overall
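
The ratio and deviation figures in the evaluation details can be reproduced from the reported word counts. The following is a minimal sketch, assuming the deviation is measured against the 50% target from the scenario and that the overall score is a plain average of the two evaluator scores. The function names are hypothetical, and the benchmark's actual scoring code is not shown in this report.

```python
def dialogue_ratio(words_total: int, words_dialogue: int, target: float = 0.5):
    """Return (ratio, deviation) of dialogue words against a target ratio.

    The 0.5 target is taken from the scenario ("50% dialogue"); treating
    deviation as the absolute difference from it is an assumption.
    """
    ratio = words_dialogue / words_total
    deviation = abs(ratio - target)
    return ratio, deviation


def overall_score(scores):
    """Unweighted mean of evaluator scores (assumed aggregation)."""
    return sum(scores) / len(scores)


# Word counts reported in the evaluation details.
ratio, deviation = dialogue_ratio(words_total=184, words_dialogue=149)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# An unrounded word-count score of about 51.9256 would average with 0.00
# to the reported overall figure; 51.9256 is inferred, not reported.
print(f"Overall: {overall_score([51.9256, 0.0]):.4f}%")
```

Under these assumptions the first line prints the same ratio and deviation as the Details column, and the averaged figure is consistent with the reported overall score of 25.9628%.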

id	"gen-HVuQOFeGv6rciGvhr1RmVO4GhShQ"
total_cost	0.00014
upstream_id	"chat-2f21849597a84c91bcd937603496f461"
created_at	"2024-09-20T15:45:54.19462+00:00"
model	"qwen/qwen-2.5-72b-instruct"
app_id	182717
streamed	true
cancelled	false
provider_name	"Hyperbolic"
latency	119
moderation_latency	(null)
generation_time	5847
finish_reason	"stop"
tokens_prompt	78
tokens_completion	265
native_tokens_prompt	83
native_tokens_completion	267
native_tokens_reasoning	(null)
num_media_prompt	(null)
num_media_completion	(null)
origin	"https://ncbench.com/"
usage	0.00014
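
As a rough consistency check, the generation metadata above can be turned into a blended price per million tokens and a completion throughput. This is only a sketch: it assumes `generation_time` is in milliseconds and that `usage` covers the native prompt and completion tokens together. The derived figures are arithmetic on the reported values, not stated pricing from the provider.

```python
# Values copied from the generation metadata.
usage_usd = 0.00014
native_tokens_prompt = 83
native_tokens_completion = 267
generation_time_ms = 5847  # assumed to be milliseconds

total_native_tokens = native_tokens_prompt + native_tokens_completion

# Blended cost per one million native tokens (prompt + completion).
usd_per_million_tokens = usage_usd / total_native_tokens * 1_000_000

# Completion tokens produced per second of generation time.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(f"Total native tokens: {total_native_tokens}")
print(f"Blended cost: ${usd_per_million_tokens:.2f} per 1M tokens")
print(f"Throughput: {tokens_per_second:.1f} completion tokens/s")
```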