Run: 2024-07-23-01J3G80KGQM95TZS5FY1VH3M7Y

Run Details

Model: Llama 3.1 70B
Test: Dialogue tags
Scenario: Write 200 words with 90% dialogue
Duration: 3.3s
Cost: $0.000313
Provider: Fireworks

Evaluation summary

Score	Evaluator
97.63%	Matches word count
36.79%	Dialogue to Total Word Ratio
67.2078%	Overall
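
The Overall figure is consistent with a plain average of the two evaluator scores. The sketch below checks that arithmetic using the rounded scores from the summary; the unweighted-mean rule is an assumption inferred from the numbers, not something the report states.

```python
# Scores as reported in the evaluation summary (percent, rounded).
scores = {
    "Matches word count": 97.63,
    "Dialogue to Total Word Ratio": 36.79,
}

# Assumption: Overall is the unweighted mean of the evaluator scores.
overall = sum(scores.values()) / len(scores)
print(f"{overall:.2f}%")
```

The result agrees with the reported 67.2078% to two decimal places; the small remainder is expected because the inputs here are already rounded.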

Model input

User/Human

You are a author of comedic heist books.

Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents.

Only include the prose of the scene without any markdown or commentary.

Model response

"I'm telling you, Bill, this is a solid plan. A kids' party, no security to speak of, and a room full of sugar-high obliviousness."

"Obviousness, not obliviousness, Molly. And I'm not sure I'm comfortable with this. It's a bunch of kids' birthday presents. It's not exactly the score of a lifetime."

"Come on, it's a score, nonetheless. And have you seen the sizes of those gift boxes? Those kids are probably getting iPads and bicycles. We can make a killing on eBay."

"But what if we get caught? I don't want to spend my twenty-fifth birthday in a kiddie cellblock."

"We won't get caught. We'll be in and out in ten minutes. I've got a disguise for us – who's going to suspect the piñata and the clown?"

"Aren't you the one who always says, 'Don't dress as a clown, you're just asking for trouble'?"

"Desperate times, Bill. Desperate times. Besides, I've got a great idea for a squirt gun and a whoopee cushion."

"You're not taking this seriously, are you?"

"I'm taking it very seriously. I'm taking thirty-five percent of the profits, and you're getting the rest. Now, are you in or out?"

Evaluation details

Result	Evaluator	Details	Meta Data
97.63%	Matches word count	n/a	neededClean: false, words: 193
36.79%	Dialogue to Total Word Ratio	Ratio: 100.00%, Deviation: 10.00%	neededClean: false, wordsTotal: 196, wordsDialogue: 196
67.2078%	Overall

id	"gen-chB8paFzGISv7ktcmBKxLclBjMud"
total_cost	0.0003132
upstream_id	"5aae525c-bc1e-4c74-bd2c-639780c1f1e6"
created_at	"2024-07-23T16:50:26.95566+00:00"
model	"meta-llama/llama-3.1-70b-instruct"
app_id	182717
streamed	false
cancelled	false
provider_name	"Fireworks"
latency	3103
moderation_latency	(null)
generation_time	0
finish_reason	"stop"
tokens_prompt	102
tokens_completion	276
native_tokens_prompt	70
native_tokens_completion	278
num_media_prompt	(null)
num_media_completion	(null)
origin	"https://ncbench.com/"
usage	0.0003132
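
For readers who want to reproduce a dialogue-to-total word ratio like the one reported above, here is a minimal sketch. It assumes dialogue means text between straight double quotes and that words are whitespace-separated tokens; the function name `dialogue_ratio` is hypothetical, and the benchmark's own evaluator may tokenize and clean the text differently.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for straight-quoted dialogue."""
    # Assumption: a word is any whitespace-separated token.
    total_words = len(text.split())
    # Assumption: text between a pair of straight double quotes is dialogue.
    quoted = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(span.split()) for span in quoted)
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


sample = '"Are you in or out?" Molly asked.'
print(dialogue_ratio(sample))
```

A response written entirely inside quotation marks, like the one in this run, yields a ratio of 1.0 under these assumptions, which is in line with the reported Ratio of 100.00% against the 90% target.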