Run: 2024-07-23-01J3G814ECG2TJA71KQW7DKSJN

Run Details

Model:: Llama 3.1 70B
Test:: Dialogue tags
Scenario:: Write 200 words with 90% dialogue
Duration:: 4.0s
Cost:: $0.000341
Provider:: Fireworks

Evaluation summary

Score	Evaluator
93.65%	Matches word count
36.79%	Dialogue to Total Word Ratio
65.2188%	Overall
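
The overall figure appears consistent with an unweighted mean of the two evaluator scores. The report does not state the aggregation rule, so the sketch below is an assumption, and the function name `overall` is hypothetical. It uses the rounded percentages shown in the table, so it can only match the reported 65.2188% to about two decimal places.

```python
# Evaluator scores in percent, copied from the summary table.
scores = {
    "Matches word count": 93.65,
    "Dialogue to Total Word Ratio": 36.79,
}

def overall(scores_pct):
    """Unweighted mean of evaluator scores (assumed aggregation rule)."""
    return sum(scores_pct.values()) / len(scores_pct)

overall_pct = overall(scores)
reported_pct = 65.2188  # value shown in the report

# The inputs are rounded to two decimals, so compare with a small tolerance.
print(f"computed {overall_pct:.2f}%, reported {reported_pct}%")
print("consistent:", abs(overall_pct - reported_pct) < 0.01)
```

If the benchmark weighted evaluators differently, the two numbers would not agree this closely, but a match on one run is not proof of the rule.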

Model input

User/Human

You are a author of comedic heist books.

Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents.

Only include the prose of the scene without any markdown or commentary.

Model response

"I'm telling you, Bill, it's a goldmine. A bunch of sugar-fueled kids running around, distracted by a guy in a furry suit. No one will even notice us."

"Are you insane? We're thieves, Molly, not monsters. We don't steal from kids on their birthdays."

"Oh, come on. It's not like we're stealing their innocence. Just some overpriced toys and useless trinkets. They'll get more this year, I'm sure."

"I don't know, Molly. I've got a soft spot for kids. I was one once, remember?"

"Aww, Bill, you're not going to start telling me about your rough childhood and how it's made you sympathize with our victims, are you?"

"Hey, my childhood was not that rough. I just...I don't know, I guess I'm just getting a little old for this sort of thing. What's the take, anyway?"

"Oh, I'd say at least a few thousand. And it's not like anyone's going to be checking the guest list or anything. We just walk in, look like a couple of clueless parents, and...voilà."

"Voilà? Really? You've been watching too much Ocean's Eleven."

"Hey, that's a great movie. And besides, it's a legitimate way to gather inspiration for our craft."

"Uh-huh. Right. Because nothing says 'craft' like a bunch of thieves in suits."
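
Every line of the response above sits inside quotation marks, which is why a dialogue-ratio check would see no narration at all. The following is a minimal sketch of one way such a ratio could be measured. It assumes straight double quotes and whitespace-delimited words; the function name `dialogue_ratio` is hypothetical, and the benchmark's actual tokenization is not documented here, so its counts may differ.

```python
import re

def dialogue_ratio(text):
    """Return (dialogue_words, total_words, ratio) for straight-quoted dialogue."""
    # Assumption: a "word" is any whitespace-delimited token.
    total_words = len(text.split())
    # Everything between a pair of straight double quotes counts as dialogue.
    spoken = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(segment.split()) for segment in spoken)
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio

# Mixed dialogue and narration: 7 of 9 words are spoken.
sample = '"Voilà? Really?" Bill sighed. "You watch too many movies."'
print(dialogue_ratio(sample))

# Dialogue only: the ratio is 1.0, far from a 90% target.
print(dialogue_ratio('"Hi, Bill." "Hi, Molly."'))
```

A scene with no narration or dialogue tags would score 100% under this measure, overshooting a 90% target by ten points.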

Evaluation details

Result	Evaluator	Details	Meta Data
93.65%	Matches word count	n/a	neededClean: false; words: 209
36.79%	Dialogue to Total Word Ratio	Ratio: 100.00%, Deviation: 10.00%	neededClean: false; wordsTotal: 211; wordsDialogue: 211
65.2188%	Overall

id	"gen-v3aOpt1D4cLLOd5TSK1OBPJV92UD"
total_cost	0.0003411
upstream_id	"1e31df48-7ba0-461e-b960-fb28b6e45b9b"
created_at	"2024-07-23T16:50:44.991201+00:00"
model	"meta-llama/llama-3.1-70b-instruct"
app_id	182717
streamed	false
cancelled	false
provider_name	"Fireworks"
latency	3894
moderation_latency	(null)
generation_time	0
finish_reason	"stop"
tokens_prompt	102
tokens_completion	307
native_tokens_prompt	70
native_tokens_completion	309
num_media_prompt	(null)
num_media_completion	(null)
origin	"https://ncbench.com/"
usage	0.0003411
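
As a sanity check on the metadata above, the reported cost is consistent with a flat price applied to the native token counts. The rate of $0.90 per million tokens used below is inferred from the numbers in this run, not stated anywhere in the report, so treat it as an assumption. The function name `estimate_cost` is hypothetical.

```python
# Values copied from the run metadata.
native_tokens_prompt = 70
native_tokens_completion = 309
reported_cost = 0.0003411  # USD

# Assumption: one flat rate for prompt and completion tokens,
# in USD per million native tokens (inferred, not documented here).
ASSUMED_RATE_PER_MILLION = 0.90

def estimate_cost(prompt_tokens, completion_tokens, rate_per_million):
    """Estimate cost in USD from token counts and a flat per-million rate."""
    return (prompt_tokens + completion_tokens) * rate_per_million / 1_000_000

estimated = estimate_cost(
    native_tokens_prompt, native_tokens_completion, ASSUMED_RATE_PER_MILLION
)
print(f"estimated ${estimated:.7f}, reported ${reported_cost:.7f}")
```

The same rate applied to the non-native counts (102 and 307 tokens) gives a different total, which suggests billing follows the native counts, though a single run cannot confirm that.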