Run: 2024-07-23-01J3G7ZDF3YGM3Y43AD3Z1W56B

Run Details

Model: Llama 3.1 8B
Test: Dialogue tags
Scenario: Write 200 words with 50% dialogue
Duration: 2.2s
Cost: $0.000078
Provider: Fireworks

Evaluation summary

Score	Evaluator
0.00%	Matches word count
0.00%	Dialogue to Total Word Ratio
0.0000%	Overall

Model input

User/Human

You are a author of comedic heist books.

Please write a 200 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents.

Only include the prose of the scene without any markdown or commentary.

Model response

Molly gazed out at the mansion's backyard, her eyes fixed on the colorful balloon arch and the banner emblazoned with "Happy Birthday Timmy!" Bill handed her a folder filled with schematics and photographs of the house's interior.

"Okay, this is our best guess," Bill said, unfolding a large blueprint on the hood of the getaway car. "The party's in full swing, but we need to move in under their radar. Here's where the presents are stored."

Molly's eyes scanned the blueprint, her brow furrowed in concentration. "You got the alarm system in here. We should be able to disable it with the bug I provided."

"Right, the earpiece should give us real-time feedback. Just touch the sensors and they'll think it's a kid playing with their toys."

"Presents are in a room with two floors of mirrors on it. If we time it just right, we can swing in with the sprinklers turned on, and they'll all be too distracted to notice us."

Bill grinned. "I love your plan, but have you thought about the toddlers? Some of them have dad-drenched diapers, and trust me, that's the same sound as machinery buzzing through metal detectors."

Molly snorted. "Well, we'll just have to get over our squeamish parts of the job. That's the beauty of a good bribe: 'Keep an eye out for fidgeters while we snag the swag, and I'll make sure the piñata gets fluffed while we escape'."

Evaluation details

Result	Evaluator	Details	Meta Data
0.00%	Matches word count	n/a	neededClean: false; words: 239
0.00%	Dialogue to Total Word Ratio	Ratio: 74.69%, Deviation: 24.69%	neededClean: false; wordsTotal: 241; wordsDialogue: 180
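
The reported ratio and deviation follow directly from the two word counts: 180 dialogue words out of 241 total, measured against the 50% target in the scenario. The sketch below reproduces that arithmetic. The counting helpers are assumptions for illustration only (whitespace-separated words, dialogue taken as text inside straight double quotes); the evaluator's actual tokenization is not documented here, and the word-count evaluator reports 239 words where this one reports 241, so the two evaluators evidently count differently.

```python
import re


def count_words(text: str) -> int:
    """Count whitespace-separated words (assumed counting rule)."""
    return len(text.split())


def count_dialogue_words(text: str) -> int:
    """Count words inside straight double quotes (assumed dialogue rule)."""
    quoted_spans = re.findall(r'"([^"]*)"', text)
    return sum(len(span.split()) for span in quoted_spans)


def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Fraction of all words that are dialogue."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return words_dialogue / words_total


def deviation(ratio: float, target: float = 0.5) -> float:
    """Absolute distance from the target dialogue share."""
    return abs(ratio - target)


# Counts taken from the Meta Data column of this run.
ratio = dialogue_ratio(words_dialogue=180, words_total=241)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation(ratio):.2%}")
# prints "Ratio: 74.69%, Deviation: 24.69%"
```

How the deviation maps to the 0.00% score (for example, a tolerance band around 50%) is not stated in this report, so the sketch stops at the deviation.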

0.0000%

id	"gen-d1od2iOjyZBNijUmWwNvG9JdP2IG"
total_cost	0.0000784
upstream_id	"730c2444-de36-4a11-84ea-49118f36c614"
created_at	"2024-07-23T16:49:46.815384+00:00"
model	"meta-llama/llama-3.1-8b-instruct"
app_id	182717
streamed	false
cancelled	false
provider_name	"Fireworks"
latency	1708
moderation_latency	(null)
generation_time	0
finish_reason	"stop"
tokens_prompt	102
tokens_completion	320
native_tokens_prompt	70
native_tokens_completion	322
num_media_prompt	(null)
num_media_completion	(null)
origin	"https://ncbench.com/"
usage	0.0000784
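
As a sanity check on the cost figures, the metadata above can be related by simple arithmetic. The sketch below assumes, without confirmation from the source, that billing is based on the native token counts and that prompt and completion tokens share one flat rate; under those assumptions the implied rate works out to about $0.20 per million tokens. The variable names are illustrative.

```python
# Values copied from the generation metadata of this run.
native_tokens_prompt = 70
native_tokens_completion = 322
total_cost_usd = 0.0000784

# Assumption: cost is charged per native token at a single flat rate.
native_tokens_total = native_tokens_prompt + native_tokens_completion
implied_usd_per_million = total_cost_usd / native_tokens_total * 1_000_000

print(f"Native tokens billed: {native_tokens_total}")
print(f"Implied rate: ${implied_usd_per_million:.2f} per million tokens")
```

The "Cost" value of $0.000078 under Run Details is the same total_cost, shown with fewer decimal places.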