Run: 2024-07-19-01J35T77G04MVGTV4J3KQH9Q02

Run Details

Model: Claude 3.5 Sonnet
Test: Dialogue tags
Scenario: Write 200 words with 90% dialogue
Duration: 14.5s
Cost: $0.005175
Provider: Anthropic

Evaluation summary

Score	Evaluator
27.17%	Matches word count
36.79%	Dialogue to Total Word Ratio
31.9769%	Overall
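
The overall figure appears to be the unweighted mean of the two evaluator scores. The report does not state the aggregation rule, so the sketch below treats it as an assumption; `overall_score` is a hypothetical helper name.

```python
# Per-evaluator scores as displayed in the summary table (percent).
scores = {
    "Matches word count": 27.17,
    "Dialogue to Total Word Ratio": 36.79,
}


def overall_score(evaluator_scores):
    """Unweighted arithmetic mean of the scores (assumed aggregation rule)."""
    return sum(evaluator_scores.values()) / len(evaluator_scores)


overall = overall_score(scores)
print(f"{overall:.2f}%")
```

The mean of the displayed values is about 31.98%, which is close to the reported 31.9769%. The small gap would be consistent with the report averaging unrounded scores.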

Model input

User/Human

You are a author of comedic heist books.

Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents.

Only include the prose of the scene without any markdown or commentary.

Model response

"Molly, are you sure about this? It's a kid's birthday party, for crying out loud!"

"Bill, don't be such a wimp. Think of all those shiny new toys just waiting for us."

"But... it's a children's party. We'll crush their little hearts!"

"Oh, please. They're kids. They'll get over it. Now, what's our plan?"

"I don't know. Dress up as clowns?"

"Brilliant! They'll never suspect a thing."

"Wait, I was joking. We can't actually-"

"Too late. I'm ordering the costumes now. You'll be Giggles, and I'll be Sparkles."

"Molly, this is insane. What if we get caught?"

"Caught? By who? A bunch of sugar-high six-year-olds?"

"Their parents might be there. And what about the cake?"

"Ooh, good thinking. We'll grab that too."

"That's not what I meant!"

"Bill, focus. We go in, entertain the brats for a bit, then stuff the presents in our oversized pants. Easy peasy."

"I have a bad feeling about this."

"Trust me, it'll be a piece of cake. Pun absolutely intended."

"Fine, but if this goes south, I'm blaming you."

"Deal. Now, let's practice our juggling."

Evaluation details

Result	Evaluator	Details	Meta Data
27.17%	Matches word count	n/a	neededClean: false, words: 181
36.79%	Dialogue to Total Word Ratio	Ratio: 100.00%, Deviation: 10.00%	neededClean: false, wordsTotal: 184, wordsDialogue: 184
31.9769%	Overall
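
The ratio and deviation reported for the dialogue evaluator can be reproduced roughly as follows. This is a minimal sketch, not the benchmark's actual evaluator: it assumes dialogue means text inside straight double quotes, that words are whitespace-separated, and that deviation is the absolute difference from the scenario's 90% target. The names `dialogue_ratio` and `TARGET_RATIO` are hypothetical.

```python
import re

TARGET_RATIO = 0.90  # from the scenario: 90% dialogue


def dialogue_ratio(text):
    """Return (dialogue_words, total_words, ratio) for straight-quoted dialogue."""
    total_words = len(text.split())
    # Capture everything between pairs of straight double quotes.
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


# Two lines taken from the model response.
sample = (
    '"I have a bad feeling about this."\n\n'
    '"Fine, but if this goes south, I\'m blaming you."'
)

dialogue_words, total_words, ratio = dialogue_ratio(sample)
deviation = abs(ratio - TARGET_RATIO)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
```

Because the response contains no narration at all, every word falls inside quotes, so the ratio is 100% and overshoots the 90% target by 10 points. How the evaluator maps that deviation to the 36.79% score is not stated in the report.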

id	"gen-LzD5sgoykYxLoJV0Ae8yNTBXuFar"
total_cost	0.005175
upstream_id	"msg_01MDGZK5snYupdH9vYwhxGDx"
created_at	"2024-07-19T15:37:10.739749+00:00"
model	"anthropic/claude-3.5-sonnet:beta"
app_id	182717
streamed	false
cancelled	false
provider_name	"Anthropic"
latency	14421
moderation_latency	(null)
generation_time	0
finish_reason	"end_turn"
tokens_prompt	68
tokens_completion	284
native_tokens_prompt	75
native_tokens_completion	330
num_media_prompt	(null)
num_media_completion	(null)
origin	"https://ncbench.com/"
usage	0.005175
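
The `total_cost` value is consistent with per-token pricing applied to the native token counts. The sketch below assumes prices of $3 per million input tokens and $15 per million output tokens; these prices are not stated in the report, and `estimate_cost` is a hypothetical helper.

```python
# Assumed prices in USD per million tokens (not stated in the report).
INPUT_PRICE_PER_MTOK = 3.00
OUTPUT_PRICE_PER_MTOK = 15.00


def estimate_cost(prompt_tokens, completion_tokens):
    """Estimate the request cost in USD from native token counts."""
    total = (
        prompt_tokens * INPUT_PRICE_PER_MTOK
        + completion_tokens * OUTPUT_PRICE_PER_MTOK
    )
    return total / 1_000_000


# native_tokens_prompt and native_tokens_completion from the metadata above.
cost = estimate_cost(75, 330)
print(f"${cost:.6f}")
```

Under these assumed prices the estimate comes to $0.005175, the same as the reported `total_cost` and `usage`. This suggests billing used the native token counts (75 and 330) rather than `tokens_prompt` and `tokens_completion` (68 and 284).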