NC Bench
Run: 2024-07-19-01J35T77G04MVGTV4J3KQH9Q02
Run Details
Model: Claude 3.5 Sonnet
Test: Dialogue tags
Scenario: Write 200 words with 90% dialogue
Duration: 14.5s
Cost: $0.005175
Provider: Anthropic
Evaluation summary
Score      Evaluator
27.17%     Matches word count
36.79%     Dialogue to Total Word Ratio
31.9769%   Overall
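The overall figure is consistent with an unweighted mean of the two evaluator scores. The sketch below checks that arithmetic. The averaging rule is an assumption inferred from the numbers on this page, not something the page states, and the inputs are the rounded scores as displayed, so the result only agrees to about two decimal places.

```python
# Rounded evaluator scores as displayed in the summary (percent).
scores = {
    "Matches word count": 27.17,
    "Dialogue to Total Word Ratio": 36.79,
}

# Assumption: the overall score is the plain mean of the evaluator scores.
overall = sum(scores.values()) / len(scores)

print(f"{overall:.2f}%")
```

The small gap from the reported 31.9769% would be explained by the per-evaluator scores being rounded for display.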
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
"Molly, are you sure about this? It's a kid's birthday party, for crying out loud!" "Bill, don't be such a wimp. Think of all those shiny new toys just waiting for us." "But... it's a children's party. We'll crush their little hearts!" "Oh, please. They're kids. They'll get over it. Now, what's our plan?" "I don't know. Dress up as clowns?" "Brilliant! They'll never suspect a thing." "Wait, I was joking. We can't actually-" "Too late. I'm ordering the costumes now. You'll be Giggles, and I'll be Sparkles." "Molly, this is insane. What if we get caught?" "Caught? By who? A bunch of sugar-high six-year-olds?" "Their parents might be there. And what about the cake?" "Ooh, good thinking. We'll grab that too." "That's not what I meant!" "Bill, focus. We go in, entertain the brats for a bit, then stuff the presents in our oversized pants. Easy peasy." "I have a bad feeling about this." "Trust me, it'll be a piece of cake. Pun absolutely intended." "Fine, but if this goes south, I'm blaming you." "Deal. Now, let's practice our juggling."
id: "gen-LzD5sgoykYxLoJV0Ae8yNTBXuFar"
total_cost: 0.005175
upstream_id: "msg_01MDGZK5snYupdH9vYwhxGDx"
created_at: "2024-07-19T15:37:10.739749+00:00"
model: "anthropic/claude-3.5-sonnet:beta"
app_id: 182717
streamed: false
cancelled: false
provider_name: "Anthropic"
latency: 14421
moderation_latency: (null)
generation_time: 0
finish_reason: "end_turn"
tokens_prompt: 68
tokens_completion: 284
native_tokens_prompt: 75
native_tokens_completion: 330
num_media_prompt: (null)
num_media_completion: (null)
origin: "https://ncbench.com/"
usage: 0.005175
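The reported cost can be reproduced from the native token counts. This is a minimal sketch, assuming flat rates of $3 per million prompt tokens and $15 per million completion tokens. Those rates do not appear in the run record; they are an assumption chosen because they match the reported total. The helper name `estimate_cost` is hypothetical.

```python
from decimal import Decimal

# Assumed rates in USD per million tokens; not stated in the run record.
INPUT_USD_PER_MTOK = Decimal("3")
OUTPUT_USD_PER_MTOK = Decimal("15")


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> Decimal:
    """Hypothetical helper: bill prompt and completion tokens at flat rates."""
    million = Decimal(1_000_000)
    total = (prompt_tokens * INPUT_USD_PER_MTOK
             + completion_tokens * OUTPUT_USD_PER_MTOK)
    return total / million


# native_tokens_prompt and native_tokens_completion from the metadata above.
cost = estimate_cost(75, 330)
print(cost)
```

Under these assumed rates the result equals the `total_cost` and `usage` values of 0.005175, which suggests billing is based on the native token counts rather than `tokens_prompt` and `tokens_completion`.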
Evaluation details
Result     Evaluator                      Details                             Meta Data
27.17%     Matches word count             n/a                                 neededClean: false; words: 181
36.79%     Dialogue to Total Word Ratio   Ratio: 100.00%, Deviation: 10.00%   neededClean: false; wordsTotal: 184; wordsDialogue: 184
31.9769%   Overall
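The ratio and deviation reported by the dialogue evaluator can be illustrated as follows. This is a sketch only: it assumes words are whitespace-separated tokens, dialogue is anything between straight double quotes, and deviation is the absolute distance from the 90% target in the scenario. The benchmark's actual tokenization is not shown on this page (the two evaluators report 181 and 184 words for the same response), so the function names and counting rules here are hypothetical.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for a passage."""
    total = len(text.split())
    # Assumption: dialogue is the text between pairs of straight double quotes.
    quoted = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(q.split()) for q in quoted)
    ratio = dialogue / total if total else 0.0
    return dialogue, total, ratio


def deviation(ratio: float, target: float = 0.90) -> float:
    """Absolute distance between the measured ratio and the target ratio."""
    return abs(ratio - target)


# A passage that is entirely dialogue, like the model response in this run.
all_dialogue = '"Hi there." "Go now."'
d_words, t_words, ratio = dialogue_ratio(all_dialogue)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation(ratio):.2%}")

# A mixed passage with narration between the quoted lines.
mixed = '"Ready?" Bill whispered. "Always," said Molly.'
print(dialogue_ratio(mixed))
```

A response with no narration at all scores a ratio of 100%, which overshoots the 90% target by 10 points. That matches the details reported for this run.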