NC Bench
Run: 2024-11-04-01JBW8S5PRBWE80CYSVW9NEVKE
Run Details
Model: Claude 3.5 Haiku
Test: Dialogue tags
Scenario: Write 200 words with 90% dialogue
Duration: 4.4s
Cost: $0.001228
Evaluation summary
Score      Evaluator
0.00%      Matches word count
53.36%     Dialogue to Total Word Ratio
26.6800%   Overall
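The overall figure is consistent with a plain average of the two evaluator scores. The page does not state how the overall score is aggregated, so the unweighted mean below is an assumption that happens to match the reported numbers; the name `evaluator_scores` is illustrative only.

```python
# Scores reported in the evaluation summary, as percentages.
evaluator_scores = {
    "Matches word count": 0.00,
    "Dialogue to Total Word Ratio": 53.36,
}

# Assumption: the overall score is the unweighted mean of the evaluator scores.
overall = sum(evaluator_scores.values()) / len(evaluator_scores)
print(f"{overall:.4f}%")  # prints "26.6800%"
```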
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
AI
"You really think this is our best target?" Bill asked, adjusting his fake party planner glasses. Molly rolled her eyes. "Trust me. Rich suburban kid's birthday party. Wrapped presents everywhere." "We're stealing from eight-year-olds?" Bill's voice cracked. "Not stealing. Redistributing gift wealth," Molly smirked. "These kids get iPads and gaming systems they don't even appreciate." Bill scratched his chin. "Sounds like rationalization." "Look, the parents hired a magician. Everyone will be distracted. I've got the catering company uniform, you've got the cake delivery disguise. We're golden." "What if a parent recognizes us?" "Who remembers cake delivery guys? We'll be in and out faster than you can say 'happy birthday'." Bill sighed. "This feels wrong." "Wrong would be leaving perfectly good Xbox bundles and LEGO sets unattended," Molly retorted. "Plus, the insurance will cover it." "That's not how insurance works." "Shh. Do you want free electronics or not?" Bill hesitated, then nodded. "Fine. But I'm not touching anything with glitter." Molly grinned. "Amateur."
total_cost: 0.001228
Evaluation details
Result: 0.00%
Evaluator: Matches word count
Details: n/a
Meta Data: neededClean: false, words: 162

Result: 53.36%
Evaluator: Dialogue to Total Word Ratio
Details: Ratio: 81.10%, Deviation: 8.90%
Meta Data: neededClean: false, wordsTotal: 164, wordsDialogue: 133

Overall: 26.6800%
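The reported ratio and deviation follow directly from the word counts in the metadata above: 133 dialogue words out of 164 total, measured against the 90% target in the scenario. The sketch below reproduces that arithmetic. It also includes a hypothetical `dialogue_ratio` helper showing one way such counts could be derived from text, assuming dialogue is whatever sits between straight double quotes and words are whitespace-separated tokens. The evaluator's actual counting rules are not documented on this page, and neither is the mapping from the 8.90% deviation to the 53.36% score, so that step is left out.

```python
import re

TARGET_RATIO = 0.90  # from the scenario: "90% dialogue"


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio).

    Hypothetical helper: treats text inside straight double quotes as
    dialogue and counts whitespace-separated tokens as words.
    """
    total_words = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


# Reproduce the reported figures from the metadata values.
words_dialogue, words_total = 133, 164
ratio = words_dialogue / words_total
deviation = abs(TARGET_RATIO - ratio)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 81.10%, Deviation: 8.90%"
```

Calling `dialogue_ratio` on the model response gives a rough cross-check, though its counts may differ from the evaluator's if the tokenization rules differ.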