NC Bench
Run: 2024-11-04-01JBW92DSDQREAPVPNYMQMTVBW
Run Details
Model: Claude 3.5 Haiku
Test: Dialogue tags
Scenario: Write 200 words with 90% dialogue
Duration: 6.0s
Cost: $0.001248
Evaluation summary
Matches word count: 9.61%
Dialogue to Total Word Ratio: 71.85%
Overall: 40.7307%
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
AI
"This is the dumbest heist we've ever planned," Bill muttered, adjusting his black ski mask. Molly rolled her eyes. "You said that about the retirement home job, and we walked out with three vintage Rolexes." "Those were watches. This is a children's birthday party. We're stealing wrapped presents from five-year-olds." "Eight-year-olds. Big difference." She sketched a quick diagram on a napkin. "Birthday party at the community center. Bounce house as a distraction. Parents will be clustered near the cake." Bill sighed. "And how exactly are we getting past the parents?" "Disguises. I've got two parent volunteer vests. We walk in like we belong." Molly tapped her sketch. "Presents are usually stacked on that big table near the gift table. Quick grab, quick exit." "This feels wrong." "Feels like easy money. Who's going to report stolen birthday presents?" "Children. Crying children." "Professional thieves don't have guilt, Bill." He adjusted his mask again. "I'm definitely going to hell." "We'll take a scenic route," Molly grinned. "Now, help me count how many gift bags we can stuff in these duffel bags."
total_cost: 0.001248
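The dialogue-ratio evaluator reported under Evaluation details has to separate quoted speech from narration before it can count words. The page does not show how NC Bench does this, so the following is only a minimal sketch under stated assumptions: dialogue is whatever sits between straight double quotes, and a word is any whitespace-separated token. The function name `count_dialogue_words` and the sample text are hypothetical, and this counter is not guaranteed to reproduce the benchmark's own totals.

```python
import re


def count_dialogue_words(text: str) -> tuple[int, int]:
    """Return (words_dialogue, words_total) for a scene.

    Assumption: dialogue is delimited by straight double quotes and
    words are whitespace-separated tokens. The real evaluator may differ.
    """
    # Capture the text inside each pair of double quotes.
    quoted_spans = re.findall(r'"([^"]*)"', text)
    words_dialogue = sum(len(span.split()) for span in quoted_spans)
    # Total words include both dialogue and narration.
    words_total = len(text.split())
    return words_dialogue, words_total


# Short made-up sample, not the full model response.
sample = '"This feels wrong." Bill sighed. "Feels like easy money."'
words_dialogue, words_total = count_dialogue_words(sample)
print(words_dialogue, words_total)
```

A counter like this treats dialogue tags such as "Bill muttered" as narration, which is why a scene with many tags falls short of a 90% dialogue target.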
Evaluation details
Evaluator: Matches word count
Result: 9.61%
Details: n/a
Meta Data: neededClean = false, words = 178

Evaluator: Dialogue to Total Word Ratio
Result: 71.85%
Details: Ratio: 82.42%, Deviation: 7.58%
Meta Data: neededClean = false, wordsTotal = 182, wordsDialogue = 150

Overall: 40.7307%
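The reported ratio and deviation can be checked from the meta data with a few lines of arithmetic. This sketch assumes the deviation is the absolute difference from the 90% target named in the scenario, and that the overall figure is the unweighted mean of the two evaluator scores. Both assumptions are consistent with the numbers shown but are not confirmed by the page. How each evaluator maps its raw measurement to a score (for example, 7.58% deviation to 71.85%) is not shown and is not reproduced here.

```python
TARGET_DIALOGUE_PERCENT = 90.0  # from the scenario: "90% dialogue"

# Values copied from the evaluator meta data.
words_total = 182
words_dialogue = 150

# Share of words that are dialogue, as a percentage.
ratio = 100.0 * words_dialogue / words_total
# Assumed: deviation is the absolute gap to the target percentage.
deviation = abs(TARGET_DIALOGUE_PERCENT - ratio)

# Assumed: overall is the plain mean of the two (rounded) evaluator scores.
scores = [9.61, 71.85]
overall = sum(scores) / len(scores)

print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
print(f"Overall (mean of rounded scores): {overall:.2f}%")
```

The mean of the rounded scores is 40.73%, which agrees with the reported 40.7307% to two decimal places; the extra digits presumably come from unrounded evaluator scores.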