NC Bench
Run: 2024-10-16-01JAB4RV6H06PB8X7RCVHFFAA3
Run Details
Model: Ministral 8B
Test: Dialogue tags
Scenario: Write 200 words with 90% dialogue
Duration: 2.6s
Cost: $0.000037
Evaluation summary
Score      Evaluator
95.99%     Matches word count
94.86%     Dialogue to Total Word Ratio
95.4258%   Overall
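The overall figure looks consistent with an unweighted mean of the two evaluator scores. The page does not state the aggregation rule, so the sketch below treats the mean as an assumption and only checks it against the displayed numbers. The evaluator scores are shown rounded to two decimals, which would explain a small difference from the reported 95.4258%.

```python
# Evaluator scores as displayed in the summary (percent, rounded to 2 decimals).
scores = {
    "Matches word count": 95.99,
    "Dialogue to Total Word Ratio": 94.86,
}

# Assumption: Overall is the unweighted mean of the evaluator scores.
# This is inferred from the numbers, not documented on the page.
overall = sum(scores.values()) / len(scores)

reported_overall = 95.4258
difference = abs(overall - reported_overall)

print(f"mean of displayed scores: {overall:.3f}%")
print(f"difference from reported overall: {difference:.4f} percentage points")
```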
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly: "So, Bill, you're sure about this? A children's party? Aren't you the least bit squeamish?" Bill: "Nah, Molly. The only thing I find squeamish is the thought of missing out on a set of 'Jurassic World' action figures. Plus, we're not stealing from the kids, we're stealing from kids' parents who can afford to spend a fortune on toys." Molly: "And what makes you think there'll be any 'Jurassic World' action figures there?" Bill: "Because, Molly, I've been casing the joint. The mom's posted pictures on Instagram. She's got more 'Jurassic World' toys than a museum. We'll be rolling in toys, Molly. Rolling!" Molly: "And what about the parents? They're sure to notice their toys are gone." Bill: "Leave that to me. I've got a distraction planned. A 'Jurassic World' movie marathon. By the time the parents realize the toys are gone, we'll be long gone." Molly: "And if they have a security system?" Bill: "I'll be dressed as a dinosaur. What security system can resist that?" Molly: "I suppose that's one way to blend in, Bill." Bill: "Exactly, Molly. Now, let's get ready. We've got a party to... er, enjoy."
total_cost: 0.0000366
Evaluation details
Result     Evaluator                      Details                           Meta Data
95.99%     Matches word count             n/a                               neededClean: false, words: 192
94.86%     Dialogue to Total Word Ratio   Ratio: 94.79%, Deviation: 4.79%   neededClean: false, wordsTotal: 192, wordsDialogue: 182
95.4258%   Overall
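The reported ratio and deviation follow directly from the word counts: wordsDialogue divided by wordsTotal, compared against the 90% target from the scenario. The sketch below reproduces that arithmetic and shows one possible way to count dialogue words. The `dialogue_stats` helper is hypothetical, and its counting rules (whitespace-separated tokens, text between straight double quotes) are assumptions rather than the benchmark's actual evaluator. How the 94.86% and 95.99% scores are derived from these values is not stated on the page and is not modelled here.

```python
import re


def dialogue_stats(text: str, target_ratio: float = 0.90) -> dict:
    """Hypothetical helper: count quoted words versus all words in text."""
    # Assumption: a word is any whitespace-separated token.
    words_total = len(text.split())
    # Assumption: dialogue is the text between straight double quotes.
    quoted_spans = re.findall(r'"([^"]*)"', text)
    words_dialogue = sum(len(span.split()) for span in quoted_spans)
    ratio = words_dialogue / words_total if words_total else 0.0
    return {
        "wordsTotal": words_total,
        "wordsDialogue": words_dialogue,
        "ratio": ratio,
        "deviation": abs(ratio - target_ratio),
    }


# Two turns taken from the model response; speaker tags count as non-dialogue.
sample = 'Molly: "And if they have a security system?" Bill: "I\'ll be dressed as a dinosaur."'
stats = dialogue_stats(sample)
print(stats["wordsDialogue"], "of", stats["wordsTotal"], "words are dialogue")

# Reproduce the reported figures from the metadata values on this page.
reported_ratio = 182 / 192
reported_deviation = abs(reported_ratio - 0.90)
print(f"Ratio: {reported_ratio:.2%}, Deviation: {reported_deviation:.2%}")
```

Under these counting rules, the ten speaker tags in the response (five "Molly:" and five "Bill:") would account for the ten non-dialogue words implied by 192 total and 182 dialogue words.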