NC Bench
Run: 2024-07-19-01J35SRSB0FWRX1A9VVM4G7Z69
Run Details
Model: GPT-4o Mini (temp=1)
Test: Dialogue tags
Scenario: Write 200 words with 90% dialogue
Duration: 4.1s
Cost: $0.000182
Evaluation summary
Score      Evaluator                       Details Link
99.98%     Matches word count              (details)
36.79%     Dialogue to Total Word Ratio    (details)
68.3860%   Overall
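The overall figure above is numerically close to the plain average of the two evaluator scores. The short sketch below checks that arithmetic. It assumes an unweighted mean, which the page does not state; the variable names are illustrative only, and the small gap to the reported 68.3860% would be explained by the page averaging unrounded scores.

```python
# Scores as displayed in the evaluation summary (already rounded by the page).
scores = {
    "Matches word count": 99.98,
    "Dialogue to Total Word Ratio": 36.79,
}

# Assumption: the overall score is an unweighted mean of the evaluator scores.
overall = sum(scores.values()) / len(scores)

reported_overall = 68.3860
print(f"mean of displayed scores: {overall:.4f}%")
print(f"reported overall:         {reported_overall:.4f}%")
print(f"difference:               {abs(overall - reported_overall):.4f}")
```

If the benchmark weighted its evaluators differently, the two numbers would not agree this closely, so the check is suggestive rather than conclusive.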
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
AI
“Molly, you’re sure about this, right?” “Of course, Bill! It’s the perfect setup. They’ll be too busy with games and cake!” “Right, but you realize it’s a kid’s party? Do you even like kids?” “Not particularly. But I’ve got my eye on that giant teddy bear! It’s a limited edition!” “A bear? We could nab electronics, cash… what are we doing with a teddy?” “Think about it! It’s the ultimate decoy! Everyone loves teddy bears at parties; they won’t even notice us!” “I just have this vision of us surrounded by glitter and balloons.” “Relax! We’ll wear black! Black blends in with, uh… party decoration!” “Sure. When I think ‘stealth,’ I think birthday streamers. Can we just stick to the plan?” “Fine! We go in, grab the presents, and leave, but we have to be quick!” “And what about the parents?” “Trust me, they’ll be too busy arguing over who brought the gluten-free cupcakes to notice us.” “You’re telling me we’re going to steal from families celebrating life? Really?” “Just imagine the thrill, Bill! The adrenaline! And that teddy… worth it, right?” “Whatever, but if I trip on a party hat, I’m blaming you.” “Deal! Now let’s get our party hats on!”
total_cost (meta): 0.00018225
Evaluation details
Result     Evaluator                       Details                              Meta Data
99.98%     Matches word count              n/a                                  neededClean: false, words: 202
36.79%     Dialogue to Total Word Ratio    Ratio: 100.00%, Deviation: 10.00%    neededClean: false, wordsTotal: 203, wordsDialogue: 203
68.3860%   Overall
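To make the "Dialogue to Total Word Ratio" row concrete, here is a minimal sketch of how such a ratio and its deviation from the 90% target could be computed. It is not the benchmark's evaluator: the function name `dialogue_ratio`, the whitespace tokenization, and the quote-matching rule are all assumptions, and a different tokenizer could produce slightly different counts (the page itself reports 202 words in one evaluator and 203 in the other).

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio).

    Words are whitespace-separated tokens. Any span enclosed in curly
    or straight double quotes is treated as dialogue.
    """
    total_words = len(text.split())
    # Each match yields a pair; exactly one element is non-empty.
    spans = re.findall(r'“([^”]*)”|"([^"]*)"', text)
    dialogue_words = sum(len((curly or straight).split()) for curly, straight in spans)
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


# A tiny hypothetical sample, not taken from the model response.
sample = "“Ready, Bill?” Molly whispered. “Ready.”"
d_words, t_words, ratio = dialogue_ratio(sample)

# Deviation as the absolute distance from the scenario's 90% target.
target = 0.90
deviation = abs(ratio - target)
print(d_words, t_words, f"{ratio:.2%}", f"{deviation:.2%}")
```

Applied to a response made entirely of quoted speech, this kind of measure gives a ratio of 100% and a deviation of 10 percentage points from the 90% target, which matches the "Ratio: 100.00%, Deviation: 10.00%" detail reported above. How the benchmark maps that deviation to the 36.79% score is not shown on the page.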