NC Bench
Run: 2024-07-19-01J35T98Z23SX826362J5J8RHA
Run Details
Model: GPT-4o Mini (temp=1)
Test: Dialogue tags
Scenario: Write 200 words with 10% dialogue
Duration: 3.7s
Cost: $0.000183
Evaluation summary
Score | Evaluator
99.98% | Matches word count
0.00% | Dialogue to Total Word Ratio
49.9920% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 10% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
AI
Molly crouched behind a giant inflatable unicorn, peering through her binoculars at the chaotic scene of the birthday party. “Bill, can you believe this? They’ve got enough gifts to stock a small store!” Bill grinned, adjusting his mask. “I’ve always wanted to be a toy millionaire! What’s the plan?” “Simple. We wait until they’re distracted with the clown. No one can resist a guy balloon animals.” “Right. Clowns!” Bill winced. “Why do they always have to be so creepy? I’d rather face a guard dog!” “Focus, Bill! While they’re screaming for a giraffe balloon, we’ll make our move.” Molly gestured toward a long table piled high with brightly wrapped gifts. “When I give the signal, we grab the biggest ones, preferably the ones with ‘To Tommy’ tags.” “I hope Tommy’s not some kind of martial arts prodigy,” he joked. “I’m more of a soft-heist kind of thief.” Molly chuckled. “Don’t worry. We’ll be in and out like ninjas!” “Or chubby squirrels,” Bill added, snickering. “So, operation ‘Present Predators’ is a go?” “Absolutely! Now, shhh! Here comes the clown!” As the misplaced joy of balloons overtook the party, the duo dashed toward the treasures, ready for their second childhood.
total_cost: 0.00018345000000000001
Evaluation details
Result | Evaluator | Details | Meta Data
99.98% | Matches word count | n/a | neededClean: false, words: 198
0.00% | Dialogue to Total Word Ratio | Ratio: 67.34%, Deviation: 57.34% | neededClean: false, wordsTotal: 199, wordsDialogue: 134
49.9920% | Overall
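The ratio and deviation reported for the dialogue evaluator can be reproduced from the word counts in its Meta Data. The sketch below is illustrative only: the helper names are hypothetical, and the rule it uses to find dialogue (text inside curly double quotes, words split on whitespace) is an assumption, since this page does not show how the benchmark tokenizes text or how it turns the deviation into a score.

```python
import re

# Assumption: dialogue is any text enclosed in curly double quotes, and
# words are whitespace-separated tokens. The benchmark's actual quote
# handling and tokenizer are not shown on this page.
DIALOGUE_RE = re.compile(r"“([^”]*)”")


def count_words(text: str) -> int:
    """Count whitespace-separated tokens."""
    return len(text.split())


def count_dialogue_words(text: str) -> int:
    """Count the words that fall inside curly-quoted spans."""
    return sum(count_words(span) for span in DIALOGUE_RE.findall(text))


def ratio_and_deviation(words_dialogue: int, words_total: int,
                        target_pct: float) -> tuple[float, float]:
    """Return (dialogue ratio in percent, absolute deviation from target)."""
    ratio_pct = 100.0 * words_dialogue / words_total
    return ratio_pct, abs(ratio_pct - target_pct)


# Values taken from the Meta Data above: 134 dialogue words out of 199,
# against the 10% target stated in the scenario.
ratio, deviation = ratio_and_deviation(134, 199, 10.0)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
# prints "Ratio: 67.34%, Deviation: 57.34%"
```

With 134 of 199 words inside quotes, the response is roughly two-thirds dialogue against a 10% target, which matches the reported deviation and is consistent with the 0.00% score for this evaluator.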