NC Bench
Model: Llama 3.1 405B
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Run ID: 2024-07-23-01J3G51PB7194XA1HS43B8TZKP
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly adjusted her cat eye glasses and scrutinized the Google Earth image of the suburban house on her laptop screen. Bill, lounging on the couch beside her, cracked his knuckles in preparation for the impending heist. "Alright, Bill, the Smith's kid is turning seven and the party is at two pm on Saturday. The presents are likely to be stacked in the living room," Molly said, her voice low and husky. "I love stealing from kids, it's like taking candy from a, well, you know," Bill chuckled, his dimples deepening in his weathered cheeks. "Shut up, Bill. This is a high-stakes operation. We're not just talking any presents, we're talking iPads, American Girl dolls, maybe even a puppy or two. We need to get in and out quickly," Molly warned, tucking her curly brown hair into a ponytail. "Sounds like a plan, Molly. But have you considered the party games? I don't want to get stuck in a round of Pin the Tail on the Donkey," Bill whined, running a hand over his shaved head. "Oh, for Pete's sake, Bill. You can't seriously be worried about party games. We're professional thieves," Molly scoffed, tapping her stiletto on the floor. "Hey, I resent that. I'm a highly trained party-crasher. I can spin a mean plate of nachos," Bill protested, puffing out his chest. "Whatever. We'll scope out the backyard and see if we can slip in unnoticed during the cake-cutting ceremony. If not, we'll have to improvise. I have a backup plan, involving a whoopee cushion and a clown suit," Molly announced, a sly grin spreading across her face. "A clown suit? Oh boy, we're in trouble. We should've just stuck to boosting electronics from Best Buy," Bill groaned, burying his face in his hands. "Hey, the kids love a good clown, and I make an amazing Bozo. Trust me, this will work. Now let's review the layout one more time," Molly said, dragging Bill back to the laptop. The two thieves leaned in, poring over the map of the house, searching for any possible vulnerabilities in the Smith's defenses. 
Bill snickered to himself as he spotted the bounce house in the backyard, no doubt filled with screaming, hyperactive kids. "Okay, Molly, you take care of the clown suit, and I'll start scoping out the neighborhood for any watchful eyes. But if we get caught, I'm blaming you and that crazy hair," Bill agreed, jumping off the couch to pace the living room. "Don't even think about it, Bill. I've already styled this mop of mine to distract from the duffel bag of loot we'll be hauling out. This heist is going to be smoother than a snake's behind," Molly promised, setting the laptop aside and slapping Bill's back in solidarity. With the plan set in motion, the two thieves set off to conjure up their respective disguises, confident they would walk away with a carload of treasure, all without getting caught in a web of sticky fingers and sugar highs.
id: "gen-tSVX1NhJ3QFQuoPD7idazRzaccIt"
total_cost: 0.002211
upstream_id: "974862e5-921f-488b-b881-23a0380963e4"
created_at: "2024-07-23T15:59:04.146417+00:00" (Tuesday, July 23rd, 2024 at 3:59:04 PM GMT+00:00)
model: "meta-llama/llama-3.1-405b-instruct"
app_id: 182717
streamed: false
cancelled: false
provider_name: "Fireworks"
latency: 30055
moderation_latency: (null)
generation_time: 0
finish_reason: "stop"
tokens_prompt: 102
tokens_completion: 667
native_tokens_prompt: 70
native_tokens_completion: 667
num_media_prompt: (null)
num_media_completion: (null)
origin: "https://ncbench.com/"
usage: 0.002211
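As a quick sanity check on the metadata, the recorded total_cost can be divided by the native token counts to see what per-token rate it implies. This is a minimal sketch: the page does not say how cost is computed, so treating the cost as proportional to native_tokens_prompt plus native_tokens_completion is an assumption, and the variable names are only illustrative.

```python
# Values copied from the run metadata above.
total_cost = 0.002211
native_tokens_prompt = 70
native_tokens_completion = 667

# Assumption: cost scales with the provider's native token counts,
# not with the normalized tokens_prompt / tokens_completion fields.
native_total = native_tokens_prompt + native_tokens_completion
implied_rate = total_cost / native_total  # cost per native token

print(f"native tokens: {native_total}")
print(f"implied cost per million native tokens: {implied_rate * 1_000_000:.2f}")
```

Under that assumption the figures are consistent with a single flat rate for prompt and completion tokens. The page does not confirm the billing rule.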
Evaluation details
Evaluator: Matches word count
Result: 99.9190%
Details: n/a
Meta Data: neededClean: false, words: 503

Evaluator: Dialogue to Total Word Ratio
Result: 0.0000%
Details: Ratio: 57.46%, Deviation: 27.46%
Meta Data: neededClean: false, wordsTotal: 503, wordsDialogue: 289
49.9595%
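The ratio and deviation reported for the dialogue evaluator follow directly from the recorded word counts and the 30% target in the scenario. The sketch below reproduces that arithmetic and shows one plausible way to count dialogue words, by treating text inside straight double quotes as dialogue. The function name dialogue_stats and the quote-based counting rule are assumptions, not the benchmark's documented method. The final figure of 49.9595% also matches the plain mean of the two evaluator results, though the page does not state that it is computed that way.

```python
import re


def dialogue_stats(text: str, target_ratio: float = 0.30) -> dict:
    """Hypothetical counter: words inside straight double quotes vs. all words."""
    words_total = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    words_dialogue = sum(len(span.split()) for span in quoted_spans)
    ratio = words_dialogue / words_total if words_total else 0.0
    return {
        "wordsTotal": words_total,
        "wordsDialogue": words_dialogue,
        "ratio": ratio,
        "deviation": abs(ratio - target_ratio),
    }


# Reproduce the reported figures from the recorded counts.
ratio = 289 / 503
deviation = abs(ratio - 0.30)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# The last figure on the page equals the mean of the two evaluator results.
mean_result = (99.9190 + 0.0) / 2
print(f"Mean of results: {mean_result:.4f}%")
```

A short usage example for the counter: `dialogue_stats('She said, "hello there friend" and left.')` counts 7 words in total and 3 inside quotes.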