NC Bench
Model: Llama 3.2 90B (Vision)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Run ID: 2024-09-26-01J8Q0QJ3P4RCQ65TTV5MHFKYW
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
AI
Molly poured over the blueprints of the children's play center, her brow furrowed in concentration. Bill lounged on the couch beside her, munching on a bag of potato chips. "Alright, so we've got the entrance covered," Molly said, tapping her finger on the diagram. "We'll pose as a couple dropping off a kid for the party. Once we're in, we make a beeline for the presents." "Got it," Bill said, his mouth full of chips. "And then we...?" "We snatch as many presents as we can carry and make a run for it," Molly said, her eyes shining with excitement. "Simple plan," Bill said, nodding. "I like it. But what about the whole 'children's party' thing? I'm not exactly good with kids. What if they start crying or something?" "Oh, come on, Bill," Molly said, rolling her eyes. "We'll be in and out before the sugar rush even hits. Besides, I've got an entire stash of 'emergency lollipops' for just such an occasion." Bill snorted. "Emergency lollipops? You sound like a kidnapper." "Hey, hey, no jokes about kidnappers," Molly said, grinning. "We're professionals. We're just...redistributing the presents. For a good cause." "What cause is that, exactly?" Bill asked, raising an eyebrow. "Our own personal 'Get Rich Quick' scheme, of course," Molly said, laughing. "Now, let's get to the fun part – deciding which presents to steal." Bill rubbed his hands together. "Ooh, can we steal the ones with the biggest bows? I love a good bow." "Absolutely not," Molly said, shooting him a stern look. "We're going for the high-ticket items, Bill. The stuff with actual cash value. None of that useless doll or stuffed animal nonsense." "Aw, come on," Bill whined. "But think of all the adorable little stuffed animals we could snatch! We could sell them on the black market for a small fortune!" "I highly doubt anyone's looking for stolen teddy bears on the black market," Molly said, sighing. "Besides, I've got my heart set on that giant Lego set. I can already see the bucks rolling in." 
Bill pouted, but Molly just laughed. "Hey, someone's got to keep this operation running, and it's not going to be the stuffed animals, no matter how cute they are." "So, what's our cover story going to be?" Bill asked, popping another chip into his mouth. "Well, since we're posing as a couple, I think we should really commit to it," Molly said, a mischievous glint in her eye. "I'll play the nagging girlfriend, and you can be the clueless, slightly-sputtering boyfriend." "Oh, man," Bill groaned. "Do I have to carry a Starbucks cup, too?" "Perfection," Molly cooed, beaming at him. "You're going to make the perfect ' Hipster Dad Type, circa 2023.'" Bill grumbled, but Molly just grinned, consulting the blueprint once more. "Time to put on our best 'parent faces' and get to work..."
id: "gen-mNuoGh8Lv3OPeMWuU1EU4KtFFEd9"
total_cost: 0.0006777
upstream_id: "6e11aa33-fae2-4029-a5e5-ab0f27c44c8a"
created_at: "2024-09-26T11:16:37.104185+00:00" (Thursday, September 26th, 2024 at 11:16:37 AM GMT+00:00)
model: "meta-llama/llama-3.2-90b-vision-instruct"
app_id: 182717
streamed: true
cancelled: false
provider_name: "Fireworks"
latency: 283
moderation_latency: (null)
generation_time: 19442
finish_reason: "stop"
tokens_prompt: 68
tokens_completion: 681
native_tokens_prompt: 71
native_tokens_completion: 682
native_tokens_reasoning: (null)
num_media_prompt: (null)
num_media_completion: (null)
origin: "https://ncbench.com/"
usage: 0.0006777
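The metadata above can be turned into a rough throughput and cost-per-token figure. The sketch below is illustrative only: the page does not state units, so it assumes `generation_time` is in milliseconds and treats `total_cost` as a unitless amount. The helper names `tokens_per_second` and `cost_per_million_tokens` are hypothetical and not part of NC Bench.

```python
# Values copied from the run metadata above.
TOKENS_PROMPT = 68
TOKENS_COMPLETION = 681
GENERATION_TIME = 19442   # ASSUMPTION: milliseconds (units are not stated on the page)
TOTAL_COST = 0.0006777    # currency is not stated on the page


def tokens_per_second(tokens: int, generation_time_ms: int) -> float:
    """Completion throughput, assuming the time is given in milliseconds."""
    return tokens / (generation_time_ms / 1000)


def cost_per_million_tokens(cost: float, tokens: int) -> float:
    """Blended cost per one million tokens (prompt and completion together)."""
    return cost / tokens * 1_000_000


if __name__ == "__main__":
    tps = tokens_per_second(TOKENS_COMPLETION, GENERATION_TIME)
    blended = cost_per_million_tokens(TOTAL_COST, TOKENS_PROMPT + TOKENS_COMPLETION)
    print(f"throughput: {tps:.1f} tokens/s")
    print(f"blended cost: {blended:.3f} per 1M tokens")
```

Under the millisecond assumption, this run works out to roughly 35 completion tokens per second.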
Evaluation details
Result | Evaluator | Details | Meta Data
3.6234% | Matches word count | n/a | neededClean: false, words: 476
65.6878% | Dialogue to Total Word Ratio | Ratio: 72.90%, Deviation: 2.90% | neededClean: false, wordsTotal: 476, wordsDialogue: 347
34.6556%
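The reported ratio and deviation for the Dialogue to Total Word Ratio evaluator can be reproduced from the word counts: 347 / 476 ≈ 72.90%, which is 2.90 percentage points away from the 70% target in the scenario. The sketch below shows one way such a count might be made. It is not the evaluator's actual implementation, which the page does not show: it assumes dialogue is any text between straight double quotes and that words are whitespace-separated. The function name `dialogue_ratio` is hypothetical. How the Result percentages are derived from the deviation is not stated, so that step is not modelled.

```python
import re

TARGET_RATIO_PERCENT = 70.0  # from the scenario: "70% dialogue"


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue words, total words, dialogue ratio in percent).

    ASSUMPTION: dialogue is whatever sits between straight double quotes,
    and a word is any whitespace-separated token.
    """
    quoted_spans = re.findall(r'"([^"]*)"', text)
    words_dialogue = sum(len(span.split()) for span in quoted_spans)
    words_total = len(text.split())
    ratio = 100.0 * words_dialogue / words_total if words_total else 0.0
    return words_dialogue, words_total, ratio


if __name__ == "__main__":
    # Check the arithmetic against the counts reported in the table.
    reported_ratio = 100.0 * 347 / 476
    print(f"Ratio: {reported_ratio:.2f}%")
    print(f"Deviation: {abs(reported_ratio - TARGET_RATIO_PERCENT):.2f}%")

    # Small illustrative sample, not taken from the model response.
    sample = 'Molly sighed. "We go in at noon," she said. "Fine."'
    print(dialogue_ratio(sample))
```

A quote-based counter like this will miscount if the text uses curly quotes or leaves a quotation unclosed, so the counts it produces on the full response may differ slightly from the evaluator's.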