NC Bench
Model
Llama 3.2 11B (Vision)
Test
Dialogue tags
Scenario
Write 500 words with 30% dialogue
Run ID
2024-09-26-01J8PYX3ZS9BV3V13C4D3XM2KB
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly pored over the map of the party venue, her eyes scanning the layout of the tables and chairs with a practiced intensity. Bill leaned against the wall, arms crossed, watching her with a raised eyebrow. "How did they manage to fit so many screaming kids into one room?" he asked, his tone dry. "They have a great sound system," Molly muttered, "and equally great earplugs. Trust me, I've researched this one." Bill snorted. "You're researching how to rob a children's party. You're a true professional." Molly shot him a look. "We don't get a lot of high-stakes gigs like this. We have to take what we can get." "Right. Because there's nothing more thrilling than stealing unwrapped toys from emotionally overwhelmed parents." Molly ignored the jab and pointed to the map. "See, the presents are set up on two long tables, right in the center of the dance floor. We'll have a clear shot at them –" "Clear shot, yeah, said the thieves who got caught by a five-year-old last year," Bill interrupted. Molly narrowed her eyes. "We've got a new strategy this time. We'll deploy the Distracto-Distractor." "The what-now?" "The device that plays laser beam sounds and blows party horns through the walls of an arbitrary pattern –" "Sounds suspiciously like someone poking holes in your plotter's 'Strategic Genius' file." Molly ignored him and handed Bill a small, homemade device with flashing lights and a loudspeaker grille. "T-this thing is gonna blow the kids' minds," Bill said, examining the device. "I like it." "The Distracto-Distractor is our ace in the hole. With that, we'll be able to create a reasonable facsimile of chaos while we swoop in and grab all the presents." Bill snorted. "Reasonable chaos? You're a one-woman war zone –" "We need precision for this, Bill. I know it doesn't sound glamorous –" "Glamorous? Are you kidding? Stealing toys from tiny human factories? The ultimate thrill ride." Molly's gaze shot up, and for a moment, their eyes locked. 
Bill hastily retreated to a safer space within his facial expression, a mask of practiced nonchalance that only faltered for an instant. Molly refocused, surveying the room's blueprints once more. "Timing will be everything. As soon as the cake comes out, it's our window. We execute in three minutes. Then we're out." "Sounds like a winner," Bill muttered, tucking the Distracto-Distractor into his pocket. Molly straightened, gathering her supplies. "Alright. We put the plan into action tonight. And if I get caught, remember you're my lawyer." Bill produced a blank business card from his wallet. "I have one left over from a divorce settlement. Congratulations, you're officially broke." The soft murmur of the children's party, conducted in a few hesitant octavers outside the meeting room, broke the tension; a fragile pause hung like held breath as Molly gave Bill a flat, dismissive glance before turning back to her briefing.
id
"gen-w0YKrwpkFAmt7UBOmew8prxkjuca"
total_cost
0.0001476
upstream_id
"09d2bbee-4d2e-4427-b4f1-25ca2e2bb21b"
created_at
"2024-09-26T10:44:28.152213+00:00"
Thursday, September 26th, 2024 at 10:44:28 AM GMT+00:00
model
"meta-llama/llama-3.2-11b-vision-instruct"
app_id
182717
streamed
true
cancelled
false
provider_name
"Fireworks"
latency
143
moderation_latency
(null)
generation_time
5477
finish_reason
"stop"
tokens_prompt
68
tokens_completion
666
native_tokens_prompt
71
native_tokens_completion
667
native_tokens_reasoning
(null)
num_media_prompt
(null)
num_media_completion
(null)
origin
"https://ncbench.com/"
usage
0.0001476
Evaluation details
Result: 90.4837%
Evaluator: Matches word count
Details: n/a
Meta Data: neededClean: false; words: 490

Result: 0.0000%
Evaluator: Dialogue to Total Word Ratio
Details: Ratio: 60.41%, Deviation: 30.41%
Meta Data: neededClean: false; wordsTotal: 490; wordsDialogue: 296

45.2419% (matches the mean of the two results above)
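The ratio and deviation reported above can be reproduced from the two word counts. The sketch below is a minimal illustration, not the benchmark's actual evaluator: the function names are hypothetical, the 30% target is taken from the scenario text, and the quote-based word counter assumes dialogue is delimited by straight double quotes, which may differ from how NC Bench tokenizes or cleans the response.

```python
import re


def count_dialogue_words(text: str) -> tuple[int, int]:
    """Return (dialogue_words, total_words) for a prose passage.

    Assumption: dialogue is any text between straight double quotes,
    and words are whitespace-separated tokens.
    """
    total = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(span.split()) for span in quoted_spans)
    return dialogue, total


def ratio_and_deviation(words_dialogue: int, words_total: int,
                        target_pct: float = 30.0) -> tuple[float, float]:
    """Return (ratio %, absolute deviation from target in percentage points)."""
    ratio = 100.0 * words_dialogue / words_total
    return round(ratio, 2), round(abs(ratio - target_pct), 2)


# Using the counts reported for this run (wordsDialogue=296, wordsTotal=490):
print(ratio_and_deviation(296, 490))  # (60.41, 30.41)
```

With the reported counts, roughly 60% of the response is dialogue, about double the requested 30%, which is consistent with the 0.0000% result for that evaluator. How the deviation is mapped to a percentage score is not stated on this page, so the sketch stops at the deviation.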