NC Bench
Model: Llama 3.2 11B (Vision)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Run ID: 2024-09-26-01J8PYW3GZKFTVDCPDY72KE2KJ
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
AI
The sun-drenched parking lot of the Bouncy Castle Extravaganza stretched out before them like a haven for errant helium balloons. Molly adjusted her bright pink wig, a questionable tribute to the sugar-fueled chaos within. Beside her, Bill checked his watch for what felt like the hundredth time. "You're starting to annoy me with that." Molly said, eyes narrowing behind her oversized sunglasses. "I'm just making sure we're not late for the candy buffet." Bill clarified, not skipping a beat. Molly raised an eyebrow. "There's no candy buffet. This is a cash-only affair. Besides, we're here for the real treasure: the presents." Bill's grin widened as he surveyed the chaos. "I never get tired of kids' parties. The predictability is amazing." Molly snorted, shouldering her backpack. "Predictable, yes. Easy, no. Those parents are always so... vigilant." "Vigilant?" Bill repeated, mock confusion etched on his face. "You mean like the time they caught us at the Fairy Princess Tea Party?" Molly shot him a look. "Don't remind me. I still have the bruise from that tiara-shaped wrestling match." Bill chuckled. "Hey, it was a close call. We almost got away with the tiara itself." Molly snorted. "The tiara? A child's titter of excitement is all we managed to escape with that day." Bill shrugged. "Hey, it was a good souvenir. And we did get out undetected." "A near-miss with a toddler's sticky hug and some very angry parents does not count as getting out undetected." Molly's attention turned to the party's layout. "Alright, let's get the lay of the land. We have three security cameras, each with a distinct blind spot...which I've cleverly marked on this napkin." Bill took the napkin, eyes scanning the hastily drawn diagram. "Robby's 'security expert' skills are truly an asset, Molly." Molly glared at him. "Hey, it's a legitimate life skill I honed at my summer camp as a kid. Don't knock it till you've successfully sneaked a bag of s'mores past the camp counselors." 
Bill whistled, impressed. "S'mores? No wonder I'm impressed by your security expertise. You've clearly refined your skills through years of clandestine camping activities." Molly pinched his arm, not hard enough to hurt but enough to annoy. "We're here to steal presents, Bill. Focus." As they crept closer, the cacophony within grew louder, the scent of sugary drinks and stale pizza wafting on the breeze like a siren song. Bill leaned in, his voice barely above a whisper. "You know, Molly, this might just be the steal of the year. Eighty percent chance we get in, grab the loot, and out before anyone even notices." Molly gave him a look of skepticism. "Fifty-five percent, yourself. I'll take seventy." Bill held up his hands. "Hey, a guy can dream."
id: "gen-hB6xbgXslIYWG6N1jtPnZkVi4HQD"
total_cost: 0.0001446
upstream_id: "93b3b38b-1d47-47e4-ae4d-bd4aa5e02efc"
created_at: "2024-09-26T10:43:54.779642+00:00" (Thursday, September 26th, 2024 at 10:43:54 AM GMT+00:00)
model: "meta-llama/llama-3.2-11b-vision-instruct"
app_id: 182717
streamed: true
cancelled: false
provider_name: "Fireworks"
latency: 160
moderation_latency: (null)
generation_time: 5475
finish_reason: "stop"
tokens_prompt: 68
tokens_completion: 651
native_tokens_prompt: 71
native_tokens_completion: 652
native_tokens_reasoning: (null)
num_media_prompt: (null)
num_media_completion: (null)
origin: "https://ncbench.com/"
usage: 0.0001446
Evaluation details
Result | Evaluator | Details | Meta Data
0.0000% | Matches word count | n/a | neededClean: false; words: 461
1.1188% | Dialogue to Total Word Ratio | Ratio: 60.52%, Deviation: 9.48% | neededClean: false; wordsTotal: 461; wordsDialogue: 279
0.5594%
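The ratio and deviation reported by the "Dialogue to Total Word Ratio" evaluator can be reproduced from the two counts in its meta data. The sketch below is illustrative only: the function names are hypothetical, the 70% target is taken from the scenario text, and the word-counting helpers assume whitespace-separated tokens with dialogue delimited by straight double quotes, which may not match how NC Bench actually tokenizes. It does not attempt to reproduce the percentage scores in the Result column, since the page does not state how those are derived.

```python
import re


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (an assumed definition of a word)."""
    return len(text.split())


def dialogue_words(text: str) -> int:
    """Count words inside straight double quotes (assumed dialogue marker)."""
    spans = re.findall(r'"([^"]*)"', text)
    return sum(len(span.split()) for span in spans)


def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Share of dialogue words in the whole text, as a percentage."""
    return 100.0 * words_dialogue / words_total


def deviation(ratio_pct: float, target_pct: float = 70.0) -> float:
    """Absolute distance, in percentage points, from the target share."""
    return abs(target_pct - ratio_pct)


# Counts reported in the evaluation meta data for this run.
ratio = dialogue_ratio(words_dialogue=279, words_total=461)
dev = deviation(ratio)
print(f"Ratio: {ratio:.2f}%, Deviation: {dev:.2f}%")
# prints "Ratio: 60.52%, Deviation: 9.48%"
```

Applying count_words and dialogue_words to the model response would give a rough check of the 461 and 279 figures, though the totals may differ if the evaluator handles punctuation or quote characters differently.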