NC Bench
Model: Claude 3.7 Sonnet
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Run ID: 2025-02-24-01JMX2DEYP691ZBY999BEQDW5S
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
"Bill, tell me again why we're targeting a kid's birthday party?" Molly adjusted her binoculars, focusing on the balloon-festooned house across the street. "Because," Bill said, tapping a crude diagram drawn on the back of a pizza box, "kids get the good stuff. Brand new. Still in boxes. Easy to fence." Molly sighed, lowering the binoculars. "It feels wrong. Even for us." "Wrong? We stole a poodle last month!" "We returned the poodle," Molly countered. "With a bow. And a bath." Bill ran a hand through his thinning hair. "Look, the intel is solid. Twins turning eight. Loaded parents. Thirty kids attending. That's at least sixty presents." "What if they're just getting books? Or clothes?" Molly peered through the binoculars again. "Nobody's fencing superhero underwear, Bill." "Rich kids don't get underwear. They get tablets. Gaming consoles. Robot toys that cost more than our car." Bill gestured toward their rusted getaway vehicle. "Which isn't saying much, but still." "The timing seems tight," Molly said, checking her watch. "Cake at three. Presents at three-thirty." Bill nodded enthusiastically. "That's the beauty of it! Everyone's watching the birthday boy open presents in the living room while we slip in the back and clean out the gift table." "Boys. Twins, remember? And there's a clown, Bill. A clown." Molly shuddered. "I hate clowns." "The clown is our advantage! Nobody notices suspicious activity when there's a clown around. Clowns are inherently suspicious." Molly rubbed her temples. "So what's the actual plan? Beyond 'grab stuff and run'?" "I go in through the kitchen door during the magic show—" "The clown does magic too?" Molly interrupted. "Focus, Molly! I grab the haul while you idle out front. Twenty seconds and I'm out." "And the dog? They have a German Shepherd, Bill." Bill proudly pulled something from his pocket. "Steak. Drugged with half a Benadryl. Knocks my nephew right out during thunderstorms." "You drugged steak for a dog? That's... 
actually smart." "Thank you." Bill grinned. "I can be smart." "Occasionally," Molly conceded. "What about the alarm system?" "Kids' party. Door's gonna be open for guests coming and going." Molly considered this. "What if we get caught? By a bunch of sugar-hyped eight-year-olds and their parents?" "We won't." "But if we do?" Bill shrugged. "We pretend we're part of the entertainment? I can juggle." "No, you can't." "They don't know that." Molly sighed again, deeper this time. "This is ridiculous." "It's brilliant," Bill corrected. "Wait—they're bringing out the cake! Action stations!" "We're not military, Bill." "Just start the car," Bill said, pulling a ski mask over his face. "And if I'm not back in three minutes—" "Wait for you anyway because I'm not leaving you to get beaten up by suburban parents?" Bill paused. "Yeah. That." He nodded once, then slipped out of the car toward the birthday party, humming the Mission Impossible theme under his breath. Molly watched him go, already rehearsing the apology she'd inevitably be making to a room full of traumatized children.
id: "gen-1740437373-LBplD6DtMrq3fvx4No4Y"
upstream_id: "msg_015MuYvo6i8hXjSeKMUwYhrz"
total_cost: 0.01317
cache_discount: (null)
provider_name: "Anthropic"
created_at: "2025-02-24T22:49:50.705817+00:00" (Monday, February 24th, 2025 at 10:49:50 PM GMT+00:00)
model: "anthropic/claude-3-7-sonnet-20250219:beta"
app_id: 182717
streamed: true
cancelled: false
latency: 317
moderation_latency: (null)
generation_time: 16000
tokens_prompt: 68
tokens_completion: 718
native_tokens_prompt: 75
native_tokens_completion: 863
native_tokens_reasoning: 0
num_media_prompt: (null)
num_media_completion: (null)
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.01317
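The reported cost can be sanity-checked against the native token counts. The sketch below is illustrative only: the per-token prices (3 USD per million input tokens, 15 USD per million output tokens) are an assumption that does not appear on this page, and `generation_time` is assumed to be in milliseconds. The helper name `estimate_cost` is hypothetical.

```python
# Values copied from the run metadata.
native_tokens_prompt = 75
native_tokens_completion = 863
generation_time = 16000        # assumed to be milliseconds
reported_total_cost = 0.01317

# Assumed prices in USD per million tokens (not stated on this page).
INPUT_USD_PER_MTOK = 3.00
OUTPUT_USD_PER_MTOK = 15.00


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the request cost in USD from native token counts."""
    return (
        prompt_tokens * INPUT_USD_PER_MTOK
        + completion_tokens * OUTPUT_USD_PER_MTOK
    ) / 1_000_000


estimated_cost = estimate_cost(native_tokens_prompt, native_tokens_completion)
tokens_per_second = native_tokens_completion / (generation_time / 1000)

print(f"estimated cost: {estimated_cost:.5f} USD")
print(f"throughput: {tokens_per_second:.1f} tokens/s")
```

Under these assumptions the estimate agrees with the reported `total_cost` of 0.01317, and the completion was generated at roughly 54 native tokens per second.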
Evaluation details
Result | Evaluator | Details | Meta Data
99.9840% | Matches word count | n/a | neededClean: false; words: 502
0.0000% | Dialogue to Total Word Ratio | Ratio: 70.92%, Deviation: 20.92% | neededClean: false; wordsTotal: 502; wordsDialogue: 356
49.9920%
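The ratio and deviation figures follow directly from the reported word counts. The sketch below reproduces that arithmetic and shows one plausible way to count dialogue words. The `dialogue_ratio` helper is hypothetical: it treats text between straight double quotes as dialogue and splits words on whitespace, which may differ from what the benchmark's evaluator actually does. The scoring formulas behind the 99.9840% and 0.0000% results are not given here, so they are not reproduced.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue words, total words, ratio) for a passage.

    Assumption: dialogue is any text enclosed in straight double quotes,
    and a word is any whitespace-separated token.
    """
    words_total = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    words_dialogue = sum(len(span.split()) for span in quoted_spans)
    return words_dialogue, words_total, words_dialogue / words_total


# Small illustration on a made-up sentence (not the benchmark text).
sample = '"Start the car," Bill said. "Now."'
sample_dialogue, sample_total, sample_ratio = dialogue_ratio(sample)
print(sample_dialogue, sample_total, round(sample_ratio, 4))

# Arithmetic on the counts reported in the table.
words_dialogue, words_total, target_pct = 356, 502, 50.0
ratio_pct = 100 * words_dialogue / words_total
deviation_pct = abs(ratio_pct - target_pct)
print(f"Ratio: {ratio_pct:.2f}%, Deviation: {deviation_pct:.2f}%")

# The final figure is numerically equal to the mean of the two results.
mean_result = (99.9840 + 0.0000) / 2
print(f"{mean_result:.4f}%")
```

With 356 dialogue words out of 502, the response overshoots the 50% target by about 21 percentage points, which is why the ratio test scores zero even though the word count is almost exactly on target.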