NC Bench
Model
Claude 3.7 Sonnet
Test
Dialogue tags
Scenario
Write 500 words with 70% dialogue
Run ID
2025-02-24-01JMX29NYXBW88Y68JZ18ADGWP
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The coffee shop was quiet, save for the gentle hum of the espresso machine and the occasional clink of a spoon against ceramic. Molly drummed her fingers on the table, her eyes darting from the door to her watch and back again. Bill finally sauntered in, fifteen minutes late, sliding into the seat across from her with a broad smile.

"You're late," Molly hissed.

"Sorry, traffic was murder. Plus, I was casing the neighborhood. You know, doing actual preparation for the job." Bill reached over and stole a bite of Molly's muffin.

Molly swatted his hand away. "So? What's the situation with the Henderson kid's birthday party?"

"Goldmine." Bill leaned forward, lowering his voice. "We're talking at least thirty presents. High-end stuff too. These are Westside parents trying to outdo each other. Gaming consoles, tablets, designer clothes. One kid's bringing a drone."

"A drone? For a six-year-old?"

"Seven. And yes. Rich people, Mol. Different species entirely."

Molly sipped her coffee, considering. "Security?"

"Minimal. Just the usual suburban setup. Back door will be unlocked during the party with all the kids running in and out to the bounce house."

"There's a bounce house?"

"Focus, Molly. Yes, there's a bounce house. There's also a cotton candy machine, a magician, and a small petting zoo."

"Jesus," Molly whispered. "For a seven-year-old?"

"I told you. Different species." Bill grabbed another piece of her muffin. "The presents will be in the living room. Parents usually dump gifts on a table and then go drink mimosas in the kitchen."

"How do you know so much about children's parties?"

"I have nephews. Lots of nephews."

Molly narrowed her eyes. "You don't have any siblings."

"Fine. I watched YouTube videos of kid's parties. Research, Mol. It's called research."

"So what's the plan? We can't just walk in."

Bill grinned. "Actually, we can. I've got us jobs as part of the entertainment."

"What? No. No, no, no."

"Yes, yes, yes. You're now a face painter, and I'm your assistant."

Molly leaned back in her chair. "I can't paint faces."

"You've got two days to learn. I've got us the outfits and everything."

"This is ridiculous. We're not actual clowns, Bill."

"Never said clowns. Just... colorful assistants to the party. We do our bit, locate the presents, and when everyone's watching the magician pull a rabbit out of something, we make our move."

"And how exactly do we get thirty presents out unnoticed?"

"Already handled. I've got a catering van. We're delivering the cake, remember?"

Molly shook her head. "This is insane. We're stealing from children."

"Rich children who'll get replacement gifts within twenty-four hours. Come on, Mol. Easy money."

"Fine. But I'm not wearing a red nose."

"Wouldn't dream of it." Bill stole the last piece of her muffin. "Although the rainbow wig is non-negotiable."

Molly glared at him. "I hate you."

"You love me. And you'll love your cut of the haul even more."
id
"gen-1740437248-FKbxNuZ1BVokJtbrSQjk"
upstream_id
"msg_01FrGGDC7d6Q5B9fjmCJGwmX"
total_cost
0.01239
cache_discount
(null)
provider_name
"Anthropic"
created_at
"2025-02-24T22:47:42.746765+00:00"
Monday, February 24th, 2025 at 10:47:42 PM GMT+00:00
model
"anthropic/claude-3-7-sonnet-20250219:beta"
app_id
182717
streamed
true
cancelled
false
latency
537
moderation_latency
(null)
generation_time
13510
tokens_prompt
68
tokens_completion
695
native_tokens_prompt
75
native_tokens_completion
811
native_tokens_reasoning
0
num_media_prompt
(null)
num_media_completion
(null)
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.01239
Evaluation details
Result | Evaluator | Details | Meta Data
97.6276% | Matches word count | n/a | neededClean: false; words: 493
33.9998% | Dialogue to Total Word Ratio | Ratio: 74.65%, Deviation: 4.65% | neededClean: false; wordsTotal: 493; wordsDialogue: 368
65.8137%
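The reported ratio and deviation can be reproduced from the word counts above. The sketch below is an illustration only, not the benchmark's own evaluator: the function names `dialogue_ratio` and `deviation_from_target` are hypothetical, and it assumes that dialogue means any text between pairs of straight double quotes and that words are whitespace-separated tokens. The page does not say how the per-evaluator scores (97.6276% and 33.9998%) are derived, so those are not recomputed here. The final figure of 65.8137% does happen to equal the unweighted mean of the two scores.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for a prose passage.

    Assumption: dialogue is whatever sits between pairs of straight
    double quotes, and a word is any whitespace-separated token.
    """
    total_words = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


def deviation_from_target(ratio: float, target: float = 0.70) -> float:
    """Absolute distance between the measured ratio and the requested one."""
    return abs(ratio - target)


# Counts taken from the evaluation table: 368 dialogue words out of 493.
reported_ratio = 368 / 493
print(f"Ratio: {reported_ratio:.2%}")                             # Ratio: 74.65%
print(f"Deviation: {deviation_from_target(reported_ratio):.2%}")  # Deviation: 4.65%

# The last figure on the page matches the mean of the two evaluator scores.
print(round((97.6276 + 33.9998) / 2, 4))                          # 65.8137

# Tiny usage example on a made-up sentence (3 quoted words out of 8).
sample = 'Bill sat down. "You are late," Molly said.'
print(dialogue_ratio(sample))                                     # (3, 8, 0.375)
```

A counter like this is sensitive to quoting style: curly quotes or unbalanced quotation marks would need extra handling before the counts could be trusted.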