NC Bench
Model: Claude 3.7 Sonnet
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Run ID: 2025-02-24-01JMX17TDD6K5BBWT6E8SFP6QE
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
"So, explain the plan to me one more time," Bill said, rubbing his temples. "We're going to rob a kid's birthday party? Like, with actual children present?"
Molly rolled her eyes, adjusting her black beanie in the rearview mirror. "Not just any kid's birthday party. Little Timmy Wellington's eighth birthday extravaganza. His father is that tech CEO who gives his kid solid gold Legos."
"And how exactly do you know this?"
"Facebook event. Public. Amateur move." Molly flipped through photos on her phone. "Look at last year's haul. Designer clothes, three gaming consoles, an actual miniature Tesla for children."
Bill whistled. "That's... impressive."
"Rich people, man." Molly tossed her phone onto the dashboard. "So here's how we play it. The party's at Wellington Manor from two to five. We go in at three-thirty, right when they're doing the scavenger hunt in the garden."
"As what? Clowns? Because I'm not doing the clown thing again after Cincinnati."
"No, you idiot. As caterers. I already swiped the uniforms." She patted the garment bag on the backseat. "The present table is in the east wing. We go in through the service entrance, load everything into our catering bins, and walk right out."
Bill squinted suspiciously. "That sounds way too simple."
"Sometimes simple works. Besides, who suspects theft during a children's party? Everyone's too busy making sure little Timmy doesn't drown in the chocolate fountain."
"There's a chocolate fountain?"
"Focus, Bill!"
Bill sighed. "What about security?"
"Two guards. One stays at the main gate, the other patrols. But according to the schedule, he takes a smoke break at 3:45 behind the pool house."
"You have the guard's schedule? How did you—"
"Don't ask questions you don't want answers to," Molly interrupted. "Oh, and we'll need to be quick about it. They're bringing in the cake at four, which means everyone comes back inside."
"Fine. But I'm not stealing any presents that look homemade or sentimental. I have standards."
Molly laughed. "Trust me, nothing at this party was made with love. The gift registry included a junior Rolex."
"A junior Rolex? Is that even a thing?"
"It is when your daddy owns half of Silicon Valley."
Bill leaned back against the headrest. "What if we get caught?"
"We say we're rearranging the presents for the big reveal. Rich people love when the help shows initiative."
"And if that doesn't work?"
"Then we run like hell." Molly started the engine. "Look, are you in or out? Because the Wellington party represents approximately sixty thousand dollars in resalable merchandise sitting on a table guarded by exactly nobody."
Bill pulled on his gloves. "I'm in. But if there's a clown there, I'm leaving immediately."
"The only clowns will be those parents spending college funds on toys their kid will forget by Tuesday." Molly pulled away from the curb. "Now put on your catering outfit. You're about to become Jeeves the hors d'oeuvre guy."
"I hate this job sometimes."
"No, you don't."
"No," Bill admitted with a grin. "I really don't."
id: "gen-1740436138-ugTlVsIhHNQGaPAJmKWY"
upstream_id: "msg_019bwtW5raX9EMLA4UYpbhT2"
total_cost: 0.01218
cache_discount: (null)
provider_name: "Anthropic"
created_at: "2025-02-24T22:29:14.413+00:00" (Monday, February 24th, 2025 at 10:29:14 PM GMT+00:00)
model: "anthropic/claude-3-7-sonnet-20250219:beta"
app_id: 182717
streamed: true
cancelled: false
latency: 296
moderation_latency: (null)
generation_time: 14999
tokens_prompt: 68
tokens_completion: 686
native_tokens_prompt: 75
native_tokens_completion: 797
native_tokens_reasoning: 0
num_media_prompt: (null)
num_media_completion: (null)
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.01218
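The cost and timing fields in the metadata can be sanity-checked with a few lines of arithmetic. The sketch below is not part of the benchmark. It assumes list prices of 3 USD per million input tokens and 15 USD per million output tokens for this model, that billing uses the native token counts, and that generation_time is reported in milliseconds. None of these is stated on the page.

```python
# Assumed list prices in USD per million tokens (not stated on this page).
INPUT_PRICE_PER_MTOK = 3.00
OUTPUT_PRICE_PER_MTOK = 15.00


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Return the estimated request cost in USD under the assumed prices."""
    total = (prompt_tokens * INPUT_PRICE_PER_MTOK
             + completion_tokens * OUTPUT_PRICE_PER_MTOK)
    return total / 1_000_000


# Values copied from the run metadata.
native_tokens_prompt = 75
native_tokens_completion = 797
generation_time_ms = 14999  # assumption: the unit is milliseconds

cost = estimate_cost(native_tokens_prompt, native_tokens_completion)
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(f"estimated cost: {cost:.5f} USD")
print(f"throughput: {tokens_per_second:.1f} tokens/s")
```

Under these assumptions the estimate agrees with the reported total_cost of 0.01218, which suggests that the native token counts, rather than tokens_prompt and tokens_completion, drive billing for this run.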
Evaluation details
| Result | Evaluator | Details | Meta Data |
| --- | --- | --- | --- |
| 99.9190% | Matches word count | n/a | neededClean: false; words: 503 |
| 0.0015% | Dialogue to Total Word Ratio | Ratio: 84.89%, Deviation: 14.89% | neededClean: false; wordsTotal: 503; wordsDialogue: 427 |
49.9603%
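The ratio and deviation reported in the evaluation details follow directly from wordsDialogue, wordsTotal, and the 70% target in the scenario. The sketch below reproduces that arithmetic and shows one plausible way to measure a dialogue ratio. The helper names count_words and dialogue_ratio are hypothetical, and the rule "words inside straight double quotes count as dialogue" is an assumption, not the evaluator's documented method. The page does not say how the percentage scores are derived from these figures, so the sketch does not attempt it.

```python
import re


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (an assumed definition of 'word')."""
    return len(text.split())


def dialogue_ratio(text: str) -> float:
    """Fraction of words that fall inside straight double quotes."""
    spoken = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(count_words(segment) for segment in spoken)
    return dialogue_words / count_words(text)


# Small excerpt from the response: 5 of its 7 words are dialogue.
sample = '"Focus, Bill!" Bill sighed. "What about security?"'
sample_ratio = dialogue_ratio(sample)

# Figures copied from the evaluation details.
words_total = 503
words_dialogue = 427
target = 0.70  # from the scenario: 70% dialogue

ratio = words_dialogue / words_total
deviation = ratio - target
print(f"ratio: {ratio:.2%}, deviation: {deviation:.2%}")

# The unlabeled closing figure is close to the mean of the two results.
mean_result = (99.9190 + 0.0015) / 2
print(f"mean of the two results: {mean_result:.3f}%")
```

The closing 49.9603% carries no label on the page. It is close to the arithmetic mean of the two results, so reading it as an overall score for the run is a reasonable guess, but it remains an assumption.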