NC Bench
Model: Claude 3.7 Sonnet
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Run ID: 2025-02-24-01JMX0R1SDAME7BVB9STBCX6ZC
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The moonlight glinted off chrome bumpers in the vacant parking lot. Molly drummed her fingers against the steering wheel, her eyes fixed on the brightly decorated community center across the street. A banner fluttered in the evening breeze: "Happy 8th Birthday, Tyler!"
Bill slouched in the passenger seat, fiddling with a party hat he'd swiped from a dollar store. "Remind me again why we're targeting a kid's birthday party?"
"Because," Molly said, reaching into the glove compartment for her binoculars, "parents go all out these days. Designer clothes, electronics, gift cards. One good haul from a rich kid's party equals a month of convenience store jobs."
Bill snorted. "Still feels low, even for us."
"Lower than the time you stole that seeing-eye dog?"
"That was an accident! I thought it was just a regular dog with a fancy leash."
Molly raised the binoculars to her eyes, scanning the venue. "Look at that pile forming already. Video games, something that looks like a drone box... bingo."
Bill leaned over, his breath smelling of cheese puffs. "What's the plan? We can't exactly waltz in there. We don't have a kid to drop off."
"We're not dropping off anything," Molly said, lowering the binoculars. "We're picking up. Parents don't know each other at these things. We slip in during cake time claiming we have to leave early and need to grab our present for Tyler."
"And if someone questions us?"
"We say we're Aunt Susan's neighbors. Nobody knows Aunt Susan."
Bill considered this, twirling the party hat on his index finger. "There's always an Aunt Susan."
"Exactly."
"What if there isn't an Aunt Susan at this particular party?"
Molly shot him an exasperated look. "Then we'll say Uncle Bob or Cousin Karen or whoever. The point is, birthday parties are chaos. Nobody's keeping track."
A minivan pulled into the lot, disgorging a harried mother and a child clutching a wrapped box with a massive bow.
"Clock's ticking," Bill observed. "Lots of witnesses showing up."
Molly reached into the backseat for her bag. "That's why we go in separate. I'll case the gift table, you scout the exits. Twenty minutes tops."
"I still think this is weird," Bill muttered, but he was already reaching for his jacket.
"Weirder than the retirement home bingo night?"
"That was different. Those old folks were gambling addicts."
Molly checked her reflection in the rearview mirror. "Look, we agreed after the museum fiasco that we'd start smaller. This is smaller."
"It's a kid's birthday party."
"It's a learning opportunity." She handed him a wrapped box. "Our cover gift. It's empty, obviously."
Bill weighed it in his hand. "What if we get caught?"
Molly grinned, reaching for the door handle. "Then we sing 'Happy Birthday' really loud and run like hell."
The party lights flickered across their faces as they stepped out into the night, two shadows moving toward the laughter and music emanating from the community center.
id: "gen-1740435622-7ldc4a4kkMwGv9Er6tMS"
upstream_id: "msg_011VYw2fdWuch23FXgQSz1s5"
total_cost: 0.01149
cache_discount: (null)
provider_name: "Anthropic"
created_at: "2025-02-24T22:20:35.948973+00:00" (Monday, February 24th, 2025 at 10:20:35 PM GMT+00:00)
model: "anthropic/claude-3-7-sonnet-20250219:beta"
app_id: 182717
streamed: true
cancelled: false
latency: 303
moderation_latency: (null)
generation_time: 13225
tokens_prompt: 68
tokens_completion: 663
native_tokens_prompt: 75
native_tokens_completion: 751
native_tokens_reasoning: 0
num_media_prompt: (null)
num_media_completion: (null)
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.01149
Evaluation details
Evaluator: Matches word count
Result: 86.3803%
Details: n/a
Meta Data: neededClean: false; words: 489

Evaluator: Dialogue to Total Word Ratio
Result: 0.0000%
Details: Ratio: 58.49%, Deviation: 28.49%
Meta Data: neededClean: false; wordsTotal: 489; wordsDialogue: 286

Result: 43.1902%
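The figures in the evaluation details can be cross-checked with a little arithmetic. The sketch below is illustrative only and is not NC Bench's evaluator: the function `dialogue_stats` and its quote-matching rule (any text between straight double quotes counts as dialogue) are assumptions, and the benchmark's own word counting and scoring formulas are not shown on this page. The second half of the block uses only the reported numbers: 286 dialogue words out of 489 gives the stated ratio and deviation from the 30% target, and the final 43.1902% is consistent with the mean of the two evaluator results.

```python
import re


def dialogue_stats(text: str, target_ratio: float = 0.30) -> dict:
    """Hypothetical helper: count words inside straight double quotes.

    Assumption: dialogue is any span between a pair of straight double
    quotes, and words are whitespace-separated tokens. The real evaluator
    may tokenize differently.
    """
    total_words = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue_words / total_words if total_words else 0.0
    return {
        "wordsTotal": total_words,
        "wordsDialogue": dialogue_words,
        "ratio": ratio,
        "deviation": abs(ratio - target_ratio),
    }


# Cross-check using only the numbers reported on this page.
words_total = 489
words_dialogue = 286
ratio_pct = words_dialogue / words_total * 100   # about 58.49
deviation_pct = abs(ratio_pct - 30.0)            # about 28.49

# The final result matches the mean of the two evaluator results.
evaluator_results = [86.3803, 0.0]
mean_result = sum(evaluator_results) / len(evaluator_results)

print(f"ratio={ratio_pct:.2f}% deviation={deviation_pct:.2f}% mean={mean_result:.4f}%")
```

Running `dialogue_stats` on the model response above may not reproduce 489 and 286 exactly, since the evaluator's tokenization rules are unknown.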