NC Bench
Model: Claude Sonnet 4.5
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Run ID: 2026-02-08-01KGYZ6BPBEAD47P8TABYAW1YE
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly pressed her back against the bouncy castle, peering around the inflatable turret at the mountain of wrapped presents stacked on the picnic table. "This is either our greatest job or our stupidest," Bill whispered, adjusting his fake mustache. "I can't decide which." "Greatest. Definitely greatest," Molly said. "Look at that haul. Those boxes are huge." "They're also surrounded by thirty screaming six-year-olds hopped up on cake and juice boxes." Molly pulled out a crumpled party invitation from her pocket, studying it like a tactical map. "The present opening is scheduled for three-fifteen. We've got twelve minutes." "Where did you even get that?" Bill asked. "Found it in a trash can outside the community center. Some kid named Brandon is having the birthday of a lifetime, and we're about to make it memorable for all the wrong reasons." Bill wiped sweat from his forehead. The afternoon sun was brutal, and his polyester clown costume wasn't helping. "Remind me why we're dressed like this again?" "Camouflage," Molly said, straightening her rainbow wig. "We blend right in. Parents hired entertainment. We are entertainment." "We don't have any entertainment skills." "Sure we do. We're about to make all these presents disappear. That's magic." A soccer ball bounced off Bill's oversized shoe. A small girl with pigtails ran up, stared at him, then burst into tears and ran away. "Great. We're terrifying children now," Bill muttered. "Focus," Molly hissed. "The mom just went inside. Probably getting more pizza. This is our window." "What about the dad?" Molly nodded toward a man in cargo shorts who had fallen asleep in a lawn chair, a beer balanced precariously on his stomach. "Out cold. All that supervision wore him out." "Okay, so what's the actual plan?" Bill asked. "You create a distraction. I'll grab the presents, load them into our van." "What kind of distraction?" "I don't know. You're a clown. Be clowny." "I hate you so much right now," Bill said. 
"You'll hate me less when we fence those presents. I saw at least three iPads in that pile." "How could you possibly know that?" "Gift bags from the Apple store, genius." Molly cracked her knuckles. "On three. One, two—" "Wait! What if they're not actually expensive presents? What if they're just big boxes?" Molly stared at him. "Bill, this is a party in the rich neighborhood. There's a bouncy castle shaped like a dragon. The cake has four tiers. Trust me." "I'm just saying, we could be risking arrest for a bunch of educational toys and sensible pajamas." "Do you want to stay poor forever?" "No." "Then move."
created_at: "2026-02-08T15:51:29.306139+00:00" (Sunday, February 8th, 2026 at 3:51:29 PM GMT+00:00)
model: "anthropic/claude-4.5-sonnet-20250929"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1560
moderation_latency: 143
generation_time: 15351
tokens_prompt: 61
tokens_completion: 591
native_tokens_prompt: 75
native_tokens_completion: 689
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.01056
router: (null)
provider_responses:
  0:
    id: "8ac4c26d-51b8-4032-b4a1-1912c474e81f"
    status: 200
    is_byok: false
    latency: 1559
    endpoint_id: "d8bee848-f020-454a-b2be-4a7ce868f99f"
    provider_name: "Amazon Bedrock"
    model_permaslug: "anthropic/claude-4.5-sonnet-20250929"
    api_type: "completions"
id: "gen-1770565873-p91fyukhKVAKXugzGXBl"
upstream_id: "8ac4c26d-51b8-4032-b4a1-1912c474e81f"
total_cost: 0.01056
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
Evaluation details
Result | Evaluator | Details | Meta Data
0.0000% | Matches word count | n/a | neededClean: false; words: 436
0.0000% | Dialogue to Total Word Ratio | Ratio: 68.35%, Deviation: 18.35% | neededClean: false; wordsTotal: 436; wordsDialogue: 298
0.0000%
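The ratio and deviation in the evaluation details follow directly from the reported word counts. The sketch below shows one way to reproduce them. The helper names, the whitespace-based word split, and the rule that dialogue is any text between straight double quotes are illustrative assumptions, not the evaluator's actual implementation. How the evaluator maps a deviation to the 0.0000% result is not shown on the page, so it is not modelled here.

```python
import re


def count_words(text: str) -> int:
    """Count whitespace-separated words (assumed tokenization)."""
    return len(text.split())


def count_dialogue_words(text: str) -> int:
    """Count words inside straight double quotes (assumed dialogue rule)."""
    quoted_spans = re.findall(r'"([^"]*)"', text)
    return sum(count_words(span) for span in quoted_spans)


def ratio_and_deviation(words_dialogue: int, words_total: int, target: float = 0.5):
    """Return the dialogue ratio and its absolute distance from the target."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


# Figures reported in the evaluation details.
ratio, deviation = ratio_and_deviation(words_dialogue=298, words_total=436)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# Word-count shortfall against the requested 500 words.
shortfall = 500 - 436
print(f"Shortfall: {shortfall} words ({shortfall / 500:.1%} under target)")
```

With the reported counts this prints a ratio of 68.35% and a deviation of 18.35%, matching the Details column, and a shortfall of 64 words (12.8%) against the 500-word target.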