NC Bench
Model
Claude Sonnet 4.5
Test
Dialogue tags
Scenario
Write 500 words with 70% dialogue
Run ID
2026-02-08-01KGYZ4Q60Z66VNKZJGHGT0Q8V
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
AI
Molly crouched behind the minivan and peered around the bumper at the bouncy castle in the backyard.
"This is ridiculous," Bill whispered, adjusting his fake mustache. "We're really doing this?"
"The intel is solid. That kid's parents are loaded. Those presents are worth at least five grand, maybe more."
"It's a six-year-old's birthday party, Molly."
"So what? You suddenly grow a conscience?" Molly pulled out her phone and scrolled through photos. "Look at this registry. A junior Tesla. An actual pony experience package. Jewelry for a child."
"But there's a clown."
"What?"
"There's a clown in there. I can see him through the fence. You know how I feel about clowns."
Molly grabbed Bill's collar. "We've stolen from museums, mansions, and that one time, a moving train. You're telling me a guy in face paint scares you?"
"It's the shoes. Those shoes are unnatural."
"Focus! Here's the plan. I'll go in through the side gate pretending to be someone's aunt. You create a distraction at the front."
"What kind of distraction?"
"I don't know. Sing Happy Birthday really loud. Release a dog. Use your imagination."
Bill frowned. "Do I look like someone who has a dog in their pocket?"
"You literally have lock picks in your sock. Figure it out." Molly checked her watch. "We've got a ten-minute window during cake time. That's when everyone will be inside."
"Won't they notice a strange woman hauling bags of presents out the gate?"
"That's why you're the distraction. Keep them busy for three minutes. That's all I need."
"Three minutes with a clown."
"Bill!"
"Fine, fine. But if he makes balloon animals, I'm out."
"Deal. Now give me your jacket."
"Why?"
"I need to look more mom-like. That leather screams criminal."
Bill reluctantly handed over his jacket. "This is Italian leather. If you get cake on it, we're done."
"We're stealing from a child's party. I think we crossed the line of moral superiority already." Molly shrugged on the jacket. "How do I look?"
"Like someone who definitely doesn't belong at a kid's party."
"Perfect. That describes literally every adult at these things." Molly started toward the gate, then turned back. "Remember, three minutes. And Bill?"
"Yeah?"
"Try not to punch the clown."
"No promises."
Molly disappeared around the corner. Bill took a deep breath, straightened his fake mustache, and wondered for the hundredth time why he ever became a thief. A child shrieked with laughter from the backyard. The clown honked a horn. Bill really hated his job sometimes. He pulled out his phone and found a dog barking sound effect. Close enough.
created_at
"2026-02-08T15:50:34.164873+00:00"
Sunday, February 8th, 2026 at 3:50:34 PM GMT+00:00
model
"anthropic/claude-4.5-sonnet-20250929"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1814
moderation_latency
138
generation_time
13960
tokens_prompt
61
tokens_completion
589
native_tokens_prompt
75
native_tokens_completion
689
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.01056
router
(null)
provider_responses
0
id
"bae9d70b-cd02-4fb0-ae14-f043dc1097c2"
status
200
is_byok
false
latency
1814
endpoint_id
"d8bee848-f020-454a-b2be-4a7ce868f99f"
provider_name
"Amazon Bedrock"
model_permaslug
"anthropic/claude-4.5-sonnet-20250929"
api_type
"completions"
id
"gen-1770565819-EycKaqrwLMkLdfh2Uurg"
upstream_id
"bae9d70b-cd02-4fb0-ae14-f043dc1097c2"
total_cost
0.01056
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Amazon Bedrock"
Evaluation details
Result | Evaluator | Details | Meta Data
0.0000% | Matches word count | n/a | neededClean: false; words: 435
74.7277% | Dialogue to Total Word Ratio | Ratio: 72.41%, Deviation: 2.41% | neededClean: false; wordsTotal: 435; wordsDialogue: 315
37.3638%
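The ratio and deviation reported for the dialogue evaluator can be reproduced from the recorded word counts (315 dialogue words out of 435 total, against the 70% target in the scenario). The sketch below is an illustration only: the function names are hypothetical, and the quote-matching and whitespace-splitting rules are assumptions about how such an evaluator might count words, not the benchmark's actual implementation. It does not attempt to reproduce the percentage scores, since the scoring formula is not shown on this page.

```python
import re

TARGET_DIALOGUE_PERCENT = 70.0  # from the scenario: "70% dialogue"


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (assumed word definition)."""
    return len(text.split())


def dialogue_words(text: str) -> int:
    """Count words inside double-quoted spans (straight or curly quotes).

    Assumption: dialogue is whatever sits between a pair of double quotes.
    """
    spans = re.findall(r'["“]([^"”]*)["”]', text)
    return sum(count_words(span) for span in spans)


def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Return dialogue words as a percentage of total words."""
    return 100.0 * words_dialogue / words_total


# Reproduce the reported figures from the recorded counts.
ratio = dialogue_ratio(315, 435)
deviation = abs(ratio - TARGET_DIALOGUE_PERCENT)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
# prints "Ratio: 72.41%, Deviation: 2.41%"
```

On a short sample such as `Molly sighed. "Focus! Here is the plan." Bill nodded.`, `dialogue_words` returns 5 and `count_words` returns 9 under these assumptions.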