NC Bench
Model
Claude Sonnet 4.5
Test
Dialogue tags
Scenario
Write 500 words with 50% dialogue
Run ID
2026-02-08-01KGYYCZ5GFW2N9DFA2M9B916B
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly pressed her face against the fence, peering through the knothole at the bouncing castle and screaming children beyond.
"Are you absolutely certain about this?" Bill asked, adjusting his fake mustache for the third time.
"When have I ever steered you wrong?"
"Last month. The poodle show incident."
"That was ninety percent your fault," Molly said, pulling back from the fence. "Besides, my intel says there's at least two grand worth of toys in there. Maybe three if the birthday girl's grandparents showed up."
Bill shifted his weight, causing the enormous teddy bear costume head tucked under his arm to slip. He caught it before it hit the ground. "Remind me why I have to be the entertainment?"
"Because I'm coordinating the extraction, obviously."
"You're sitting in the van."
"Strategic positioning," Molly said, tapping her temple. She checked her watch. "The real Fuzzy Wuzzy should be stuck in traffic for another fifteen minutes. That's your window."
"I can't believe parents pay two hundred dollars for someone to dance around in a bear suit."
"Focus, Bill. You get in there, do your little dance, and while the kids are distracted, I'll back the van up to the present table."
Bill pulled the bear head on, his voice now muffled. "This smells like someone died in here."
"Probably someone's dignity. Now go."
She watched him waddle toward the back gate, his furry posterior swaying with each step. He looked back once, gave her a thumbs up, and disappeared around the corner.
Molly jogged back to the van and climbed into the driver's seat, adjusting the rearview mirror to watch the yard. Through the fence slats, she could see Bill emerging into the party space. The children's screams reached a new pitch.
Her phone buzzed. Bill's text read: "They love me."
"Stay focused," she typed back.
Three minutes passed. Four. She started the engine and began slowly reversing toward the fence, watching the mirror carefully.
Another text: "Kid just threw up on my foot."
"Occupational hazard. Status?"
"Parents are all watching me. Present table is clear. But..."
"But what?"
"I feel bad."
Molly's head hit the steering wheel. "Don't you dare go soft on me, Bill."
"There's a kid in a wheelchair. Her present is wrapped in rainbow paper. I can see it from here."
"Bill, we've been over this."
"She looks really happy, Molly."
Through the fence, Molly could see Bill's ridiculous bear form swaying to the Hokey Pokey. The children were jumping around him like puppies.
"Fine," she texted. "Rainbow paper stays. Grab everything else."
"Really?"
"I'm not a monster. But Bill?"
"Yeah?"
"You're doing the next three jobs solo for making me feel feelings."
"Deal. Wait, the real Fuzzy Wuzzy just pulled up."
"Abort! Abort!"
Bill came sprinting around the corner, bear head under his arm, empty-handed and running like his life depended on it.
created_at: "2026-02-08T15:37:38.126872+00:00" (Sunday, February 8th, 2026 at 3:37:38 PM GMT+00:00)
model: "anthropic/claude-4.5-sonnet-20250929"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1468
moderation_latency: 376
generation_time: 15689
tokens_prompt: 61
tokens_completion: 644
native_tokens_prompt: 75
native_tokens_completion: 739
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.01131
router: (null)
provider_responses:
  0:
    id: "3b0bcf77-552b-4e36-aabc-4e46ed0594d2"
    status: 200
    is_byok: false
    latency: 1468
    endpoint_id: "d8bee848-f020-454a-b2be-4a7ce868f99f"
    provider_name: "Amazon Bedrock"
    model_permaslug: "anthropic/claude-4.5-sonnet-20250929"
    api_type: "completions"
id: "gen-1770565041-tVLfCYVm7OucLhTqncsj"
upstream_id: "3b0bcf77-552b-4e36-aabc-4e46ed0594d2"
total_cost: 0.01131
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
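The timing and token fields above can be combined into a rough throughput figure. The sketch below assumes generation_time is reported in milliseconds, which the page does not state, and the function name tokens_per_second is hypothetical, introduced only for illustration.

```python
def tokens_per_second(tokens: int, generation_time_ms: int) -> float:
    """Rough output throughput; assumes the time is in milliseconds."""
    if generation_time_ms <= 0:
        raise ValueError("generation_time_ms must be positive")
    return tokens / (generation_time_ms / 1000)


# Values copied from the metadata above.
native_rate = tokens_per_second(739, 15689)       # native_tokens_completion
normalized_rate = tokens_per_second(644, 15689)   # tokens_completion

print(f"native: {native_rate:.1f} tok/s, normalized: {normalized_rate:.1f} tok/s")
```

If the unit assumption is wrong (for example, if the field is in seconds), the result is off by a factor of 1000, so treat the figure as indicative only.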
Evaluation details
Result | Evaluator | Details | Meta Data
3.6234% | Matches word count | n/a | neededClean: false; words: 476
89.7510% | Dialogue to Total Word Ratio | Ratio: 51.47%, Deviation: 1.47% | neededClean: false; wordsTotal: 476; wordsDialogue: 245
46.6872%
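The reported ratio and deviation follow directly from the word counts in the meta data (245 dialogue words out of 476). The sketch below shows one way such a ratio could be computed. It is not NC Bench's actual evaluator: it assumes dialogue is any text between straight double quotes and that words are whitespace-separated, and the names dialogue_ratio and deviation_from_target are hypothetical.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for a scene.

    Assumption: dialogue is whatever sits between straight double quotes.
    """
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    total_words = len(text.split())
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


def deviation_from_target(ratio: float, target: float = 0.50) -> float:
    """Absolute distance between the measured ratio and the target."""
    return abs(ratio - target)


# Tiny example: 2 of 7 words are inside quotes.
sample = 'Molly sighed. "Stay focused," she typed back.'
print(dialogue_ratio(sample))

# Reproduce the reported figures from the counts in the table.
reported = 245 / 476
print(f"Ratio: {reported:.2%}, Deviation: {deviation_from_target(reported):.2%}")
```

A quote-based counter like this will disagree with the reported counts whenever the evaluator tokenizes differently (curly quotes, hyphenated words, attributions), so only the final arithmetic on 245 and 476 is grounded in the page.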