NC Bench
Model: Claude Sonnet 4.5
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Run ID: 2026-02-08-01KGYWXKR824ZERR2GQ4S6KANX
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly pressed her binoculars against the window of the van, squinting at the suburban house across the street. Streamers and balloons covered every surface. "I count at least thirty kids," she said.
"Thirty witnesses," Bill corrected, unwrapping his third granola bar of the hour. "You sure about this?"
"Are you seriously getting cold feet about robbing a six-year-old?"
"When you say it like that, it sounds bad."
"It is bad, Bill. That's literally what we do. Bad things."
Bill crumpled the wrapper and tossed it into the back of the van. "Yeah, but we usually rob bad people. Drug dealers. Corrupt politicians. That guy who ran the puppy mill."
"Tommy Fitzgerald's parents are hedge fund managers."
"The birthday boy is named Tommy Fitzgerald?"
"I know, right? He was basically born to be robbed."
"Still doesn't feel great."
Molly lowered the binoculars and turned to face him. "Bill, do you remember what you said when we started this partnership?"
"Don't eat tuna in the van?"
"Before that."
"Always check for security cameras?"
"Bill."
"Oh! No honor among thieves?"
"You said you were tired of being poor and having principles was expensive."
Bill sighed. "I also said that while drunk on boxed wine."
"A verbal contract is still a contract."
"That's not how that works."
Molly pointed out the window. "Look at that house. Four car garage. In-ground pool. There's probably ten grand worth of presents in there, minimum. Meanwhile, we're eating generic granola bars in a van that smells like old cheese."
"That's your shoes."
"My point stands."
Bill rubbed his temples. "What's the plan?"
"See that bounce house?"
"The castle one?"
"You're going to hide in there. When the parents bring out the cake, I'll pull the fire alarm. In the chaos, you grab the presents and pass them through the bounce house exit. I'll be waiting with the wagon."
"What wagon?"
"The red one I borrowed from the neighbor kid."
"You stole a wagon to help us steal from a children's party?"
"Borrowed."
"Molly."
"Focus, Bill. Are you in or out?"
Bill stared at the house for a long moment, watching tiny children run in circles, hopped up on juice boxes and cake. A clown was making balloon animals near the driveway. "Is that guy union?" Bill asked.
"What?"
"The clown. If he's non-union, I'm definitely in. I hate scab clowns."
"I'm not asking about his labor credentials."
"You should. It's important to have standards."
"Says the man about to rob a first grader."
"Tommy's six, you said."
"That's not better, Bill."
"Fine. I'm in. But we're donating five percent to charity."
"Two percent."
"Deal."
They shook hands and Molly grabbed her fake firefighter helmet from the dashboard.
created_at: "2026-02-08T15:11:46.716305+00:00" (Sunday, February 8th, 2026 at 3:11:46 PM GMT+00:00)
model: "anthropic/claude-4.5-sonnet-20250929"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 2996
moderation_latency: 226
generation_time: 16346
tokens_prompt: 61
tokens_completion: 618
native_tokens_prompt: 75
native_tokens_completion: 725
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0111
router: (null)
provider_responses:
0:
id: "b1eeac9b-b983-4cc5-9cfc-682da8a14ba7"
status: 200
is_byok: false
latency: 2996
endpoint_id: "d8bee848-f020-454a-b2be-4a7ce868f99f"
provider_name: "Amazon Bedrock"
model_permaslug: "anthropic/claude-4.5-sonnet-20250929"
api_type: "completions"
id: "gen-1770563489-dC8oac16aXEE5RU9ORAY"
upstream_id: "b1eeac9b-b983-4cc5-9cfc-682da8a14ba7"
total_cost: 0.0111
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
Evaluation details
Result | Evaluator | Details | Meta Data
0.0000% | Matches word count | n/a | neededClean: false, words: 453
34.4967% | Dialogue to Total Word Ratio | Ratio: 74.61%, Deviation: 4.61% | neededClean: false, wordsTotal: 453, wordsDialogue: 338
17.2484%
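The ratio and deviation reported above can be reproduced from the recorded counts. The following is a minimal sketch, not the benchmark's actual evaluator: the function name `dialogue_stats` is hypothetical, and it assumes that dialogue means any text between a pair of straight double quotes and that words are split on whitespace. The real evaluator may tokenize differently, and how a deviation is mapped to a score such as 34.4967% is not stated on this page.

```python
import re


def dialogue_stats(text: str, target_pct: float = 70.0) -> dict:
    """Count words inside straight double quotes versus all words (illustrative only)."""
    total_words = len(text.split())
    # Assumption: a dialogue span is whatever sits between two straight double quotes.
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    ratio_pct = 100.0 * dialogue_words / total_words if total_words else 0.0
    return {
        "wordsTotal": total_words,
        "wordsDialogue": dialogue_words,
        "ratio_pct": ratio_pct,
        "deviation_pct": abs(ratio_pct - target_pct),
    }


# Small usage example on a two-line excerpt in the style of the scene.
sample = 'Molly sighed. "Are you in or out?" Bill nodded. "Fine. I\'m in."'
stats = dialogue_stats(sample)

# Arithmetic check against the recorded counts: 338 dialogue words out of 453.
recorded_ratio = 100.0 * 338 / 453
recorded_deviation = abs(recorded_ratio - 70.0)
print(f"{recorded_ratio:.2f}% ratio, {recorded_deviation:.2f}% deviation")
# prints "74.61% ratio, 4.61% deviation"
```

The final figure of 17.2484% is close to (0.0000 + 34.4967) / 2, which suggests it may be the mean of the two evaluator results, although the page does not say so.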