NC Bench
Model
Claude Sonnet 4.5
Test
Dialogue tags
Scenario
Write 500 words with 50% dialogue
Run ID
2026-02-08-01KGYXBHFKBCZ5S4N1XV5MCJQP
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly pressed her back against the fence, peering through a knothole at the chaos of bouncing children beyond. Bill crouched beside her, adjusting his fake mustache.
"I still think this is our dumbest job yet," Bill whispered.
"You said that about the golf cart heist."
"That was also extremely dumb."
Molly pulled out a crumpled party invitation from her jacket. "According to this, present time is in fifteen minutes. That's our window."
"Our window to steal from children. Actual children." Bill tugged at his mustache nervously. "What happened to us? We used to have standards."
"Standards don't pay rent." Molly scanned the backyard. A clown was making balloon animals near the pool. "Besides, these aren't just any presents. Rich lawyer's kid. Probably got a new iPad, designer sneakers, maybe even jewelry."
"From who? His rich lawyer infant friends?"
"Focus, Bill."
Bill sighed. "Fine. What's the plan?"
"I go in as a caterer. You're the backup clown."
"I'm the what now?"
"Backup clown." Molly gestured at his outfit. "Hence the costume I specifically told you to wear."
Bill looked down at his striped shirt and oversized pants. "I thought we were robbing a mime convention later."
"There's no such thing as a mime convention."
"There should be. Very quiet. Organized."
Molly ignored him. "The presents are in the dining room. I grab the good stuff, pass it to you through the kitchen window, you load the van."
"What if someone notices you're not actually serving food?"
"I'll carry a tray. People are idiots."
"Reassuring." Bill scratched his fake nose. It was three sizes too large. "What about the clown? Won't he notice another clown?"
"Professional courtesy. Clowns don't question other clowns."
"That's not a real thing."
"How do you know?"
"Because nothing you just said makes sense!" Bill hissed. A balloon popped in the distance and he jumped. "This is insane. We're going to get caught and go to prison for stealing from a seven-year-old's birthday party."
"He's nine."
"Oh, well, that makes it totally fine then."
Molly checked her watch. "Look, you want to pay back your bookie or not?"
"I want to pay him back with money we stole from adults. Normal heist money."
"There's no such thing as normal heist money."
"Bank money! Casino money!" Bill's mustache was sliding off. "Not Lego money!"
"Stop being dramatic." Molly straightened her apron. "In and out. Quick and clean. Nobody gets hurt."
"Except emotionally scarred children."
"They're rich. They'll get therapy."
Bill slumped against the fence. "I hate that I'm actually considering this."
"That's the spirit." Molly grinned and squeezed his shoulder. "Ready?"
"No."
"Perfect. Let's go."
They stood simultaneously. Bill readjusted his mustache one final time and pulled out a bag of balloon animals from his pocket.
"Where did you even get those?" Molly asked.
"Dollar store. If I'm doing this, I'm committing to the bit."
"Now who's being dramatic?"
They headed toward the gate.
created_at
"2026-02-08T15:19:22.233986+00:00"
Sunday, February 8th, 2026 at 3:19:22 PM GMT+00:00
model
"anthropic/claude-4.5-sonnet-20250929"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1453
moderation_latency
230
generation_time
15511
tokens_prompt
61
tokens_completion
688
native_tokens_prompt
75
native_tokens_completion
803
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.01227
router
(null)
provider_responses
0
id
"20d88cd9-1828-44bf-80cc-32377e7bde7c"
status
200
is_byok
false
latency
1416
endpoint_id
"d8bee848-f020-454a-b2be-4a7ce868f99f"
provider_name
"Amazon Bedrock"
model_permaslug
"anthropic/claude-4.5-sonnet-20250929"
api_type
"completions"
id
"gen-1770563945-IIeWyALer3JOMLyiP4Aw"
upstream_id
"20d88cd9-1828-44bf-80cc-32377e7bde7c"
total_cost
0.01227
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Amazon Bedrock"
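The reported usage of 0.01227 can be sanity-checked against the native token counts above. The sketch below is only an illustration: the per-token prices (3.00 USD per million prompt tokens and 15.00 USD per million completion tokens) are assumptions that do not appear on this page, and `estimate_cost` is a hypothetical helper, not part of any benchmark or provider API.

```python
# Assumed list prices in USD per million tokens; these are NOT stated on this page.
PRICE_PER_MTOK_PROMPT = 3.00
PRICE_PER_MTOK_COMPLETION = 15.00


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the request cost in USD from token counts and the assumed prices."""
    micro_usd = (
        prompt_tokens * PRICE_PER_MTOK_PROMPT
        + completion_tokens * PRICE_PER_MTOK_COMPLETION
    )
    return micro_usd / 1_000_000


# native_tokens_prompt = 75 and native_tokens_completion = 803, from the metadata above.
cost = estimate_cost(75, 803)
print(round(cost, 5))  # 0.01227
```

Under these assumed prices the estimate agrees with the reported usage and total_cost, which suggests the cost was computed from the native token counts (75 and 803) rather than the normalized ones (61 and 688).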
Evaluation details
Result | Evaluator | Details | Meta Data
75.1556% | Matches word count | n/a | neededClean: false, words: 487
0.0000% | Dialogue to Total Word Ratio | Ratio: 68.99%, Deviation: 18.99% | neededClean: false, wordsTotal: 487, wordsDialogue: 336
37.5778%
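The Dialogue to Total Word Ratio figures follow from the two word counts: 336 dialogue words out of 487 total, compared against the 50% target from the scenario. The sketch below reproduces that arithmetic and shows one plausible way to count dialogue words. It assumes dialogue is delimited by straight double quotes and that words are split on whitespace; the evaluator's actual counting rules and its scoring function are not described on this page, so `dialogue_ratio` and `deviation` are hypothetical helpers.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for straight-quoted dialogue."""
    total = len(text.split())
    # Assumption: every span between a pair of straight double quotes is dialogue.
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue / total if total else 0.0
    return dialogue, total, ratio


def deviation(ratio: float, target: float = 0.5) -> float:
    """Absolute distance between the measured ratio and the target ratio."""
    return abs(ratio - target)


# Reproduce the reported figures from wordsDialogue = 336 and wordsTotal = 487.
reported_ratio = 336 / 487
print(f"Ratio: {reported_ratio:.2%}, Deviation: {deviation(reported_ratio):.2%}")
# Ratio: 68.99%, Deviation: 18.99%
```

Running `dialogue_ratio` on the response text may not return exactly 336 and 487, since the evaluator's tokenization could differ from a plain whitespace split.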