NC Bench
Model
Claude Opus 4.5
Test
Dialogue tags
Scenario
Write 500 words with 70% dialogue
Run ID
2026-02-08-01KGYYD2ZT5EPBQSH1C55NDNQQ
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the crayon-drawn map across the sticky diner table and jabbed her finger at a wobbly rectangle. "That's the bouncy castle. Our primary obstacle."
Bill squinted at the paper. "Why is it labeled 'death trap'?"
"Because I've done my research, Bill. Do you know how many adults tear their ACLs in bouncy castles every year?"
"I genuinely do not."
"Neither do I, but it feels like a lot. We go around it, not through it."
Bill stirred his coffee and leaned back in the booth. "Walk me through this again. We're stealing presents from a seven-year-old's birthday party?"
"We're liberating presents from a seven-year-old's birthday party. There's a difference."
"Is there, though?"
"The kid's name is Maximilian Worthington the Fourth. His father owns half the commercial real estate in this city, including the building where my grandmother's bakery used to be before he tripled her rent."
"So this is revenge?"
"This is justice served with a side of buttercream frosting. The party starts at two. We infiltrate at two-fifteen during the magician's act."
"There's a magician?"
"The Amazing Rodrigo. He does a twenty-minute set with a rabbit that apparently bites. Everyone will be distracted."
Bill pulled out a small notebook. "What's our cover?"
"You're a clown."
"Absolutely not."
"Bill, you have to be the clown."
"I have a degree in art history. I didn't study the Renaissance masters to wear oversized shoes."
"Your degree is why you're broke enough to be doing this with me. Put on the nose."
"What's your cover, then?"
"I'm the balloon animal lady. We're a package deal. The Worthingtons hired entertainment through an agency, so I called and said we were complimentary add-ons."
"And they believed you?"
"Rich people never question free things. They assume they deserve them."
Bill drummed his fingers on the table. "Where are the presents being kept?"
"Pool house. Separate structure, maybe forty feet from the main party area. I need you to cause a distraction at two-twenty."
"What kind of distraction?"
"Make a balloon giraffe."
"That's not very distracting."
"It is if you make it wrong and it looks inappropriate. Parents will swarm you with complaints. I slip into the pool house, load the goods into my catering cart, and wheel everything out through the service entrance."
"This feels morally complicated, Molly."
"Does it help if I tell you the kid already has a pony? A real one. Named Chancellor Buttons."
"That does help, actually."
"Good. Now, there's one security guard, but he's mostly there to keep the pony from escaping again."
"Again?"
"Chancellor Buttons has a history. Focus, Bill. We rendezvous at the van by two-thirty-five. My cousin Terrence will be driving. He owes me for the incident we don't discuss."
"You have a lot of incidents you don't discuss."
"That's what makes me interesting." Molly folded the map and slid it into her jacket. "So, are you in?"
Bill sighed heavily. "What color is my clown nose?"
created_at
"2026-02-08T15:37:47.855594+00:00"
Sunday, February 8th, 2026 at 3:37:47 PM GMT+00:00
model
"anthropic/claude-4.5-opus-20251124"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1717
moderation_latency
(null)
generation_time
21905
tokens_prompt
61
tokens_completion
667
native_tokens_prompt
75
native_tokens_completion
781
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.0199
router
(null)
provider_responses
0
id
"msg_01Bp5f1KurSLd6uEAXTLLMdG"
status
200
is_byok
false
latency
1701
endpoint_id
"be883404-eb42-4b2d-b6e4-c7daa3aa8d62"
provider_name
"Anthropic"
model_permaslug
"anthropic/claude-4.5-opus-20251124"
api_type
"completions"
id
"gen-1770565045-PuYhN0zvWWX0BIL4XsQq"
upstream_id
"msg_01Bp5f1KurSLd6uEAXTLLMdG"
total_cost
0.0199
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Anthropic"
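The timing and token fields above can be combined into a rough throughput figure. The sketch below is illustrative only: it assumes `generation_time` is reported in milliseconds and that `native_tokens_completion` is the right token count to divide by, neither of which this page states. The helper name `tokens_per_second` is hypothetical.

```python
def tokens_per_second(completion_tokens: int, generation_time_ms: float) -> float:
    """Return completion tokens per second.

    Assumes the time value is in milliseconds (an assumption, not
    something the run metadata confirms).
    """
    if generation_time_ms <= 0:
        raise ValueError("generation_time_ms must be positive")
    return completion_tokens / (generation_time_ms / 1000.0)


# Values copied from this run: native_tokens_completion and generation_time.
rate = tokens_per_second(781, 21905)
print(f"{rate:.2f} tokens/s")
```

If the provider reports time in another unit, only the division by 1000 needs to change.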
Evaluation details
| Result | Evaluator | Details | Meta Data |
|---|---|---|---|
| 99.9990% | Matches word count | n/a | neededClean: false; words: 501 |
| 0.0000% | Dialogue to Total Word Ratio | Ratio: 88.02%, Deviation: 18.02% | neededClean: false; wordsTotal: 501; wordsDialogue: 441 |
49.9995%
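The reported ratio and deviation follow directly from the two word counts: 441 / 501 ≈ 88.02%, which is 18.02 percentage points above the 70% target. A minimal sketch of that arithmetic is below, along with one plausible way to count dialogue words. The counting rule (words inside straight double quotes, split on whitespace) is an assumption; the benchmark's actual tokenization and scoring function are not shown on this page, and the function names are hypothetical.

```python
import re


def ratio_and_deviation(words_dialogue: int, words_total: int, target_pct: float = 70.0):
    """Return (ratio %, absolute deviation from target in percentage points)."""
    ratio_pct = 100.0 * words_dialogue / words_total
    return round(ratio_pct, 2), round(abs(ratio_pct - target_pct), 2)


def count_dialogue_words(text: str):
    """Count whitespace-separated words overall and inside straight double quotes.

    Assumed rule, not the evaluator's confirmed method.
    """
    words_total = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    words_dialogue = sum(len(span.split()) for span in quoted_spans)
    return words_dialogue, words_total


# Counts copied from the evaluation meta data for this run.
print(ratio_and_deviation(441, 501))
```

Because the counting rule is assumed, running `count_dialogue_words` on the response may not reproduce 441 exactly; `ratio_and_deviation` only restates the arithmetic behind the reported figures.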