NC Bench
Model: Claude Opus 4.5
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Run ID: 2026-02-08-01KGYYNQF2FT8RVQD714FKETVR
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The van smelled like stale coffee and regret, which Molly had come to recognize as the official scent of their operations. She squinted through binoculars at the suburban house across the street, where balloon animals bobbed in the afternoon breeze and the distant screams of sugar-fueled children pierced the air.
"Run it by me again," Bill said, adjusting his fake mustache in the rearview mirror. "Because I'm still not entirely clear on why we've sunk this low."
"We haven't sunk low. We've pivoted to an underserved market." Molly lowered the binoculars and pulled out a hand-drawn map that looked like it had been sketched by someone having a seizure. "The Hendersons are loaded. We're talking about a pile of presents worth at least three grand in there."
"They're also seven-year-olds, Molly. Seven-year-olds who believe in magic and the tooth fairy."
"Perfect. Then they'll think a wizard took their stuff. We're practically creating childhood memories."
Bill rubbed his temples with the weariness of a man who had made too many poor life choices. The costume she had given him lay crumpled in his lap, a garish purple ensemble with oversized buttons. "Why do I have to be the clown?"
"Because you have the face for it. Naturally sad. Very authentic."
"I hate you with every fiber of my being."
"Save the sweet talk for after the job." Molly pointed at the map. "You go in through the back gate during the magic show. The parents will be distracted watching some guy pull rabbits out of hats."
"What about the dog?"
"I've got turkey slices in my pocket. Dogs love me."
"Everyone loves you until they realize their valuables are missing."
She grinned at that, taking it as the compliment it absolutely was not. The plan was simple, which historically meant everything would go spectacularly wrong. But Molly had learned to embrace chaos like an old friend who owed her money. "Once you're inside, head for the dining room. That's where they're keeping the gift table."
"And security?"
"It's a children's party, Bill. The heaviest security is a baby gate."
He stared at her for a long moment. "Last month we were planning to hit an art gallery."
"And now we're building toward that. Consider this practice."
"Practice for stealing from toddlers."
"They're seven. Totally different developmental stage. Besides, half those presents are probably socks and educational toys. We're doing them a favor."
Bill pulled on the clown wig with the resignation of a condemned man. The orange curls framed his face in a way that was genuinely disturbing. "If I get taken down by a bunch of kids hopped up on cake, I'm blaming you in my memoir."
"You can't even spell memoir."
"I'll hire a ghostwriter with my cut."
Molly started the van, checking her own costume in the mirror. The balloon artist outfit was convincing enough, assuming nobody asked her to actually make anything. "Let's go steal some birthday presents."
created_at: "2026-02-08T15:42:28.231982+00:00" (Sunday, February 8th, 2026 at 3:42:28 PM GMT+00:00)
model: "anthropic/claude-4.5-opus-20251124"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1636
moderation_latency: 136
generation_time: 19230
tokens_prompt: 61
tokens_completion: 620
native_tokens_prompt: 75
native_tokens_completion: 727
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.01855
router: (null)
provider_responses:
  0:
    id: "4d84e331-f3e7-43b4-bdb4-2ab1b10e8a87"
    status: 200
    is_byok: false
    latency: 1636
    endpoint_id: "de32581c-2e11-4f54-8d29-e8a6e19d3038"
    provider_name: "Amazon Bedrock"
    model_permaslug: "anthropic/claude-4.5-opus-20251124"
    api_type: "completions"
id: "gen-1770565328-6BeybHVUoew3h62ZjC2b"
upstream_id: "4d84e331-f3e7-43b4-bdb4-2ab1b10e8a87"
total_cost: 0.01855
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
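The timing, token, and cost fields in the metadata can be cross-checked against each other. The following is a minimal sketch, not the site's own accounting. It assumes that generation_time is in milliseconds and that usage is in US dollars (the page states neither unit), and the per-token prices are hypothetical values chosen for illustration rather than figures taken from this page.

```python
# Values copied from the run metadata above.
NATIVE_TOKENS_PROMPT = 75
NATIVE_TOKENS_COMPLETION = 727
GENERATION_TIME_MS = 19230  # assumption: the unit is milliseconds
REPORTED_USAGE = 0.01855    # assumption: the unit is US dollars

# Hypothetical prices (assumed for illustration, not stated on the page):
# 5 dollars per million input tokens, 25 dollars per million output tokens.
ASSUMED_PRICE_PER_INPUT_TOKEN = 5.00 / 1_000_000
ASSUMED_PRICE_PER_OUTPUT_TOKEN = 25.00 / 1_000_000


def tokens_per_second(tokens: int, elapsed_ms: float) -> float:
    """Average generation throughput over the whole response."""
    return tokens / (elapsed_ms / 1000.0)


def estimated_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Cost estimate under the assumed per-token prices."""
    return (prompt_tokens * ASSUMED_PRICE_PER_INPUT_TOKEN
            + completion_tokens * ASSUMED_PRICE_PER_OUTPUT_TOKEN)


throughput = tokens_per_second(NATIVE_TOKENS_COMPLETION, GENERATION_TIME_MS)
cost = estimated_cost(NATIVE_TOKENS_PROMPT, NATIVE_TOKENS_COMPLETION)

print(f"throughput: {throughput:.1f} native tokens/s")
print(f"estimated cost: {cost:.5f} (reported usage: {REPORTED_USAGE})")
```

Under these assumptions the estimate agrees with the reported usage of 0.01855, which suggests the cost is computed from the native token counts (75 and 727) rather than from tokens_prompt and tokens_completion (61 and 620). That reading is an inference from the arithmetic, not something the page states.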
Evaluation details
| Result | Evaluator | Details | Meta Data |
| --- | --- | --- | --- |
| 99.9190% | Matches word count | n/a | neededClean: false; words: 497 |
| 32.6974% | Dialogue to Total Word Ratio | Ratio: 54.73%, Deviation: 4.73% | neededClean: false; wordsTotal: 497; wordsDialogue: 272 |
66.3082%
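The ratio and deviation reported by the "Dialogue to Total Word Ratio" evaluator can be reproduced from the word counts in its meta data. The sketch below is one plausible way to count dialogue words: it treats any text between straight double quotes as dialogue and splits on whitespace. The evaluator's actual counting rules are not given on this page, so the helper names and the quoting rule are assumptions. The mapping from deviation to the 32.6974% result is also not documented here and is not reproduced.

```python
import re


def count_words(text: str) -> int:
    """Count whitespace-separated tokens."""
    return len(text.split())


def dialogue_words(text: str) -> int:
    """Count words inside straight double quotes (assumed dialogue rule)."""
    spans = re.findall(r'"([^"]*)"', text)
    return sum(count_words(span) for span in spans)


def dialogue_ratio_percent(words_dialogue: int, words_total: int) -> float:
    """Dialogue share of all words, as a percentage."""
    return 100.0 * words_dialogue / words_total if words_total else 0.0


# Small made-up sample to exercise the counting helpers.
sample = 'The van was quiet. "Run it by me again," Bill said. "We pivoted," Molly replied.'
sample_ratio = dialogue_ratio_percent(dialogue_words(sample), count_words(sample))

# Counts copied from the evaluator meta data above.
ratio = round(dialogue_ratio_percent(272, 497), 2)  # 54.73
deviation = round(abs(ratio - 50.0), 2)             # 4.73, against the 50% target

# The final figure on the page equals the mean of the two evaluator results.
mean_result = round((99.9190 + 32.6974) / 2, 4)     # 66.3082

print(ratio, deviation, mean_result)
```

The last computation shows that 66.3082% is consistent with a plain average of the two evaluator results. The page does not label that figure, so reading it as an overall score is an inference.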