NC Bench
Run: 2026-02-24-01KJ8VDY46PZ1CM9XTAQSYDR62
Run Details
Model: Claude Opus 4.6 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 21.3s
Cost: $0.021435
Provider: Amazon Bedrock
Evaluation summary
Matches word count: 68.10%
Dialogue to Total Word Ratio: 0.02%
Overall: 34.0606%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the hand-drawn map across the hood of the van, smoothing out a wrinkle that bisected the bouncy castle. Bill leaned over her shoulder, squinting. "Walk me through it again," Bill said. "And this time, use the code names." "Fine. The target is a six-year-old's birthday party at 47 Maple Drive. The client — that's Tommy's older brother, by the way, who is paying us in a combination of Halloween candy and a gift card to Applebee's — wants every single present removed before they can be opened." "And we're doing this why?" "Because we're professionals, Bill. We don't ask why. We execute. Also I really want that gift card." "Right. What's our window?" "Cake and ice cream starts at two fifteen. Based on my surveillance, which was me sitting in this van with binoculars for three hours yesterday, every child at the party will abandon whatever they're doing and swarm the kitchen like a pack of feral raccoons the moment the candles are lit. That gives us approximately four minutes to enter through the back sliding door, grab every present from the table in the living room, and load them into the van." "Four minutes," Bill repeated. "For what, thirty presents?" "Twenty-six. I counted." "You counted the presents through binoculars?" "I'm thorough." "You're unhinged." "Those are the same thing in this line of work. Now, I need you focused. You're on bag duty." She handed him two large black duffel bags. "I'll hand you the presents, you stuff them in. No sorting, no stacking, just chaos. Speed over elegance." "What about the parents?" "The mom will be in the kitchen managing the cake. The dad — and this is the beautiful part — the dad falls asleep on the couch every single day at two o'clock. Every day, Bill. I've watched him through the window. The man is a metronome of exhaustion." "You've been watching him sleep?" "Surveilling. There's a difference." "There absolutely is not." "Can we stay on task? Now, there's one variable I haven't been able to account for." 
Molly tapped the map where she had drawn a small figure with what appeared to be horns. "The family dog." "What kind of dog?" "Chihuahua." Bill exhaled sharply. "No. Absolutely not. I'm out." "It's a six-pound dog, Bill." "Those things are demons with legs. Last job I did that involved a chihuahua, I lost a perfectly good pair of pants and most of my dignity." "I bought treats. Premium ones. Bacon flavored." "Bacon won't stop a chihuahua. Nothing stops a chihuahua. They're powered by hatred and spite." "Then move faster. Look, we get in, we grab the presents, we get out. We meet the client behind the Dairy Queen at three o'clock for the exchange. Simple." Bill stared at the map for a long moment, then folded his arms. "I want half the gift card." "Deal." They shook hands over the hood, and Molly started the van.
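The page does not show how the evaluator separates dialogue from narration. As a rough illustration only, one plausible approach is to treat everything between straight double quotes as dialogue and count whitespace-separated tokens. The function name `count_dialogue_words` and its tokenization rules are assumptions for this sketch, not the benchmark's actual implementation, so its counts may differ from the reported ones.

```python
import re


def count_dialogue_words(text: str) -> tuple[int, int]:
    """Return (dialogue_words, total_words).

    Assumption: dialogue is any run of text enclosed in straight double
    quotes, and a "word" is any whitespace-separated token. The real
    evaluator may tokenize differently.
    """
    total_words = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    return dialogue_words, total_words


# Short excerpt from the model response above.
sample = '"Walk me through it again," Bill said. "And this time, use the code names."'
dialogue, total = count_dialogue_words(sample)
print(f"{dialogue} of {total} words are dialogue ({dialogue / total:.1%})")
```

Curly quotes, nested quotes, or dialogue that spans paragraphs without a closing quote would need extra handling that this sketch omits.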
created_at: "2026-02-24T22:13:33.456Z" (2/24/2026, 10:13:33 PM)
model: "anthropic/claude-4.6-opus-20260205"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1633
moderation_latency: 159
generation_time: 21067
tokens_prompt: 77
tokens_completion: 834
native_tokens_prompt: 92
native_tokens_completion: 839
native_tokens_completion_images: (null)
native_tokens_reasoning: 73
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.021435
router: (null)
provider_responses:
  0:
    endpoint_id: "a8708fb9-08ea-43bb-a3b1-165c87ef09df"
    id: "de6a7e51-e5b1-4543-82f9-a80497149d44"
    is_byok: false
    latency: 1599
    model_permaslug: "anthropic/claude-4.6-opus-20260205"
    provider_name: "Amazon Bedrock"
    status: 200
api_type: "completions"
id: "gen-1771971213-dbphNkdupFJ8D4tRrLAl"
upstream_id: "de6a7e51-e5b1-4543-82f9-a80497149d44"
total_cost: 0.021435
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
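As a sanity check on the metadata, the reported total_cost can be reproduced from the native token counts. The per-token rates below (5 USD per million prompt tokens, 25 USD per million completion tokens) are assumptions that do not appear on the page; they are used here because they reproduce the reported figure. The throughput calculation likewise assumes that generation_time is in milliseconds.

```python
from decimal import Decimal

# Assumed rates, not stated on the page.
ASSUMED_USD_PER_MILLION_PROMPT = Decimal("5")
ASSUMED_USD_PER_MILLION_COMPLETION = Decimal("25")


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> Decimal:
    """Estimate the request cost in USD from native token counts."""
    micro_usd = (
        prompt_tokens * ASSUMED_USD_PER_MILLION_PROMPT
        + completion_tokens * ASSUMED_USD_PER_MILLION_COMPLETION
    )
    return micro_usd / Decimal(1_000_000)


def tokens_per_second(completion_tokens: int, generation_time_ms: int) -> float:
    """Completion throughput, assuming generation_time is in milliseconds."""
    return completion_tokens / (generation_time_ms / 1000)


# Values taken from native_tokens_prompt, native_tokens_completion,
# and generation_time in the metadata above.
cost = estimate_cost(92, 839)
throughput = tokens_per_second(839, 21067)
print(f"estimated cost: ${cost}")
print(f"throughput: {throughput:.1f} tokens/s")
```

Using `Decimal` avoids floating-point rounding when comparing the estimate against the six-decimal cost shown in the report.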
Evaluation details
Matches word count
Result: 68.10%
Details: n/a
Meta Data: neededClean: false, words: 486
Dialogue to Total Word Ratio
Result: 0.02%
Details: Ratio: 83.10%, Deviation: 13.10%
Meta Data: neededClean: false, wordsTotal: 491, wordsDialogue: 408
Overall: 34.0606%
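The ratio and deviation reported for the dialogue evaluator follow directly from the word counts in its metadata, as the sketch below shows. The 70% target comes from the scenario. How each evaluator converts its measurement into a score is not shown on the page, so the sketch does not attempt that step. It only notes that the Overall figure is consistent with an unweighted mean of the two displayed scores, which is an assumption.

```python
TARGET_DIALOGUE_RATIO = 0.70  # from the scenario: "70% dialogue"


def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Fraction of all words that are dialogue."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return words_dialogue / words_total


def deviation_from_target(ratio: float, target: float = TARGET_DIALOGUE_RATIO) -> float:
    """Absolute distance between the measured ratio and the target."""
    return abs(ratio - target)


# wordsDialogue and wordsTotal from the evaluation metadata.
ratio = dialogue_ratio(408, 491)
deviation = deviation_from_target(ratio)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# Assumption: Overall is the unweighted mean of the two evaluator scores.
# The displayed scores are rounded, so the match is only approximate.
overall_estimate = (68.10 + 0.02) / 2
print(f"Overall (estimated): {overall_estimate:.2f}%")
```

The response overshot the dialogue target by about 13 percentage points, which is why the ratio evaluator scores it near zero even though the word count is close to 500.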