NC Bench
Run: 2026-02-24-01KJ8V8006PV2NNDACPVT1334A
Run Details
Model: Claude Opus 4.6 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 23.6s
Cost: $0.020710
Provider: Amazon Bedrock
Evaluation summary
Score      Evaluator
99.38%     Matches word count
61.37%     Dialogue to Total Word Ratio
80.3725%   Overall
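The page does not state how the Overall figure is derived. A minimal sketch, assuming it is the unweighted mean of the two evaluator scores: the displayed values average to about 80.375, while 80.3725 is reproduced if the word-count score is 99.375 before rounding. Both the averaging rule and the unrounded 99.375 are assumptions, and `overall_score` is a hypothetical helper name.

```python
from statistics import mean


def overall_score(scores):
    """Unweighted mean of evaluator scores, in percent (assumed aggregation rule)."""
    return mean(scores)


# Scores as displayed on the page (already rounded to two decimals).
from_displayed = overall_score([99.38, 61.37])

# Assumption: the word-count score is 99.375 before rounding for display.
from_unrounded = overall_score([99.375, 61.37])

print(f"{from_displayed:.4f} {from_unrounded:.4f}")
```

If the assumption holds, the second value lines up with the reported 80.3725%, which suggests the Overall row is computed from unrounded evaluator scores.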
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the hand-drawn map across the hood of their rusted sedan and pinned down the corners with two coffee cups, a half-eaten granola bar, and what appeared to be a rubber duck she had stolen from a previous job. Bill leaned over the map and squinted at what looked like a crayon rendering of a backyard. "Why is everything drawn in purple?" Bill asked. "Because I only had purple," Molly said. "Focus. The party starts at two. We've got a narrow window between the arrival of the bounce house and the appearance of the clown." "I hate clowns," Bill said. "Everyone hates clowns, Bill. That's not relevant to the operation." She tapped a large purple rectangle near the center of the map. "This is the gift table. Based on the Evite, we're looking at twenty-three confirmed attendees. That means twenty-three presents minimum, probably more because some overachieving parents always bring two." Bill let out a low whistle, though it came out more like a wheeze because he had been eating the granola bar. He brushed crumbs from his mustache and studied the layout with the gravity of a man planning to infiltrate a foreign embassy rather than a six-year-old's dinosaur-themed birthday party. "What's our entry point?" he asked. "Side gate," Molly said. "The mom posted her entire backyard renovation on Instagram last week. The latch is decorative. One good push and we're in." "And the dad?" "He'll be at the grill. Every dad is always at the grill. It's like a law of physics." Bill nodded slowly. He had been a dad once, briefly, before his ex-wife discovered he had stolen their wedding gifts and pawned them for seed money on a counterfeit beanie baby operation. He understood the gravitational pull of a grill. "What about the kids?" he asked. "Kids notice things. They're like tiny suspicious detectives with sticky hands." "That's where the bounce house comes in," Molly said, and her eyes took on the gleam they always got right before she explained the brilliant part. 
"At exactly two fifteen, I will puncture the bounce house with this." She produced a knitting needle from her jacket pocket and held it up like a sword. "Mass hysteria. Crying. Chaos. Every adult within fifty yards will converge on the deflating bounce house like moths to a porch light." "That's diabolical," Bill said. "Thank you." "What do I do during the chaos?" "You back the car up to the side gate, and we load the presents into the trunk. Sixty seconds, in and out." Bill rubbed his chin and stared at the purple map one more time. A bird landed on the hood of the car, looked at both of them, and flew away as if it wanted no part in what was about to happen. "One question," Bill said. "What if the presents are bad? What if it's all socks and educational toys?" Molly folded the map with surgical precision and looked him dead in the eye. "Then we hit the next party on Saturday."
created_at: "2026-02-24T22:10:18.766Z" (2/24/2026, 10:10:18 PM)
model: "anthropic/claude-4.6-opus-20260205"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1590
moderation_latency: 152
generation_time: 23378
tokens_prompt: 77
tokens_completion: 836
native_tokens_prompt: 92
native_tokens_completion: 810
native_tokens_completion_images: (null)
native_tokens_reasoning: 79
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.02071
router: (null)
provider_responses:
  0:
    endpoint_id: "a8708fb9-08ea-43bb-a3b1-165c87ef09df"
    id: "58d30724-cae8-4ff3-8033-41e578a50bb3"
    is_byok: false
    latency: 1552
    model_permaslug: "anthropic/claude-4.6-opus-20260205"
    provider_name: "Amazon Bedrock"
    status: 200
api_type: "completions"
id: "gen-1771971018-o9BGUghzcInrcmHcpnib"
upstream_id: "58d30724-cae8-4ff3-8033-41e578a50bb3"
total_cost: 0.02071
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
Evaluation details
Result: 99.38%
Evaluator: Matches word count
Details: n/a
Meta Data: neededClean: false, words: 505

Result: 61.37%
Evaluator: Dialogue to Total Word Ratio
Details: Ratio: 46.88%, Deviation: 3.13%
Meta Data: neededClean: false, wordsTotal: 512, wordsDialogue: 240

Overall: 80.3725%
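The ratio and deviation reported for the dialogue evaluator follow from its word counts: 240 dialogue words out of 512 total, measured against the 50% target in the scenario. A minimal sketch of that arithmetic, assuming deviation is the absolute difference from the target and that the page rounds half up to two decimals; `dialogue_ratio` and `pct` are hypothetical helper names. How the 61.37% score is then derived from the deviation is not stated on the page and is not modelled here.

```python
from decimal import Decimal, ROUND_HALF_UP


def dialogue_ratio(words_dialogue, words_total, target=0.50):
    """Return (ratio, deviation) as fractions; deviation is |target - ratio| (assumed)."""
    ratio = words_dialogue / words_total
    deviation = abs(target - ratio)
    return ratio, deviation


def pct(fraction):
    """Format a fraction as a percentage with two decimals, rounding half up (assumed)."""
    value = (Decimal(str(fraction)) * 100).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    return f"{value}%"


# Counts taken from the evaluator's metadata: wordsDialogue and wordsTotal.
ratio, deviation = dialogue_ratio(240, 512)
print(f"Ratio: {pct(ratio)}, Deviation: {pct(deviation)}")
```

Note that this evaluator counts 512 total words while the word-count evaluator reports 505, so the two evaluators appear to tokenize the response differently.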