NC Bench
Run: 2026-02-24-01KJ8WM59D01VE3G2MEEVHYQE1
Run Details
Model: Claude Opus 4.6 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 22.2s
Cost: $0.021585
Provider: Amazon Bedrock
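The duration and cost reported here can be cross-checked against the token counts and timing fields in the run metadata further down this page. The sketch below is only a sanity check. It assumes that `generation_time` is in milliseconds, and the two per-million-token rates are hypothetical values picked because they reproduce the reported cost; the page does not state any pricing.

```python
# Values copied from the run metadata on this page.
native_tokens_prompt = 92
native_tokens_completion = 845
generation_time_ms = 21774  # assumption: the unit is milliseconds
total_cost = 0.021585

# Output throughput over the generation window.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

# Hypothetical rates (USD per million tokens), inferred from the numbers only.
INPUT_USD_PER_MTOK = 5.0
OUTPUT_USD_PER_MTOK = 25.0
estimated_cost = (
    native_tokens_prompt * INPUT_USD_PER_MTOK
    + native_tokens_completion * OUTPUT_USD_PER_MTOK
) / 1_000_000

print(f"throughput: {tokens_per_second:.1f} tokens/s")
print(f"estimated cost: ${estimated_cost:.6f} (reported: ${total_cost:.6f})")
```

Under these assumptions the estimate matches the reported cost to six decimal places, which suggests the cost is billed on the native token counts rather than on `tokens_prompt` and `tokens_completion`.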
Evaluation summary
Score | Evaluator
99.98% | Matches word count
5.66% | Dialogue to Total Word Ratio
52.8203% | Overall
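The Overall figure appears to be the unweighted mean of the two evaluator scores. That is an inference from the numbers, not something the page states, and the small difference from the displayed 52.8203% is presumably due to the component scores being rounded for display. A minimal check:

```python
# Evaluator scores as displayed in the summary (already rounded).
scores_percent = {
    "Matches word count": 99.98,
    "Dialogue to Total Word Ratio": 5.66,
}

# Assumption: Overall is a plain average with equal weights.
overall = sum(scores_percent.values()) / len(scores_percent)

print(f"Overall (recomputed): {overall:.2f}%")
```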
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the hand-drawn map across the hood of the van, smoothing out a wrinkle that bisected the bounce house. Bill squinted at it, turning his head sideways like a confused labrador.
"Why is there a dragon in the backyard?" Bill asked.
"That's the birthday boy's mother. I've done reconnaissance, Bill. She's six foot two and she owns a whistle. She will use that whistle."
"Okay, so we avoid the dragon. What's our entry point?"
"The side gate. It's held shut by one of those little latch things that you lift up. Military grade, basically, so I'll need you to jiggle it."
"I can jiggle."
"I know you can. That's why I brought you on. Now, the presents will be stacked on a folding table near the sliding glass door. I'm estimating fifteen to twenty gifts, mostly medium-sized, probably a lot of Legos and gift cards."
"Gift cards are light. That's good for a quick exit."
"Exactly. But here's where it gets complicated. There's going to be approximately twelve children hopped up on sheet cake and fruit punch. They're unpredictable, Bill. They move in swarms. One minute they're doing the Macarena, next minute they're just screaming at the sky for no reason."
Bill nodded gravely. "My nephew's like that. He bit a clown once."
"See, that's what we're dealing with. Tiny agents of chaos. So our window is the piñata."
"The piñata?"
"When that piñata breaks open, every single child in that yard is going to dive for candy like seagulls on a french fry. That gives us roughly ninety seconds of pure, unmonitored access to the present table."
"That's actually brilliant."
"I know. I've been planning this for three days. I missed my dentist appointment for this."
Molly pulled a stopwatch from her jacket pocket and held it up between them like a sacred artifact. "I'll be on the stopwatch. You'll be on gift extraction. Grab everything you can carry, toss it over the fence, and I'll load the van."
"What if one of the kids sees me?"
"You tell them you're a party magician and you're about to make the presents disappear."
"That's technically not a lie."
"Exactly. We maintain our integrity."
Bill scratched the back of his neck and stared at the map again. A breeze lifted one corner and he slapped it back down. "Molly, can I ask you something? Do you ever feel bad about this? Stealing from children?"
She paused, looking off into the middle distance where a squirrel was aggressively eating a bagel on somebody's lawn. "Every single time, Bill. Every single time."
"But we still do it."
"Of course we still do it. Do you know how much Lego sets go for on the secondary market? We'd be fools not to."
"Fair enough."
She folded up the map with surgical precision and tucked it inside her jacket. "We go at three. Right after Happy Birthday, right before cake. Any questions?"
"Just one. Can I have some of the piñata candy?"
"Absolutely not. Stay focused."
created_at: "2026-02-24T22:34:25.974Z" (2/24/2026, 10:34:25 PM)
model: "anthropic/claude-4.6-opus-20260205"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1695
moderation_latency: 216
generation_time: 21774
tokens_prompt: 77
tokens_completion: 841
native_tokens_prompt: 92
native_tokens_completion: 845
native_tokens_completion_images: (null)
native_tokens_reasoning: 65
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.021585
router: (null)
provider_responses:
  0:
    endpoint_id: "a8708fb9-08ea-43bb-a3b1-165c87ef09df"
    id: "f10146fe-9c4e-40b6-803a-6c9b8e36618f"
    is_byok: false
    latency: 1608
    model_permaslug: "anthropic/claude-4.6-opus-20260205"
    provider_name: "Amazon Bedrock"
    status: 200
api_type: "completions"
id: "gen-1771972465-WXnZLhUeWzLcrv7INZEE"
upstream_id: "f10146fe-9c4e-40b6-803a-6c9b8e36618f"
total_cost: 0.021585
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
Evaluation details
Result | Evaluator | Details | Meta Data
99.98% | Matches word count | n/a | neededClean: false, words: 502
5.66% | Dialogue to Total Word Ratio | Ratio: 77.58%, Deviation: 7.58% | neededClean: false, wordsTotal: 504, wordsDialogue: 391
52.8203% | Overall
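The ratio and deviation reported for the dialogue evaluator follow directly from `wordsDialogue`, `wordsTotal`, and the 70% target in the scenario. The sketch below reproduces that arithmetic. The two counting helpers are hypothetical: they assume dialogue means text between straight double quotes and that words are whitespace-separated, which may not match the evaluator's actual tokenization (the two evaluators already disagree on the total, 502 versus 504).

```python
import re

TARGET_PERCENT = 70.0  # from the scenario: "500 words with 70% dialogue"


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (assumed word definition)."""
    return len(text.split())


def count_dialogue_words(text: str) -> int:
    """Count words inside straight double quotes (assumed dialogue definition)."""
    spans = re.findall(r'"([^"]*)"', text)
    return sum(count_words(span) for span in spans)


def ratio_and_deviation(words_dialogue: int, words_total: int,
                        target: float = TARGET_PERCENT) -> tuple[float, float]:
    """Return the dialogue percentage and its absolute distance from the target."""
    ratio = 100.0 * words_dialogue / words_total
    return ratio, abs(ratio - target)


# Counts taken from the Meta Data column of the evaluation details.
ratio, deviation = ratio_and_deviation(391, 504)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
```

How the 7.58-point deviation is converted into the 5.66% score is not shown on the page, so that step is left out of the sketch.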