NC Bench
Run: 2026-07-01-01KWE9SWHS6P452D8CEGMN9NET
Run Details
Model: Claude Sonnet 5 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 20.0s
Cost: $0.011636
Provider: Amazon Bedrock
Evaluation summary
| Score | Evaluator |
| --- | --- |
| 75.16% | Matches word count |
| 0.00% | Dialogue to Total Word Ratio |
| 37.5778% | Overall |
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The back room of Bill's Pawn & Loan smelled like mothballs and bad decisions, which suited Molly just fine since that was basically her autobiography. She spread a crumpled piece of paper across the table—a birthday invitation, complete with a cartoon dinosaur wearing a party hat. "Timmy Fletcher," Molly said, tapping the dinosaur's cheerful face. "Turning seven. Superhero themed. Address is 42 Birch Lane." Bill squinted at the invitation like it might bite him. "We're robbing a seven-year-old." "We're robbing his presents. There's a difference." "Is there?" "Legally? Probably not. Morally? Also probably not. But financially, Bill, financially there's a *huge* difference. Have you seen what kids get these days? Drones. Tablets. I heard the Hendricks kid got a Rolex last Christmas." "A seven-year-old with a Rolex." "Life's not fair, Bill. That's the whole reason we're in this business." Bill rubbed his temples like he was trying to massage some sense into his own skull. "So what's the plan? We just—what—walk in?" "We walk in dressed as the entertainment. I already called the number on the invite pretending to be the magician's agent. Told them Marvelous Marco can't make it, food poisoning, tragic, very sudden. But don't worry, I've got a replacement." "And the replacement is us." "The replacement is *you*. I'm your lovely assistant." "I don't know any magic tricks, Molly." "You don't need magic tricks. You need a top hat, a cape, and the confidence of a man who's never once been wrong in his entire life." "That's called being a sociopath." "Exactly! You're a natural." Bill sat down heavily, the chair groaning in solidarity with his life choices. "And how exactly do we walk out with two hundred pounds of Nerf guns and Barbie dreamhouses without twenty parents noticing?" "That's the beautiful part." Molly's eyes lit up in the way that always preceded either genius or disaster, and frankly, the ratio had never been great. 
"During the show, you'll do the classic disappearing act. Big box, dramatic reveal, smoke and mirrors. Except instead of you disappearing, the *presents* disappear. Into the box. Which I then wheel out to the truck because I'm 'getting more props from the van.'" Bill considered this. "That's actually not terrible." "I know." "There's still twenty parents in that room, Molly." "Twenty parents watching a magic show while eating cake and drinking box wine at two in the afternoon on a Saturday. Trust me, nobody's paying attention to anything except their phones and their own crumbling marriages." "Dark." "Accurate." Bill picked up the invitation again, studying the dinosaur's unbearably cheerful smile. "What if the kid cries?" "The kid gets a magic show and a story he'll tell for years. 'The day the magician stole my birthday presents'—that's practically legendary. We're doing him a favor." "We are absolutely not doing him a favor." "Details, Bill. Now go find a top hat. We've got a seven-year-old to rob."
created_at: "2026-07-01T07:37:55.524Z" (7/1/2026, 7:37:55 AM)
model: "anthropic/claude-sonnet-5-20260630"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 3144
moderation_latency: (null)
generation_time: 19950
tokens_prompt: 77
tokens_completion: 844
native_tokens_prompt: 93
native_tokens_completion: 1145
native_tokens_completion_images: (null)
native_tokens_reasoning: 51
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "end_turn"
service_tier: "default"
usage: 0.011636
router: (null)
provider_responses:
  0:
    endpoint_id: "37b16f45-eeae-45f3-9b85-2137f7c6d61b"
    id: "msg_01VG6KQJgNA4R7tuwtcRGGAF"
    is_byok: false
    latency: 1912
    model_permaslug: "anthropic/claude-sonnet-5-20260630"
    provider_name: "Amazon Bedrock"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.17.0; linux; x64))"
http_referer: (null)
request_id: "req-1782891475-jsAwas2eGednBJJ8yTTG"
session_id: (null)
preset_id: (null)
api_type: "completions"
id: "gen-1782891475-vSmZrNna9IgjgYb9rH18"
upstream_id: "msg_01VG6KQJgNA4R7tuwtcRGGAF"
total_cost: 0.011636
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
response_cache_source_id: (null)
data_region: "global"
Evaluation details
| Result | Evaluator | Details | Meta Data |
| --- | --- | --- | --- |
| 75.16% | Matches word count | n/a | neededClean: false; words: 487 |
| 0.00% | Dialogue to Total Word Ratio | Ratio: 73.63%, Deviation: 23.63% | neededClean: false; wordsTotal: 493; wordsDialogue: 363 |
| 37.5778% | Overall | | |
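
The reported ratio and deviation can be reproduced from the word counts in the evaluation details. The sketch below is a reconstruction, not the benchmark's own evaluator code: the function name `dialogue_ratio_report` is hypothetical, and the 50% target is taken from the scenario description. The page does not state how the overall score is derived. Treating it as an unweighted mean of the two evaluator scores is an assumption that happens to match the displayed figure to within rounding. The formula that maps the deviation (or the word count) to a percentage score is not shown on the page, so it is not reproduced here.

```python
def dialogue_ratio_report(words_dialogue: int, words_total: int, target: float = 0.5) -> dict:
    """Recompute the dialogue ratio and its deviation from the target.

    Hypothetical helper: it mirrors the figures shown on the page,
    not the evaluator's actual implementation.
    """
    ratio = words_dialogue / words_total
    deviation = abs(ratio - target)
    return {
        "ratio_pct": round(ratio * 100, 2),
        "deviation_pct": round(deviation * 100, 2),
    }


# Counts taken from the evaluation details: wordsDialogue = 363, wordsTotal = 493.
report = dialogue_ratio_report(363, 493)
print(report)

# Assumption: the overall score is the unweighted mean of the two evaluator scores.
overall_pct = (75.16 + 0.00) / 2
print(overall_pct)
```

With the reported counts this gives a ratio of 73.63% and a deviation of 23.63 percentage points, matching the evaluator's details. The mean of the two displayed scores is about 37.58%, which agrees with the reported 37.5778% once the rounding of the word count score is taken into account.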