NC Bench
Run: 2026-07-01-01KWE9XJHFW263WNH8CZVPCRSX
Run Details
Model: Claude Sonnet 5 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 18.1s
Cost: $0.012016
Provider: Amazon Bedrock
Evaluation summary
Score | Evaluator
90.48% | Matches word count
4.00% | Dialogue to Total Word Ratio
47.2409% | Overall
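The overall score shown above looks consistent with an unweighted mean of the two evaluator scores. That is an inference from the displayed numbers, not documented NC Bench behavior. A minimal Python check, using the rounded values from the summary:

```python
# Evaluator scores as displayed in the summary (rounded to two decimals).
scores = {
    "Matches word count": 90.48,
    "Dialogue to Total Word Ratio": 4.00,
}

# Assumption: the overall score is the unweighted mean of the evaluator scores.
overall = sum(scores.values()) / len(scores)

print(f"{overall:.2f}%")
```

The result agrees with the reported 47.2409% to two decimal places; the small remaining difference would be explained if the benchmark averages unrounded scores.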
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The warehouse smelled like mildew and bad decisions. Bill spread a hand-drawn map across a crate, weighing down the corners with a stapler and a half-eaten donut. "Okay, so," Bill said, tapping the paper with a grubby finger, "the party's at 2 PM. Bounce house goes up at noon." "Bill, it's a seven-year-old's birthday party. Why do we need a tactical map?" "Because presents don't just steal themselves, Molly." "They kind of do, actually. That's the whole point of stealing." Bill ignored her, tracing a route with his finger. "We come in through the side gate, disguised as clowns—" "Absolutely not." "—grab the present table, and we're out before the piñata even drops." "I said absolutely not to the clowns, Bill." "Kids trust clowns!" "Kids are *terrified* of clowns. Have you seen a horror movie in the last forty years?" Bill paused, scratching his chin. "Fine. Magicians, then." "Why do we need a costume at all? It's a backyard party. We climb the fence, grab the loot, we're gone." "That's not a *plan*, that's just crime." "It's supposed to be crime, Bill. We're thieves." "Yeah, but *sophisticated* thieves. There's a difference between stealing and just yoinking a Tonka truck over a fence like a raccoon." Molly crossed her arms. "Fine. What's your sophisticated plan?" Bill straightened, clearly pleased someone finally asked. "We rent a bounce house. A *fake* bounce house. We deliver it ourselves. Once it's set up, presents go inside for 'safekeeping' during the jumping. Then, boom—we deflate it, presents and all, roll it into the truck." Molly blinked. "You want to steal a bounce house full of birthday presents." "With the bounce house still attached, yes." "That's the dumbest thing I've ever heard." "It's genius." "It's a two-hundred-pound inflatable with a Barbie logo on it, Bill. It won't fit in the truck." "I measured!" "You measured *your* truck. Not the bounce house." Bill's face fell slightly. "...Okay, minor miscalculation." 
"Also, who rents us a bounce house without ID? You have a warrant out for grand theft popcorn machine." "That was a misunderstanding." "You stole a popcorn machine from a *movie theater*, Bill." "It was for a *party*!” "This isn't the point—" "Everything is for a party with you, and yet somehow *I'm* the unsophisticated one." Molly sighed, rubbing her temples. "Look. Simple plan. We show up as caterers. Everyone trusts a guy with a tray of little sausages. We grab the presents while the kids are distracted by cake, we walk out the front door like we own the place." Bill considered this, nodding slowly. "...I like the sausages." "Of course you like the sausages." "No, I mean, it's smart. Low profile. Very sophisticated." "Thank you." "Can I still wear the clown wig, though? Just the wig." Molly stared at him for a long moment. "If it makes you shut up about the bounce house, fine." Bill grinned, already reaching for his tray of little sausages.
created_at: "2026-07-01T07:39:56.347Z"
model: "anthropic/claude-sonnet-5-20260630"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 3615
moderation_latency: (null)
generation_time: 18012
tokens_prompt: 77
tokens_completion: 852
native_tokens_prompt: 93
native_tokens_completion: 1183
native_tokens_completion_images: (null)
native_tokens_reasoning: 49
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "end_turn"
service_tier: "default"
usage: 0.012016
router: (null)
provider_responses:
  0:
    endpoint_id: "37b16f45-eeae-45f3-9b85-2137f7c6d61b"
    id: "msg_01WjooQ6g3xmivi5jacQHLpQ"
    is_byok: false
    latency: 2346
    model_permaslug: "anthropic/claude-sonnet-5-20260630"
    provider_name: "Amazon Bedrock"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.17.0; linux; x64))"
http_referer: (null)
request_id: "req-1782891596-04q9WXTpFlDoSttQreCz"
session_id: (null)
preset_id: (null)
api_type: "completions"
id: "gen-1782891596-ZNaBfS4lUMJDT6BOinPG"
upstream_id: "msg_01WjooQ6g3xmivi5jacQHLpQ"
total_cost: 0.012016
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
response_cache_source_id: (null)
data_region: "global"
Evaluation details
Result | Evaluator | Details | Meta Data
90.48% | Matches word count | n/a | neededClean: false, words: 490
4.00% | Dialogue to Total Word Ratio | Ratio: 78.02%, Deviation: 8.02% | neededClean: false, wordsTotal: 496, wordsDialogue: 387
47.2409% | Overall
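The reported dialogue ratio and deviation can be reproduced from the word counts in the metadata. The sketch below assumes the deviation is the absolute difference from the 70% target named in the scenario. How the evaluator maps that deviation to the 4.00% score is not shown on this page, so it is not modeled here.

```python
# Values reported in the evaluation metadata above.
words_total = 496
words_dialogue = 387

# Target taken from the scenario: 70% of words should be dialogue.
target_ratio = 0.70

ratio = words_dialogue / words_total

# Assumption: deviation is the absolute distance from the target ratio.
deviation = abs(ratio - target_ratio)

print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
```

This prints `Ratio: 78.02%, Deviation: 8.02%`, matching the Details column for the Dialogue to Total Word Ratio evaluator.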