NC Bench
Run: 2026-07-01-01KWEAEVS9BAJ4M7BC9R0C2PP0
Run Details
Model: Claude Sonnet 5 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 18.5s
Cost: $0.011776
Provider: Amazon Bedrock
Evaluation summary
| Score | Evaluator |
| --- | --- |
| 99.98% | Matches word count |
| 4.45% | Dialogue to Total Word Ratio |
| 52.2190% | Overall |
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The warehouse smelled like motor oil and stale donuts, which suited Bill just fine. He spread the blueprints across the workbench—not blueprints, actually, just a crayon drawing he'd bought off a kid outside the target house for five bucks. "This is not a blueprint, Bill," Molly said, squinting at it. "This is a picture of a dog." "It's a *layout*, Molly. See? That's the yard. That's the house. The dog is incidental." "There's a smiley face on the chimney." "Artistic license." Molly rubbed her temples. "Okay. Fine. Let's just talk logistics. It's a seven-year-old's birthday party. There'll be, what, twenty kids, a bounce house, a clown—" "No clowns. I have a thing about clowns." "Since when?" "Since always. Clowns see through souls, Molly. They *know* things." "The clown is not going to know we're stealing the presents, Bill." "That's exactly what a clown who knows would say I'd say." Molly stared at him for a long moment. "That sentence didn't make sense." "Everything makes sense to a clown." She decided to move on before her brain started leaking out of her ears. "The presents will be on a table near the cake. We go in during the piñata, when everyone's distracted, we grab the loot, we're out in ninety seconds." "Ninety seconds is tight." "It's a *pile of Legos and gift cards*, Bill, not the Hope Diamond." "You don't know that. Rich kids get weird stuff now. I heard about a six-year-old who got a hoverboard shaped like a shark." "Then we steal the shark hoverboard and retire." "See, now you're talking heist." Bill tapped the crayon drawing. "What about security?" "Security is a golden retriever named Biscuit and a dad who's had four beers by two o'clock." "Biscuit," Bill repeated slowly, testing the name like it might bite him. "Dogs like me. Dogs *trust* me." "Every dog you've ever met has bitten you." "That's a coincidence pattern, not a trend." "You got bitten by a chihuahua last week." "That thing had a vendetta, Molly, it's not the same—" "Focus." 
She jabbed a finger at the drawing. "Piñata goes up at 2:15. We're dressed as the entertainment—" "I still think we should be magicians." "You can't do magic." "I can do the cup trick." "You dropped the cup on your own foot in rehearsal." "That WAS the trick. Misdirection." "Nobody was misdirected, Bill, you just yelled and hopped around for ten minutes." "They didn't suspect a thing." Molly closed her eyes, breathed in, breathed out. "We're going as the balloon guy and his assistant. Simple. Low profile. Sneak the presents into balloon animals—er, into the bags disguised as balloon supplies—and walk out the front door while everyone's yelling about candy." Bill considered this. "And if Biscuit comes after me?" "Then you become very fast, very suddenly." "That's not a plan, that's a *hope*." "It's both," Molly said, folding up the drawing. "Now go find a balloon costume that doesn't say 'sex offender' before we start."
created_at: "2026-07-01T07:49:22.869Z" (7/1/2026, 7:49:22 AM)
model: "anthropic/claude-sonnet-5-20260630"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 3149
moderation_latency: (null)
generation_time: 18434
tokens_prompt: 77
tokens_completion: 847
native_tokens_prompt: 93
native_tokens_completion: 1159
native_tokens_completion_images: (null)
native_tokens_reasoning: 44
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "end_turn"
service_tier: "default"
usage: 0.011776
router: (null)
provider_responses:
  0:
    endpoint_id: "37b16f45-eeae-45f3-9b85-2137f7c6d61b"
    id: "msg_01TPK4w3q2EN6wREwd8qowJB"
    is_byok: false
    latency: 1661
    model_permaslug: "anthropic/claude-sonnet-5-20260630"
    provider_name: "Amazon Bedrock"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.17.0; linux; x64))"
http_referer: (null)
request_id: "req-1782892162-Tn4IJdqZPpmAMx2pKmeU"
session_id: (null)
preset_id: (null)
api_type: "completions"
id: "gen-1782892162-mdol2FPPIMphUE1l3UmW"
upstream_id: "msg_01TPK4w3q2EN6wREwd8qowJB"
total_cost: 0.011776
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
response_cache_source_id: (null)
data_region: "global"
Evaluation details
| Result | Evaluator | Details | Meta Data |
| --- | --- | --- | --- |
| 99.98% | Matches word count | n/a | neededClean: false, words: 498 |
| 4.45% | Dialogue to Total Word Ratio | Ratio: 77.89%, Deviation: 7.89% | neededClean: false, wordsTotal: 502, wordsDialogue: 391 |
| 52.2190% | Overall | | |
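
The ratio and deviation reported for the dialogue evaluator can be checked from the word counts in its meta data. The sketch below recomputes both and compares a plain average of the two evaluator scores with the overall figure. It makes two assumptions that the page does not state: the target ratio is taken as 0.70 from the scenario text, and the overall score is treated as an unweighted mean of the two scores. How the 7.89-point deviation maps to the 4.45% score is not shown on the page, so that step is not reproduced here.

```python
# Values copied from the evaluation details of this run.
words_dialogue = 391
words_total = 502

# Assumption: the target comes from the scenario "Write 500 words with 70% dialogue".
target_ratio = 0.70

ratio = words_dialogue / words_total
deviation = abs(ratio - target_ratio)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# Assumption: the overall score is an unweighted mean of the evaluator scores.
# The inputs here are the rounded percentages shown on the page, so a small
# gap to the displayed overall value is expected.
scores = [99.98, 4.45]
overall = sum(scores) / len(scores)
displayed_overall = 52.2190
print(f"Mean of rounded scores: {overall:.3f}% (displayed: {displayed_overall:.4f}%)")
```

Under these assumptions the recomputed ratio and deviation match the reported 77.89% and 7.89%, and the mean of the rounded scores lands within a hundredth of a point of the displayed overall score, which is consistent with the page averaging unrounded values.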