NC Bench
Run: 2026-07-01-01KWE8XHF3301CAR88H3F00NGE
Run Details
Model: Claude Sonnet 5 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 19.8s
Cost: $0.010956
Provider: Amazon Bedrock
Evaluation summary
Matches word count: 27.17%
Dialogue to Total Word Ratio: 0.02%
Overall: 13.5923%
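The page does not state how the overall score is aggregated. As a rough check, the sketch below assumes it is the unweighted mean of the two evaluator scores, and compares that mean with the reported 13.5923%. The function name `overall_score` is hypothetical, and the displayed evaluator scores are rounded to two decimals, so only approximate agreement can be expected.

```python
def overall_score(scores: list[float]) -> float:
    """Unweighted mean of evaluator scores (an assumed aggregation, not confirmed by the page)."""
    return sum(scores) / len(scores)


# Scores as displayed in the evaluation summary, in percent.
evaluator_scores = [27.17, 0.02]
reported_overall = 13.5923

mean = overall_score(evaluator_scores)

# Each displayed score is rounded to 2 decimals, so the mean of the displayed
# values can differ from the mean of the unrounded values by up to 0.005.
consistent = abs(mean - reported_overall) <= 0.005
print(f"mean of displayed scores: {mean:.4f}%, consistent with reported: {consistent}")
```

If the benchmark weights evaluators differently, this check would not apply.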
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The rain hammered against the windshield of the van, which was parked three houses down from the birthday party in question. Balloons—actual balloons, tied to a mailbox like some kind of taunting flag of surrender—bobbed cheerfully in the storm. Molly sat behind the wheel, chewing gum with the intensity of someone defusing a bomb, while Bill studied a crumpled piece of paper that may or may not have been an actual blueprint. "Explain to me again," Molly said, "why we're robbing an eight-year-old's birthday party." "Because," Bill said, tapping the paper, "Kevin Marsh Jr. is turning eight, and his grandfather is apparently some kind of toy magnate. Do you understand what that means? Every present in that house is going to be top-shelf. We're talking premium action figures. Possibly a drone." "Possibly a drone," Molly repeated flatly. "Possibly two drones." Molly stopped chewing. She turned to look at him, rain streaking down the glass behind her like the world itself was crying about the choices they were making. "Bill, we are professional thieves. We stole a Rembrandt last spring." "A print of a Rembrandt." "Still counts. And now you want us to steal Pokémon cards from a bounce house." "There's a bounce house?" Bill's eyes lit up like he'd just found out there was cake, which, to be fair, there probably was. "That's not the point." "It is a little bit the point." Molly dragged a hand down her face and looked out at the house, where streamers fluttered from the porch and a banner reading HAPPY 8TH BIRTHDAY KEVIN sagged wetly in the downpour. Somewhere inside, forty screaming children were no doubt hopped up on juice boxes and cake, blissfully unaware that two grown adults sat outside plotting their demise via gift theft. "Fine," she said. "Plan?" Bill straightened, suddenly serious, the way he always did right before revealing a plan that was almost certainly terrible. "We go in as entertainers. I've got a balloon animal kit. You do face painting." 
"I don't know how to face paint." "Nobody does, Molly. That's the beauty of it. Nobody questions a bad tiger." "And once we're in?" "We herd the kids toward the bounce house, distract the parents with the open bar—" "There's an open bar?" "It's a very nice neighborhood. Then we load the presents into the balloon animal bag." Molly blinked. "The bag of balloon animals is going to hold forty presents." "It's a big bag." "It is not a big enough bag, Bill." "We'll improvise." Molly sighed the sigh of a woman who had heard those words before, usually right before something caught fire. She reached into the back seat and pulled out a rubber tiger mask, considering it with the weary resignation of someone who had long ago stopped asking why her life had turned out this way. "If there's a clown in there already," she said, pulling the mask on, "I'm walking." "There's always a clown, Molly." "I hate this job." They stepped out into the rain, two thieves marching toward cake, chaos, and the faint, distant sound of a piñata being murdered.
created_at: "2026-07-01T07:22:26.671Z" (7/1/2026, 7:22:26 AM)
model: "anthropic/claude-sonnet-5-20260630"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 4766
moderation_latency: (null)
generation_time: 19764
tokens_prompt: 77
tokens_completion: 810
native_tokens_prompt: 93
native_tokens_completion: 1077
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "end_turn"
service_tier: "default"
usage: 0.010956
router: (null)
provider_responses
0
endpoint_id: "37b16f45-eeae-45f3-9b85-2137f7c6d61b"
id: "msg_01F4XaDEpJYLDTCnEgvUbriC"
is_byok: false
latency: 4766
model_permaslug: "anthropic/claude-sonnet-5-20260630"
provider_name: "Amazon Bedrock"
status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.17.0; linux; x64))"
http_referer: (null)
request_id: "req-1782890546-hdV9BYagHSBKfmVlQHtF"
session_id: (null)
preset_id: (null)
api_type: "completions"
id: "gen-1782890546-ta4ENEj59jGAXYtSSFKI"
upstream_id: "msg_01F4XaDEpJYLDTCnEgvUbriC"
total_cost: 0.010956
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
response_cache_source_id: (null)
data_region: "global"
Evaluation details
Matches word count
Result: 27.17%
Details: n/a
Meta Data: neededClean false, words 519
Dialogue to Total Word Ratio
Result: 0.02%
Details: Ratio: 43.10%, Deviation: 13.10%
Meta Data: neededClean false, wordsTotal 522, wordsDialogue 225
Overall: 13.5923%
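The ratio and deviation reported for the dialogue evaluator can be reproduced from the recorded counts (225 dialogue words out of 522 total, against the scenario's 30% target). The sketch below is a minimal illustration, not the benchmark's actual evaluator. The function names are hypothetical, and counting words inside straight double quotes as dialogue is an assumption. How the 13.10% deviation is converted into the 0.02% score is not shown on the page and is not attempted here.

```python
import re


def ratio_from_counts(words_dialogue: int, words_total: int, target: float = 0.30) -> tuple[float, float]:
    """Return (ratio, deviation from target), both in percent."""
    ratio = words_dialogue / words_total if words_total else 0.0
    return ratio * 100, abs(ratio - target) * 100


def count_dialogue_words(text: str) -> tuple[int, int]:
    """Count (dialogue words, total words), splitting on whitespace.

    Assumption: dialogue is any text enclosed in straight double quotes.
    """
    words_total = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    words_dialogue = sum(len(span.split()) for span in quoted_spans)
    return words_dialogue, words_total


# Counts taken from the evaluation details above.
ratio, deviation = ratio_from_counts(225, 522)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
```

To apply the same check to raw text, `count_dialogue_words(response_text)` can feed `ratio_from_counts`. A different tokenizer or quote convention would give different counts, which may explain why the two evaluators report 519 and 522 total words for the same response.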