NC Bench
Run: 2026-07-01-01KWE8TCXA03HRGFQAPVWNKW1Z
Run Details
Model: Claude Sonnet 5 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 16.6s
Cost: $0.011186
Provider: Amazon Bedrock
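The duration above, together with the token counts listed in the Meta section further down, gives a rough output throughput for this run. The sketch below assumes that `generation_time` is expressed in milliseconds (16513 is close to the 16.6s duration shown here) and that `native_tokens_completion` counts every token the provider generated; neither assumption is stated on the page.

```python
# Values copied from the run metadata further down the page.
generation_time_ms = 16513        # assumption: milliseconds
native_tokens_completion = 1100   # provider-side completion token count


def tokens_per_second(tokens: int, elapsed_ms: int) -> float:
    """Average output rate over the whole generation."""
    return tokens / (elapsed_ms / 1000)


rate = tokens_per_second(native_tokens_completion, generation_time_ms)
print(f"{rate:.1f} tokens/s")
```

Under those assumptions the run averaged roughly 66.6 tokens per second.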
Evaluation summary
Score | Evaluator
99.74% | Matches word count
0.00% | Dialogue to Total Word Ratio
49.8722% | Overall
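The Overall figure is consistent with an unweighted mean of the two evaluator scores. That aggregation rule is an assumption inferred from the numbers shown here, not something the page states. A minimal sketch:

```python
# Evaluator scores as displayed in the summary (percent, rounded to two decimals).
scores = {
    "Matches word count": 99.74,
    "Dialogue to Total Word Ratio": 0.00,
}

# Assumption: Overall is the unweighted mean of the evaluator scores.
overall = sum(scores.values()) / len(scores)

print(f"{overall:.2f}%")
```

The displayed Overall of 49.8722% carries more digits than this, which suggests the word count score is stored unrounded (about 99.7444%) before averaging.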
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The back room of Bill's Pawn & Loan smelled like mothballs and bad decisions. Molly spread the blueprints across the counter, her finger tracing a line toward what appeared to be a bounce house. "A six-year-old's birthday party," Bill said, staring at the plans like they might bite him. "You want us to rob a six-year-old's birthday party." "Not the kid," Molly said. "The presents. There's a difference." "Is there?" "Bill, there's gonna be like forty gifts on that table. Legos, video games, probably a bike. That's real money." Bill rubbed his face. "We're professionals, Molly. We hit banks. We hit jewelry stores. We do not hit Chuck E. Cheese birthday parties." "It's not at Chuck E. Cheese, it's at a private residence, and would you keep your voice down? Someone might hear you complaining about robbing children." "I'm not robbing children, I'm complaining about robbing children, there's a difference." "That's literally what I said about the presents." Bill picked up the blueprint again, squinting. "There's a clown marked on here. Why is there a clown marked on here?" "Security detail." "The clown is security?" "The clown is a distraction. For us. We dress as clowns, nobody questions two more clowns at a kid's party." "I am not wearing a rainbow wig." "You're not wearing a rainbow wig, I'm wearing a rainbow wig, you're wearing the sad hobo clown, the one with the tears painted on." "Why do I get the sad one?" "Because you have a face for it, Bill." Bill opened his mouth, closed it, opened it again. "Fine. Fine! But how do we get forty presents out without thirty screaming children tackling us?" "Cake." "Cake?" "We time it with the cake. Everybody's cake-drunk, sugar-blind, eyes on the candles. That's our window. In, bag the gifts, out through the side gate before anyone notices the pile by the tree looks a little thin." "A little thin? Molly, we're taking all of them." "We're taking most of them. I'm leaving the one from Grandma. It's clearly a sweater." 
"That's very generous of you." "I have a code." Bill sat down heavily on a stool that creaked in protest. "This is insane. We're gonna get caught by a bouncy castle attendant." "His name's Gary and he's seventeen and he's on his phone the entire time, I've cased this for a week." "You've cased a birthday party for a week." "I take my work seriously." "You take candy from babies seriously, is what you take seriously." "Bill, the birthday boy is turning six, he already has a PlayStation, a trampoline, and a golden retriever named Biscuit. He does not need forty more presents. We're basically doing charity work in reverse." Bill stared at her for a long moment, then reached for the rainbow wig sitting on the counter and turned it over in his hands like it owed him money. "If we get caught," he said, "I'm blaming the clown costume." "That's fair," Molly said. "That's fair."
created_at: "2026-07-01T07:20:43.704Z" (7/1/2026, 7:20:43 AM)
model: "anthropic/claude-sonnet-5-20260630"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 3233
moderation_latency: (null)
generation_time: 16513
tokens_prompt: 77
tokens_completion: 824
native_tokens_prompt: 93
native_tokens_completion: 1100
native_tokens_completion_images: (null)
native_tokens_reasoning: 49
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "end_turn"
service_tier: "default"
usage: 0.011186
router: (null)
provider_responses:
  0:
    endpoint_id: "37b16f45-eeae-45f3-9b85-2137f7c6d61b"
    id: "msg_01Umh7v9C62CUgEbQPLWJ41K"
    is_byok: false
    latency: 1532
    model_permaslug: "anthropic/claude-sonnet-5-20260630"
    provider_name: "Amazon Bedrock"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.17.0; linux; x64))"
http_referer: (null)
request_id: "req-1782890443-aC4TfTvAEutsmXaJFTnF"
session_id: (null)
preset_id: (null)
api_type: "completions"
id: "gen-1782890443-1oeXJItaKgPExtUCw4ZI"
upstream_id: "msg_01Umh7v9C62CUgEbQPLWJ41K"
total_cost: 0.011186
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
response_cache_source_id: (null)
data_region: "global"
Evaluation details
Result | Evaluator | Details | Meta Data
99.74% | Matches word count | n/a | neededClean: false, words: 496
0.00% | Dialogue to Total Word Ratio | Ratio: 77.69%, Deviation: 27.69% | neededClean: false, wordsTotal: 502, wordsDialogue: 390
49.8722% | Overall
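The reported ratio and deviation follow directly from the word counts in the meta data: 390 dialogue words out of 502 total, measured against the 50% target in the scenario. The sketch below reproduces that arithmetic and shows one possible way to count dialogue words. The `dialogue_stats` helper, its whitespace tokenization, and its straight-double-quote rule are illustrative assumptions, not the NC Bench evaluator, and the sample sentence is made up for the example.

```python
import re


def dialogue_stats(text: str, target: float = 0.5) -> dict:
    """Count words inside straight double quotes versus all words.

    Hypothetical helper: splits on whitespace and treats every
    "..." span as dialogue. The real evaluator may differ.
    """
    total_words = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue_words / total_words if total_words else 0.0
    return {
        "wordsTotal": total_words,
        "wordsDialogue": dialogue_words,
        "ratio": ratio,
        "deviation": abs(ratio - target),
    }


# Made-up sample sentence, only to exercise the helper.
sample = 'Bill sighed. "I am not wearing a rainbow wig," he said.'
stats = dialogue_stats(sample)
print(stats["wordsDialogue"], "of", stats["wordsTotal"])

# Figures reported for this run.
reported_ratio = 390 / 502
reported_deviation = abs(reported_ratio - 0.5)
print(f"Ratio: {reported_ratio:.2%}, Deviation: {reported_deviation:.2%}")
```

The two evaluators report different totals for the same response (496 and 502 words), so their word-splitting rules evidently differ; the whitespace split used here is only one possible rule. The page also does not say how a 27.69% deviation maps to a 0.00% score, so that step is left out.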