NC Bench
Run: 2026-07-01-01KWE8Y60FG3V6W75QN50KFH01
Run Details
Model: Claude Sonnet 5 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 16.7s
Cost: $0.011456
Provider: Amazon Bedrock
Evaluation summary
Score | Evaluator | Details Link
27.17% | Matches word count | (details)
30.15% | Dialogue to Total Word Ratio | (details)
28.6564% | Overall
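The overall score is consistent with a plain average of the two evaluator scores, although the page does not state how it is aggregated. The sketch below checks that reading; the equal weighting and the name `overall_score` are assumptions, and the small gap from the reported 28.6564% would be explained by the per-evaluator scores being displayed rounded to two decimals.

```python
from statistics import mean

# Per-evaluator scores as displayed in the evaluation summary (rounded).
evaluator_scores = {
    "Matches word count": 27.17,
    "Dialogue to Total Word Ratio": 30.15,
}

# Assumption: the overall score is an unweighted mean of the evaluators.
overall_score = mean(evaluator_scores.values())

reported_overall = 28.6564  # value shown on the page
difference = abs(overall_score - reported_overall)

print(f"Mean of displayed scores: {overall_score:.4f}%")
print(f"Difference from reported overall: {difference:.4f}")
```

A difference below 0.005 is within what rounding of the two inputs can produce, so the data here does not contradict the averaging assumption, but it does not confirm it either.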
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The back booth of Denny's smelled like burnt coffee and bad decisions, which suited Molly and Bill just fine. "So let me get this straight," Molly said, sliding the crayon drawing across the table. "We're stealing presents from a six-year-old's birthday party." "Not just any six-year-old," Bill said, tapping the paper with a greasy finger. "Timmy Fitzgerald. Son of Reginald Fitzgerald, hedge fund manager, owner of a wine cellar worth more than my apartment." "Bill, there's a unicorn on this drawing wearing sunglasses." "That's not relevant to the operation." "There's also a stick figure labeled 'Daddy's Guard Dog' that's clearly a golden retriever named Biscuit." "Biscuit is thirteen years old and has hip dysplasia. He's not a factor." Molly leaned back, crossing her arms. "How do you even know all this?" "I did my research." "You stalked a children's birthday party." "I did *reconnaissance*." "You're going to prison for reconnaissance, Bill." "The presents, Molly. Focus on the presents. We're talking PlayStations, tablets, some kid probably got a literal pony because his father doesn't understand the concept of 'too much.'" "So the plan is what, exactly? We just walk in?" "We walk in dressed as clowns." Molly stared at him. "Clowns." "Every kid's party needs entertainment. Nobody questions clowns." "Bill, every child alive is terrified of clowns. That's not camouflage, that's a horror movie premise." "Fine, we go as the bounce house delivery guys." "We don't have a bounce house." "We could rent one." "That defeats the entire purpose of *stealing* things, Bill! We'd be spending money to steal presents from a child!" Bill considered this, chewing his lip. "What if we're just late for the pizza delivery?" "Do we have pizza?" "We could rent that too—" "We are not renting pizza!" A waitress approached, refilled their coffees without making eye contact, clearly trained to ignore whatever conversations happened in this booth. She left. Molly waited until she was gone. 
"Here's what we do," Molly said, pulling out her own notepad, considerably less crayon-adjacent. "We show up as face painters. Cheap costumes, easy exit, and it puts us directly next to the gift table for two hours." Bill blinked. "That's... actually good." "I know it's good, I'm the one with the plan that doesn't involve inflatable structures." "What about Biscuit?" "Bill, I promise you, the dog is not the problem." "He could bite us." "He has hip dysplasia. He can barely make it to his water bowl." "I read that dogs can sense fear." "I read that dogs can sense stupidity, so you might want to stay in the car." Bill sat back, mildly offended, then shrugged it off entirely, because Bill's feelings had the shelf life of a mayfly. "Face painters. I like it. What do we paint on them?" Molly smiled for the first time all evening. "Dollar signs. Right on their little cheeks. Let their parents wonder why later."
created_at
"2026-07-01T07:22:47.706Z"
7/1/2026, 7:22:47 AM
model
"anthropic/claude-sonnet-5-20260630"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
2489
moderation_latency
(null)
generation_time
16658
tokens_prompt
77
tokens_completion
824
native_tokens_prompt
93
native_tokens_completion
1127
native_tokens_completion_images
(null)
native_tokens_reasoning
29
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"end_turn"
service_tier
"default"
usage
0.011456
router
(null)
provider_responses
0
endpoint_id
"37b16f45-eeae-45f3-9b85-2137f7c6d61b"
id
"msg_011hPCrEXfquzKD5XuAAge54"
is_byok
false
latency
1731
model_permaslug
"anthropic/claude-sonnet-5-20260630"
provider_name
"Amazon Bedrock"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.17.0; linux; x64))"
http_referer
(null)
request_id
"req-1782890567-7PwS6paGUrQkmpbMfNJM"
session_id
(null)
preset_id
(null)
api_type
"completions"
id
"gen-1782890567-Jrh38N7UstG5CHRRsPR6"
upstream_id
"msg_011hPCrEXfquzKD5XuAAge54"
total_cost
0.011456
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Amazon Bedrock"
response_cache_source_id
(null)
data_region
"global"
Evaluation details
Result | Evaluator | Details | Meta Data
27.17% | Matches word count | n/a | neededClean: false, words: 481
30.15% | Dialogue to Total Word Ratio | Ratio: 74.90%, Deviation: 4.90% | neededClean: false, wordsTotal: 486, wordsDialogue: 364
28.6564% | Overall
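The ratio and deviation reported for the Dialogue to Total Word Ratio evaluator can be reproduced from the word counts above. This is a minimal sketch, not the benchmark's own code: the function names are hypothetical, the 70% target is taken from the scenario description, and the deviation is assumed to be the absolute difference from that target. How the evaluator counts words, or how it turns the deviation into the 30.15% score, is not shown on this page and is not modeled here.

```python
def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Return dialogue words as a percentage of all words."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return 100.0 * words_dialogue / words_total


def deviation_from_target(ratio_pct: float, target_pct: float = 70.0) -> float:
    """Absolute distance, in percentage points, from the target ratio."""
    return abs(ratio_pct - target_pct)


# Counts taken from the evaluator's meta data for this run.
ratio = dialogue_ratio(words_dialogue=364, words_total=486)
deviation = deviation_from_target(ratio)

print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
# prints "Ratio: 74.90%, Deviation: 4.90%"
```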