NC Bench
Run: 2026-07-01-01KWE8Y6Q8RPY40ADJTJ6ZW5Z6
Run Details
Model: Claude Sonnet 5 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 16.6s
Cost: $0.011016
Provider: Amazon Bedrock
Evaluation summary
Score      Evaluator
60.28%     Matches word count
3.86%      Dialogue to Total Word Ratio
32.0667%   Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The back room of the Rusty Spoon smelled like stale beer and Bill's cologne, which was somehow worse. Molly spread the blueprints across the table, weighing down the corners with shot glasses. "So it's a Frozen theme," she said. "Elsa, Anna, the whole ice palace nonsense. Kid's turning seven." "Seven presents worth stealing from a seven-year-old's birthday party. Molly, are we really doing this? We robbed a bank last month." "The bank had four hundred dollars in the vault, Bill. Four hundred dollars! You know what's in that house right now? Forty-three presents. I counted the delivery trucks." "You counted delivery trucks." "I have a system." Bill rubbed his temples. "We're going to steal birthday presents from children. From a seven-year-old named—what's her name?" "Madison." "From Madison. This is what we've become." "This is what we've always been, Bill. We're thieves. This is just Tuesday with better wrapping paper." He picked up the blueprint, squinting at it upside down before Molly flipped it right-side up without a word. "There's gonna be parents everywhere. Bouncy castle. Probably a clown." "I've handled clowns before." "You've handled one clown, and he sued you." "He shouldn't have grabbed my crowbar." "You shouldn't have brought a crowbar to a children's party." "I bring a crowbar everywhere, Bill, it's not personal to the clown." Bill sat down heavily, the chair groaning under years of bad decisions. "How do we even get past the bouncy castle guy?" "There is no bouncy castle guy. It's self-inflate. I checked." "You checked the bouncy castle specs." "I have a system, Bill." "Your system is insane." "My system got us out of that jewelry store with our kneecaps intact, so maybe show some respect." He sighed, the particular sigh of a man who'd lost this argument before it started. "Fine. Say we get in. Say we dodge Madison's aunt Carol, who I guarantee is watching that present table like a hawk. 
How do we carry forty-three presents out without looking like we're carrying forty-three presents?" Molly grinned, reaching under the table for the duffel bags. "Pizza delivery." "We're stealing presents disguised as pizza delivery." "Big boxes. Nobody questions big boxes." "They're gonna question two adults showing up with forty pizzas to a seven-year-old's birthday party." "I'll say it's for the parents." "Molly." "What?" "There's a clown." "I told you, I've handled—" "He remembers you, Molly. You said his name was Bobo Sixty-Nine or something insane like that and he screamed it in the courtroom." Molly's grin faltered for just a second. "Different party. Different clown." "You don't know that." "I choose to believe that." Bill dropped his head into his hands. "We're going to prison for stuffed animals." "We're going to be rich from stuffed animals. There's a Barbie Dreamhouse on that list, Bill. Do you know what those go for on the secondary market?" "I hate that you know that." "I have a system."
created_at
"2026-07-01T07:22:48.437Z"
7/1/2026, 7:22:48 AM
model
"anthropic/claude-sonnet-5-20260630"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1758
moderation_latency
(null)
generation_time
16517
tokens_prompt
77
tokens_completion
808
native_tokens_prompt
93
native_tokens_completion
1083
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"end_turn"
service_tier
"default"
usage
0.011016
router
(null)
provider_responses
0
endpoint_id
"37b16f45-eeae-45f3-9b85-2137f7c6d61b"
id
"msg_01GWd8LBsNDqrtkvfBmMpsxb"
is_byok
false
latency
1758
model_permaslug
"anthropic/claude-sonnet-5-20260630"
provider_name
"Amazon Bedrock"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.17.0; linux; x64))"
http_referer
(null)
request_id
"req-1782890568-oOqc6rIX20DmxY7n7pU0"
session_id
(null)
preset_id
(null)
api_type
"completions"
id
"gen-1782890568-UTvDx0gYtQTU1p52JAeV"
upstream_id
"msg_01GWd8LBsNDqrtkvfBmMpsxb"
total_cost
0.011016
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Amazon Bedrock"
response_cache_source_id
(null)
data_region
"global"
Evaluation details
Result     Evaluator                      Details                           Meta Data
60.28%     Matches word count             n/a                               neededClean: false; words: 485
3.86%      Dialogue to Total Word Ratio   Ratio: 78.07%, Deviation: 8.07%   neededClean: false; wordsTotal: 497; wordsDialogue: 388
32.0667%   Overall
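The reported ratio and deviation can be reproduced from the meta data counts. The sketch below is illustrative only: the page does not show the benchmark's actual evaluator, so the quote-matching rule in `dialogue_words`, the whitespace word split, and all function names are assumptions. Only the counts (388 and 497) and the 70% target come from this run.

```python
import re

TARGET_RATIO = 0.70  # from the scenario: "Write 500 words with 70% dialogue"


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (assumed word definition)."""
    return len(text.split())


def dialogue_words(text: str) -> int:
    """Count words inside straight double quotes (assumed dialogue rule)."""
    spans = re.findall(r'"([^"]*)"', text)
    return sum(count_words(span) for span in spans)


def ratio_and_deviation(words_dialogue: int, words_total: int,
                        target: float = TARGET_RATIO) -> tuple[float, float]:
    """Return the dialogue share and its absolute distance from the target."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


# Counts taken from the Meta Data column of this run.
ratio, deviation = ratio_and_deviation(388, 497)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
```

With the run's counts this gives the same figures as the Details column. The two evaluators report different totals (485 words versus 497), so they presumably count words differently. The mapping from an 8.07% deviation to a 3.86% score is not shown on the page and is not modeled here.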