NC Bench
Run: 2026-07-01-01KWE9ZYC4YQVMD7JJ279SS6B5
Run Details
Model: Claude Sonnet 5 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 18.2s
Cost: $0.011056
Provider: Amazon Bedrock
Evaluation summary
Score | Evaluator
99.98% | Matches word count
0.00% | Dialogue to Total Word Ratio
49.9920% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The rain hammered against the window of the van as Molly spread the blueprints across her lap, though calling them "blueprints" was generous—it was actually a napkin with a crayon drawing that a six-year-old had abandoned at the diner they'd been casing from. "So the bounce house is here," Molly said, jabbing at a purple squiggle, "which means the present table is probably behind it, near the cake." Bill squinted at the napkin like it might reveal state secrets. "Molly, this is a drawing of what appears to be a dinosaur eating a birthday cake." "It's symbolic." "Symbolic of what? Our impending arrest?" Molly snatched the napkin back and smoothed it against her knee with the seriousness of a general reviewing troop movements. "Look, Bill, we've robbed banks. We've robbed a wax museum. We once stole a yacht from a man who was still standing on it." "That was a bad day for everyone involved." "My point is, we are professionals. A seven-year-old's birthday party is not going to be our Waterloo." Bill adjusted his rearview mirror, checking his fake mustache for the fourth time in ten minutes. It was slightly crooked, giving him the appearance of a confused walrus. "There's going to be parents there, Molly. Actual, alert, caffeinated parents. Not sleepy bank tellers. Not underpaid security guards. Soccer moms with pepper spray and a grudge." "Which is why we're going in as clowns." "I hate being a clown." "You hate being a clown because last time you scared a child so badly he needed therapy." "He startled easily!" "Bill, you popped out of a cake wielding a rubber chicken and yelling 'SURPRISE, MOTHERF—'" "I self-corrected mid-sentence!" "The damage was done." Bill sighed, tugging at his oversized rainbow wig, which smelled faintly of mothballs and questionable life choices. "Fine. Clowns. But if this goes sideways, I want it on record that this was your idea, and that I specifically warned you children have surprisingly strong bites." "Duly noted." 
Molly reached into the back seat, grabbing two duffel bags stuffed with balloon animals, party favors, and, discreetly tucked beneath a fake bottom, several sacks meant explicitly for loot. "Now remember, the plan is simple. We do a magic show, we make the presents 'disappear' as part of the act, and then we disappear too." "That's not a plan, that's a metaphor." "It's a metaphor with a getaway car." Bill turned the key, and the van rumbled reluctantly to life, sputtering black smoke like it, too, doubted this decision. "For the record," he muttered, "I still think this is the dumbest heist we've ever done." "Worse than the yacht?" "Nothing will ever top the yacht." "Then let's make some memories, partner." Molly grinned, popping open a can of confetti spray she intended to use as a diversionary weapon. "Six-year-olds won't know what hit them." "That's exactly what worries me," Bill muttered, pulling out into traffic, wig askew, mustache drooping, off toward a birthday party that had absolutely no idea what was coming.
created_at: "2026-07-01T07:41:13.997Z" (7/1/2026, 7:41:13 AM)
model: "anthropic/claude-sonnet-5-20260630"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1692
moderation_latency: (null)
generation_time: 18185
tokens_prompt: 77
tokens_completion: 797
native_tokens_prompt: 93
native_tokens_completion: 1087
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "end_turn"
service_tier: "default"
usage: 0.011056
router: (null)
provider_responses
0
endpoint_id: "37b16f45-eeae-45f3-9b85-2137f7c6d61b"
id: "msg_01Kn7sBhd49N3CAGmEnDwMDJ"
is_byok: false
latency: 1692
model_permaslug: "anthropic/claude-sonnet-5-20260630"
provider_name: "Amazon Bedrock"
status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.17.0; linux; x64))"
http_referer: (null)
request_id: "req-1782891674-ytZ18nn5I3dtj5vr5w9H"
session_id: (null)
preset_id: (null)
api_type: "completions"
id: "gen-1782891674-t126kds6wmJ1lJ5qxSQ6"
upstream_id: "msg_01Kn7sBhd49N3CAGmEnDwMDJ"
total_cost: 0.011056
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
response_cache_source_id: (null)
data_region: "global"
Evaluation details
Result | Evaluator | Details | Meta Data
99.98% | Matches word count | n/a | neededClean: false, words: 502
0.00% | Dialogue to Total Word Ratio | Ratio: 57.06%, Deviation: 27.06% | neededClean: false, wordsTotal: 510, wordsDialogue: 291
49.9920% | Overall
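The reported ratio, deviation, and overall score appear consistent with simple arithmetic on the metadata above. The following is a minimal sketch, not the benchmark's actual evaluator code. It assumes the ratio is wordsDialogue divided by wordsTotal, that the deviation is the absolute difference from the 30% target named in the scenario, and that the overall score is the unweighted mean of the two evaluator scores. All variable names are hypothetical.

```python
# Values copied from the evaluation details above.
words_dialogue = 291
words_total = 510
target_ratio = 0.30  # from the scenario: "Write 500 words with 30% dialogue"

# Assumption: ratio = dialogue words / total words.
ratio = words_dialogue / words_total

# Assumption: deviation = absolute distance from the target ratio.
deviation = abs(ratio - target_ratio)

print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# Assumption: overall = unweighted mean of the displayed evaluator scores.
# 99.98 is a rounded display value, so the estimate only matches the
# reported 49.9920% to about two decimal places.
evaluator_scores = [99.98, 0.00]
overall_estimate = sum(evaluator_scores) / len(evaluator_scores)

print(f"Overall (estimate): {overall_estimate:.2f}%")
```

Under these assumptions the ratio and deviation reproduce the reported 57.06% and 27.06%. How the evaluator maps a deviation to a score (here 0.00%), and why the two evaluators count a different number of total words (502 versus 510), is not stated in the report and is not modeled here.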