NC Bench
Run: 2026-07-01-01KWE97G60VF1835V1HYGF1JFP
Run Details
Model: Claude Sonnet 5
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 20.3s
Cost: $0.010946
Provider: Amazon Bedrock
Evaluation summary
Score | Evaluator
81.27% | Matches word count
0.04% | Dialogue to Total Word Ratio
40.6579% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The van smelled like feet and stale donuts, which Bill insisted was "atmospheric" and Molly insisted was "disgusting." They sat parked across from a house strangled in balloons, a giant inflatable dinosaur guarding the front lawn like some kind of neon sentinel. "Okay," Bill said, spreading a hand-drawn map across the dashboard. It looked like it had been drawn by a toddler having a seizure. "This is the house. This is the backyard. This squiggle is the bounce house." "Why does the bounce house have a face?" "I panicked." Molly rubbed her temples. She'd robbed banks. She'd robbed casinos. She'd once robbed a man's dignity by convincing him his own wife had hired her to test his home security, which technically wasn't a lie. But this—this felt different. Lower. Grubbier. And yet the payout was too good to pass up: some tech billionaire's kid was turning seven, and the presents piling up in that living room were rumored to include a solid gold Nintendo Switch, actual gold, because apparently regular gold wasn't tacky enough. "We're stealing from children, Bill." "We're stealing from children's *parents*," Bill corrected, offended. "The kids won't even remember what they got. I didn't remember my seventh birthday. I don't even remember most of my thirties." "That explains a lot." "Focus. The plan is simple. I go in dressed as the entertainment—" "You're dressed as a clown." "I'm dressed as *Zippy the Balloon Wizard*, and I take requests. While the parents are watching me twist balloon dogs, you slip in through the side door, grab the presents, load them into the wheelbarrow—" "The wheelbarrow." "It was the only thing that fit in the van." "You brought a *wheelbarrow* to a heist." "It has a squeaky wheel, which honestly adds to the stealth, because everyone will assume it's part of the bounce house sound effects." 
Molly stared at him for a long, silent moment, the kind of stare usually reserved for people who tell you they've invested their savings in a shrimp-based cryptocurrency. "There is something deeply wrong with you." "Thank you." She sighed, checked her watch, adjusted the fake mustache she'd be wearing as "cousin Debra's boyfriend, Gary," and popped open the van door. The dinosaur balloon swayed ominously in the wind, as if warning them to turn back. They ignored it. Thieves always ignore the omens; it's basically a professional requirement. "Just remember," Molly said, hoisting a duffel bag of zip ties and wrapping paper for restaging, "if any kid asks you for a balloon giraffe, you say no. We are not here to enrich anyone's childhood." "What if they cry?" "Then you make a balloon giraffe, Bill, obviously. We're thieves, not monsters." Bill grinned, snapped on his rainbow wig, and marched toward the house whistling a tune that sounded suspiciously like a funeral march remixed for kazoo. Somewhere behind them, the wheelbarrow squeaked once, ominous and squeaky, like the world's saddest doorbell.
created_at
"2026-07-01T07:27:53.035Z"
7/1/2026, 7:27:53 AM
model
"anthropic/claude-sonnet-5-20260630"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
3288
moderation_latency
(null)
generation_time
20222
tokens_prompt
77
tokens_completion
778
native_tokens_prompt
93
native_tokens_completion
1076
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"end_turn"
service_tier
"default"
usage
0.010946
router
(null)
provider_responses
0
endpoint_id
"37b16f45-eeae-45f3-9b85-2137f7c6d61b"
id
"msg_01NczDKHtJJdGurxAtweSrQu"
is_byok
false
latency
3288
model_permaslug
"anthropic/claude-sonnet-5-20260630"
provider_name
"Amazon Bedrock"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.17.0; linux; x64))"
http_referer
(null)
request_id
"req-1782890873-9V2uAxgK0tZ4DrXerlGO"
session_id
(null)
preset_id
(null)
api_type
"completions"
id
"gen-1782890873-ostCS7qCDNTt7Gtdlm2q"
upstream_id
"msg_01NczDKHtJJdGurxAtweSrQu"
total_cost
0.010946
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Amazon Bedrock"
response_cache_source_id
(null)
data_region
"global"
Evaluation details
Result | Evaluator | Details | Meta Data
81.27% | Matches word count | n/a | neededClean: false, words: 488
0.04% | Dialogue to Total Word Ratio | Ratio: 42.45%, Deviation: 12.45% | neededClean: false, wordsTotal: 490, wordsDialogue: 208
40.6579% | Overall | |
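The reported ratio and deviation can be reproduced from the word counts above. The sketch below is a minimal illustration, not the benchmark's own evaluator code: the function names are hypothetical, the 30% target is taken from the scenario text, and treating the overall score as the plain mean of the two evaluator scores is an assumption that happens to match the reported 40.6579% to within rounding. How a 12.45-point deviation is mapped to the 0.04% score is not shown on this page, so that step is left out.

```python
# Hypothetical helpers; names and structure are assumptions, not NC Bench code.

def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Return the share of dialogue words as a percentage of all words."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return 100.0 * words_dialogue / words_total


def deviation_from_target(ratio_pct: float, target_pct: float) -> float:
    """Return the absolute gap, in percentage points, from the target ratio."""
    return abs(ratio_pct - target_pct)


# Values copied from the evaluation details (wordsDialogue, wordsTotal).
ratio = dialogue_ratio(208, 490)
deviation = deviation_from_target(ratio, 30.0)  # 30% target from the scenario

# Assumption: the overall score is the unweighted mean of the evaluator scores.
# The inputs here are the rounded values shown on the page.
overall = (81.27 + 0.04) / 2

print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
print(f"Mean of evaluator scores: {overall:.3f}%")
```

With the rounded inputs the mean comes out near 40.655%, which is consistent with the reported overall figure if the page averages unrounded scores.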