NC Bench
Run: 2026-07-01-01KWEAAAH13XGJF28Q0S3835FJ
Run Details
Model: Claude Sonnet 5 (Reasoning, Low)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 16.2s
Cost: $0.010256
Provider: Amazon Bedrock
Evaluation summary
Score | Evaluator
0.00% | Matches word count
4.25% | Dialogue to Total Word Ratio
2.1251% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The warehouse smelled like old rope and regret, which Bill figured was appropriate given their line of work. He spread the blueprints across a crate while Molly picked at a hangnail with the focus of a surgeon. "So the party's at 2 PM," Bill said. "Kid's turning seven. Bouncy castle, clown, the whole nightmare." "A clown," Molly said. "You didn't tell me there'd be a clown." "Does it matter?" "Bill, clowns see everything. It's in their contract or something. Ancient clown law." "There's no ancient clown law." "How do you know? You a clown historian now?" Bill rubbed his temples. "Can we focus? The presents are gonna be stacked on a table by the cake. We slip in during the pin-the-tail-on-the-donkey game, grab the loot, slip out." "Slip out how? Through the bouncy castle?" "Through the side gate, obviously." "Because that's not suspicious at all. Two grown adults casually strolling out of a seven-year-old's birthday party carrying forty wrapped presents." "We'll say we're catering." "Catering what, felonies?" Bill exhaled slowly, the way he did when he was regretting his life choices, which was often. "Look, we case the yard, we time the games, we're in and out in twelve minutes." "Twelve minutes to steal a mountain of Legos and, like, one bad sweater from Grandma." "There could be an iPad in there. Kids get iPads now." "For their SEVENTH birthday?" "Times have changed, Molly." "You know what hasn't changed? Us robbing children. That's a new low even for us." "We're not robbing the children. We're robbing the presents. Completely different." Molly stared at him. "That is not completely different. That is the same thing with extra steps." "The presents aren't sentient, Molly. They don't have feelings." "The KIDS have feelings, Bill! You're going to make some seven-year-old cry on his birthday!" "He'll get over it. Kids are resilient. It's basically their whole thing." Molly threw up her hands. "Fine. FINE. But if that clown looks at me funny, I'm bailing." "Deal." 
"And if there's a pony, I want the pony." "We are not stealing a pony from a birthday party." "Then I want hazard pay for pony-adjacent trauma." "You don't even know what that means." "Neither do you, but it sounds official." Bill folded up the blueprint, already exhausted, and they hadn't even left the warehouse yet. "Just wear the caterer outfit and stay quiet." "I'm always quiet." "You're never quiet." "I was quiet that one time in Reno." "You screamed at a pigeon." "It came at me aggressively!" "It was eating a chip." "It was eating MY chip, Bill. There's a difference." He grabbed his jacket, already dreading whatever chaos the afternoon held, clown law and all.
created_at: "2026-07-01T07:46:54.125Z" (7/1/2026, 7:46:54 AM)
model: "anthropic/claude-sonnet-5-20260630"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1751
moderation_latency: (null)
generation_time: 16137
tokens_prompt: 77
tokens_completion: 736
native_tokens_prompt: 93
native_tokens_completion: 1007
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "end_turn"
service_tier: "default"
usage: 0.010256
router: (null)
provider_responses: 0
endpoint_id: "37b16f45-eeae-45f3-9b85-2137f7c6d61b"
id: "msg_01RZE899PPuDtxzf4r41FKb2"
is_byok: false
latency: 1751
model_permaslug: "anthropic/claude-sonnet-5-20260630"
provider_name: "Amazon Bedrock"
status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.17.0; linux; x64))"
http_referer: (null)
request_id: "req-1782892014-YZKN2EqARsFVRfmit2Cx"
session_id: (null)
preset_id: (null)
api_type: "completions"
id: "gen-1782892014-qo2InWKSXLP731QUM0Z9"
upstream_id: "msg_01RZE899PPuDtxzf4r41FKb2"
total_cost: 0.010256
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
response_cache_source_id: (null)
data_region: "global"
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 448
4.25% | Dialogue to Total Word Ratio | Ratio: 77.95%, Deviation: 7.95% | neededClean: false, wordsTotal: 458, wordsDialogue: 357
2.1251% | Overall
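
The ratio and deviation reported for the Dialogue to Total Word Ratio evaluator can be reproduced from the wordsDialogue and wordsTotal values. The sketch below is not NC Bench's evaluator code. It assumes the ratio is dialogue words divided by total words, and that the deviation is the absolute distance from the 70% target named in the scenario. The function names are hypothetical. How the evaluator turns a 7.95% deviation into a 4.25% score is not shown on this page, so that step is left out.

```python
# Hypothetical helpers: these names are illustrative, not NC Bench's API.

def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Return dialogue words as a percentage of total words."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return 100.0 * words_dialogue / words_total


def deviation_from_target(ratio_pct: float, target_pct: float = 70.0) -> float:
    """Return the absolute distance from the target, in percentage points.

    Assumption: the 70% target comes from the scenario text, and the
    deviation is a plain absolute difference.
    """
    return abs(ratio_pct - target_pct)


# Values taken from the evaluation details of this run.
ratio = dialogue_ratio(words_dialogue=357, words_total=458)
deviation = deviation_from_target(ratio)

print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
# prints "Ratio: 77.95%, Deviation: 7.95%"
```

The Overall figure of 2.1251% is close to the mean of the two evaluator scores, (0.00 + 4.25) / 2 = 2.125, which would fit an unrounded ratio score slightly above 4.25%. That reading is an inference from the displayed numbers, not something the page states.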