NC Bench
Run: 2026-07-01-01KWEA0NE6GW8R9B5C1M9VPD3S
Run Details
Model: Claude Sonnet 5 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 18.0s
Cost: $0.011486
Provider: Amazon Bedrock
Evaluation summary
Score      Evaluator
51.93%     Matches word count
0.00%      Dialogue to Total Word Ratio
25.9628%   Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The lock-up smelled like motor oil and stale doughnuts, which was about right, since Bill kept his getaway van parked next to a box of them he refused to share. Molly sat cross-legged on a crate, studying a crayon drawing of a birthday cake like it was a bank vault schematic. "Explain to me," she said, "why we're casing a six-year-old's birthday party." Bill leaned over her shoulder, tapping the drawing with a grease-stained finger. "Because, Molly, that's not just cake. That's forty presents. Forty. I counted the invite list. Rich kid, rich parents, rich friends with rich parents who feel guilty about being rich." "You stole an invitation to a child's birthday party." "I intercepted it. There's a difference." Molly rubbed her temples. "Bill, we rob banks. We rob armored trucks. We once robbed a wedding, and I still feel bad about the cake topper." "This is basically the same thing, except easier, because the security is a bouncy castle and a guy in a dinosaur costume." "That's called a piñata guardian, Bill. He could have a machete in there." "It's a foam tail. Calm down." Molly stood, pacing between crates of questionable inventory—a disco ball, a stuffed emu, three traffic cones. "So what's the plan? We just walk in, dressed as clowns, and grab the loot bag?" "Better." Bill pulled out a folded blueprint that was actually a party invitation with arrows drawn in Sharpie. "We come in as the entertainment. I play the magician. You're my glamorous assistant." "Absolutely not." "You'll wear sequins." "I said absolutely not." "Fine, I'll wear the sequins." Molly snorted despite herself. "And while you're pulling scarves out of your sleeve, I sneak the presents into the van?" "Exactly. Kids are distracted by magic. Parents are distracted by wine. Nobody notices forty gift bags disappearing, because nobody counts presents at a party. It's basic psychology." "It's basic psychology because you read one article about magicians." "I read the whole website." 
Molly crossed her arms. "And what happens when Timmy—" "Tucker." "—Tucker notices his enormous pile of gifts is gone before he's even opened them?" Bill grinned, far too pleased with himself. "That's the beauty part. We leave one gift. A little robot. Kid'll be so hyped about the robot, he won't even remember the other thirty-nine." "You're betting the entire heist on a six-year-old's attention span." "Have you met a six-year-old? It's basically our best chance in the business." Molly sighed, already reaching for the box of sequins Bill had somehow procured from god knew where. "If we get caught, I'm blaming you. Loudly. In front of the dinosaur." "Deal," Bill said, throwing a scarf around his neck like a superhero cape. "Now, help me practice my card trick. If I can nail the disappearing queen, the disappearing presents will be no problem at all." "That's not how magic works." "That's exactly how magic works."
created_at
"2026-07-01T07:41:37.617Z"
7/1/2026, 7:41:37 AM
model
"anthropic/claude-sonnet-5-20260630"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
3470
moderation_latency
(null)
generation_time
17956
tokens_prompt
77
tokens_completion
836
native_tokens_prompt
93
native_tokens_completion
1130
native_tokens_completion_images
(null)
native_tokens_reasoning
57
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"end_turn"
service_tier
"default"
usage
0.011486
router
(null)
provider_responses
0
endpoint_id
"37b16f45-eeae-45f3-9b85-2137f7c6d61b"
id
"msg_01Le47ueFe2sdnGryheqi8uP"
is_byok
false
latency
1717
model_permaslug
"anthropic/claude-sonnet-5-20260630"
provider_name
"Amazon Bedrock"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.17.0; linux; x64))"
http_referer
(null)
request_id
"req-1782891697-WrlHKky1DJO0ItZLLByx"
session_id
(null)
preset_id
(null)
api_type
"completions"
id
"gen-1782891697-kQqzayqPKJ9b7Fh0j011"
upstream_id
"msg_01Le47ueFe2sdnGryheqi8uP"
total_cost
0.011486
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Amazon Bedrock"
response_cache_source_id
(null)
data_region
"global"
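The metadata above gives enough to estimate average output throughput for this run. The following is a minimal sketch, assuming `generation_time` is in milliseconds (17956 lines up with the reported 18.0s duration) and that `native_tokens_completion` is the right token count to divide by; the source states neither explicitly, and the function name is hypothetical.

```python
# Values copied from the run metadata above.
# Assumption: generation_time is in milliseconds (not stated in the source).
GENERATION_TIME_MS = 17956
NATIVE_TOKENS_COMPLETION = 1130


def tokens_per_second(completion_tokens: int, generation_time_ms: int) -> float:
    """Average output rate over the whole generation window (hypothetical helper)."""
    if generation_time_ms <= 0:
        raise ValueError("generation_time_ms must be positive")
    return completion_tokens / (generation_time_ms / 1000)


if __name__ == "__main__":
    rate = tokens_per_second(NATIVE_TOKENS_COMPLETION, GENERATION_TIME_MS)
    print(f"{rate:.2f} tokens/s")
```

Whether the 57 reasoning tokens are included in the completion count is not stated, so the figure is a rough average rather than a precise decode rate.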
Evaluation details
Result     Evaluator                      Details                            Meta Data
51.93%     Matches word count             n/a                                neededClean: false; words: 484
0.00%      Dialogue to Total Word Ratio   Ratio: 66.80%, Deviation: 36.80%   neededClean: false; wordsTotal: 494; wordsDialogue: 330
25.9628%   Overall
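The reported ratio and deviation can be reproduced from the evaluator's own counts (wordsDialogue 330, wordsTotal 494) and the 30% target in the scenario. This is a sketch of the arithmetic only, assuming the deviation is the absolute difference in percentage points; the evaluator's actual tokenization and scoring rules are not shown in the source, and the function names are hypothetical.

```python
# Counts copied from the evaluation details; target taken from the scenario text.
WORDS_DIALOGUE = 330
WORDS_TOTAL = 494
TARGET_RATIO = 0.30


def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Fraction of all words that fall inside dialogue."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return words_dialogue / words_total


def deviation(ratio: float, target: float) -> float:
    """Assumed definition: absolute distance from the target ratio."""
    return abs(ratio - target)


if __name__ == "__main__":
    ratio = dialogue_ratio(WORDS_DIALOGUE, WORDS_TOTAL)
    print(f"Ratio: {ratio * 100:.2f}%")
    print(f"Deviation: {deviation(ratio, TARGET_RATIO) * 100:.2f}%")
```

The Overall figure of 25.9628% is consistent with an unweighted mean of the two evaluator scores (2 × 25.9628 = 51.9256, which rounds to the displayed 51.93%), although the page does not state the aggregation rule. The two evaluators also report different word totals (484 and 494), which suggests they count words differently.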