NC Bench
Run: 2026-07-01-01KWEAD1KGM9WC1TRAPSD03WYR
Run Details
Model:
Claude Sonnet 5
Test:
Dialogue tags
Scenario:
Write 500 words with 30% dialogue
Duration:
18.0s
Cost:
$0.010866
Provider:
Amazon Bedrock
Evaluation summary
Score | Evaluator
98.71% | Matches word count
0.00% | Dialogue to Total Word Ratio
49.3562% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The lockup smelled like motor oil and stale donuts, which was exactly the kind of place Bill liked to plan a heist. He had the blueprints spread across the hood of the getaway van—a minivan, actually, stolen from a suburban cul-de-sac and still smelling faintly of goldfish crackers. "A six-year-old's birthday party," Molly said, staring at the paper like it might bite her. "Bill. We rob banks. We rob art museums. We do not rob six-year-olds." "Think bigger, Mol." Bill tapped the blueprint with a greasy finger. "This ain't just any six-year-old. This is Tucker Whitfield the Third. His daddy owns half the shipping containers in this city. You know what rich people do for their kids' birthdays now? They don't just buy presents. They buy *experiences.* Ponies. Magicians. A bounce house shaped like a castle, for God's sake." "So we're stealing a bounce house?" "We're stealing the *presents,* Molly, keep up. You know what kind of loot gets piled up at a party like this? Watches. Gold-plated Legos. I heard the kid's getting an actual mini Ferrari he can drive around the yard." Molly crossed her arms. "And how exactly do two grown adults waltz into a kid's birthday party without every helicopter parent in a four-mile radius calling the cops?" Bill grinned, the same grin he got right before every terrible idea he'd ever had. He reached into a duffel bag and pulled out two costumes—a fuzzy pink rabbit and something resembling a purple dinosaur with a suspicious bulge where the tail should be. "Entertainment," he said. "I already called the party planner posing as 'Big Top Talent Agency.' Told her the real clowns canceled. She was thrilled. Said and I quote, 'thank God, the kids were about to riot.'" Molly picked up the dinosaur costume between two fingers like it was radioactive. "There's a zipper pocket in the tail." "For the loot. Obviously." "Bill, this is insane, even for us." "Mol, when have I ever steered us wrong?" "The Denver casino. The wax museum. 
That one time with the emu—" "Okay, statistically speaking, mostly wrong. But this one's different! Nobody suspects a six-foot rabbit of grand larceny. We do a couple magic tricks, juggle some balloons, and while everyone's watching us pull scarves out of our sleeves, we're pulling gift bags into our costumes. In and out in twenty minutes, we vanish before cake, and the kid gets to blame it on 'the entertainment' running off with the loot bags. Genius." Molly held the rabbit head up to her face, studying its glassy, slightly deranged eyes. "If I get caught wearing this thing, I'm blaming you." "Deal." Bill zipped the dinosaur suit's tail experimentally. "Now come on, Fluffy. We've got a princess bounce castle to rob." She sighed, already regretting every choice that had led her here. "For the record, if there's a clown at this party too, I'm out." "There's always a clown, Molly. There's always a clown."
created_at
"2026-07-01T07:48:23.292Z"
7/1/2026, 7:48:23 AM
model
"anthropic/claude-sonnet-5-20260630"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1536
moderation_latency
(null)
generation_time
17991
tokens_prompt
77
tokens_completion
757
native_tokens_prompt
93
native_tokens_completion
1068
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"end_turn"
service_tier
"default"
usage
0.010866
router
(null)
provider_responses
0
endpoint_id
"37b16f45-eeae-45f3-9b85-2137f7c6d61b"
id
"msg_01Y45EipHcUCFtJ4WomNhCLp"
is_byok
false
latency
1536
model_permaslug
"anthropic/claude-sonnet-5-20260630"
provider_name
"Amazon Bedrock"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.17.0; linux; x64))"
http_referer
(null)
request_id
"req-1782892103-ZwTPaegvx37EESoeGfBy"
session_id
(null)
preset_id
(null)
api_type
"completions"
id
"gen-1782892103-k5oAOvjE3F7ai4QJzNZi"
upstream_id
"msg_01Y45EipHcUCFtJ4WomNhCLp"
total_cost
0.010866
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Amazon Bedrock"
response_cache_source_id
(null)
data_region
"global"
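As a rough cross-check of the metadata above, the sketch below derives output throughput from `generation_time` and `native_tokens_completion`, and tries to reproduce the reported cost. The page does not state any pricing, so the two per-million-token rates are assumptions, chosen only because they happen to reproduce the reported total. Treat them as hypothetical, not as the provider's published prices.

```python
# Figures copied from the run metadata on this page.
generation_time_ms = 17991
native_tokens_prompt = 93
native_tokens_completion = 1068
reported_cost_usd = 0.010866

# Output throughput over the generation window (generation_time is in milliseconds).
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

# ASSUMPTION: hypothetical rates in USD per million native tokens.
# They are not stated anywhere in the source; they are picked because
# they reproduce the reported cost.
ASSUMED_INPUT_USD_PER_MTOK = 2.00
ASSUMED_OUTPUT_USD_PER_MTOK = 10.00

estimated_cost_usd = (
    native_tokens_prompt * ASSUMED_INPUT_USD_PER_MTOK
    + native_tokens_completion * ASSUMED_OUTPUT_USD_PER_MTOK
) / 1_000_000

print(f"throughput: {tokens_per_second:.1f} tokens/s")
print(f"estimated cost: ${estimated_cost_usd:.6f} (reported: ${reported_cost_usd:.6f})")
```

Under these assumed rates the estimate matches the reported figure, which suggests billing is based on the native token counts (93 and 1068) rather than the normalized ones (77 and 757), though the page itself does not confirm this.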
Evaluation details
Result | Evaluator | Details | Meta Data
98.71% | Matches word count | n/a | neededClean: false, words: 494
0.00% | Dialogue to Total Word Ratio | Ratio: 66.93%, Deviation: 36.93% | neededClean: false, wordsTotal: 505, wordsDialogue: 338
49.3562% | Overall
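The ratio and deviation reported for the dialogue evaluator can be reproduced from its own word counts, as sketched below. The 30% target comes from the scenario. The two evaluators report different totals (494 and 505 words), so this sketch uses the ratio evaluator's own `wordsTotal`. The averaging step is an assumption: the page does not say how the overall score is computed, but an unweighted mean of the two evaluator scores is consistent with the reported 49.3562%, up to rounding of the displayed 98.71%. How each evaluator maps its measurement to a score is not shown on the page and is not modeled here.

```python
# Figures copied from the evaluation details on this page.
TARGET_RATIO = 0.30        # "30% dialogue" from the scenario
words_total = 505          # wordsTotal
words_dialogue = 338       # wordsDialogue

# Share of words inside dialogue, and its distance from the target.
ratio = words_dialogue / words_total
deviation = abs(ratio - TARGET_RATIO)

# ASSUMPTION: the overall score is the unweighted mean of the evaluator scores.
word_count_score = 98.71   # as displayed (rounded)
dialogue_score = 0.00
overall = (word_count_score + dialogue_score) / 2

print(f"ratio: {ratio:.2%}, deviation: {deviation:.2%}")
print(f"overall (assumed mean): {overall:.2f}%")
```

The response put roughly two thirds of its words in dialogue, more than double the requested share, which is why the dialogue evaluator scored it 0.00% even though the word count was close to target.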