NC Bench
Run: 2026-07-01-01KWE9JR6QM8QGN586ZH0JNMCC
Run Details
Model: Claude Sonnet 5 (Reasoning, Low)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 18.6s
Cost: $0.011256
Provider: Amazon Bedrock
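As a rough sanity check on the Duration figure, output throughput can be derived from the completion token counts reported in this run's Meta fields (native_tokens_completion 1107, tokens_completion 826, generation_time 18588). The sketch below assumes generation_time is in milliseconds, which is consistent with the 18.6s Duration but is not stated on the page; the function and variable names are illustrative only.

```python
# Values copied from this run's Meta fields.
GENERATION_TIME_MS = 18588        # ASSUMPTION: milliseconds (matches Duration: 18.6s)
NATIVE_TOKENS_COMPLETION = 1107   # provider-native token count
TOKENS_COMPLETION = 826           # normalized token count


def tokens_per_second(tokens: int, elapsed_ms: int) -> float:
    """Average output rate over the whole generation window."""
    if elapsed_ms <= 0:
        raise ValueError("elapsed_ms must be positive")
    return tokens / (elapsed_ms / 1000.0)


native_rate = tokens_per_second(NATIVE_TOKENS_COMPLETION, GENERATION_TIME_MS)
normalized_rate = tokens_per_second(TOKENS_COMPLETION, GENERATION_TIME_MS)

print(f"native tokens/s:     {native_rate:.1f}")
print(f"normalized tokens/s: {normalized_rate:.1f}")
```

This is an average over the full generation time; it does not separate out the initial latency (1852 in the Meta fields), so the steady-state rate may be somewhat higher.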
Evaluation summary
Score      Evaluator                      Details Link
93.65%     Matches word count             (details)
45.21%     Dialogue to Total Word Ratio   (details)
69.4318%   Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The back room of Bill's apartment smelled like stale pizza and desperation, which Molly figured was about right for their current career trajectory. "A kid's birthday party," she said again, just to make sure she'd heard correctly. "A kid's birthday party," Bill confirmed, spreading a hand-drawn map across the table like he was General Patton planning D-Day. "We're stealing presents from children." "We're liberating presents from children who have too many presents already, Molly. Have you seen how these rich kids live? Kid's turning seven and getting a real pony. A pony, Molly." "So we're Robin Hood now." "Robin Hood, but for LEGO sets." Molly rubbed her temples. "How did we even get this intel?" "My cousin Deb does the balloon animals. Says this family, the Wentworths, they go all out. Sixty kids, a bounce house shaped like a castle, and a present table that's basically a small mountain range." "And the plan is what, exactly? We just walk in?" "We walk in dressed as clowns." "Absolutely not." "Face painters?" "Bill." "Okay, hear me out." He pulled out a second sheet of paper, this one with a crude drawing of what might have been a truck, or possibly a very angry rectangle. "We rent a van, we put 'WENTWORTH PARTY RENTALS' on the side, and we drive right up like we're there to collect the tables and chairs after everything's done." "And nobody stops us." "Nobody stops the rental guys, Molly! Nobody even looks at the rental guys! We're invisible! We're ghosts! We're—" "Bill, there will be sixty children there. Sixty children and probably forty parents, all of whom are going to notice two adults loading a bounce house's worth of presents into an unmarked van." "It'll be marked! I just said—" "With a rectangle." "It's shorthand! I'll make it nicer!" Molly stared at the ceiling like it might offer better career advice. "What happened to the diamond job? The one you said was 'basically guaranteed'?" "That fell through." "What happened to the yacht?" "Also fell through. 
Look, security's tight on yachts now, everybody's got cameras—" "And you thought, 'you know what has less security than a yacht? A seven-year-old's birthday party.'" "Statistically, yes!" "There's no statistics, Bill, you made that up." "The point is nobody expects it! That's our whole advantage! Who steals presents from kids?" "Evil people, Bill. Evil people steal presents from kids." "Interesting people, Molly. Complicated people. People doing a bit." She picked up his terrible little map and studied it, mostly to avoid looking at his hopeful, ridiculous face. There was a pony sketched in the corner with the word "DANGER??" beside it. "If we do this," she said slowly, "and I am not saying we're doing this—" "We're doing this." "—we're not taking anything a kid actually asked for by name. No signed jerseys, no specific dolls. Wrapping paper roulette only. We take our chances." Bill's grin spread slow and delighted, like a man who'd just won an argument he didn't deserve to win. "Robin Hood," he said, "but for LEGO sets."
created_at: "2026-07-01T07:34:01.698Z" (7/1/2026, 7:34:01 AM)
model: "anthropic/claude-sonnet-5-20260630"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1852
moderation_latency: (null)
generation_time: 18588
tokens_prompt: 77
tokens_completion: 826
native_tokens_prompt: 93
native_tokens_completion: 1107
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "end_turn"
service_tier: "default"
usage: 0.011256
router: (null)
provider_responses:
  0:
    endpoint_id: "37b16f45-eeae-45f3-9b85-2137f7c6d61b"
    id: "msg_01BskMCdebH3oSBXTtsHMJkK"
    is_byok: false
    latency: 1852
    model_permaslug: "anthropic/claude-sonnet-5-20260630"
    provider_name: "Amazon Bedrock"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.17.0; linux; x64))"
http_referer: (null)
request_id: "req-1782891241-cUtsn24sW8r02vMrWAez"
session_id: (null)
preset_id: (null)
api_type: "completions"
id: "gen-1782891241-uHIy5EmkasqA6pINHPTX"
upstream_id: "msg_01BskMCdebH3oSBXTtsHMJkK"
total_cost: 0.011256
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
response_cache_source_id: (null)
data_region: "global"
Evaluation details
Result     Evaluator                      Details                            Meta Data
93.65%     Matches word count             n/a                                neededClean: false, words: 509
45.21%     Dialogue to Total Word Ratio   Ratio: 66.02%, Deviation: 3.98%    neededClean: false, wordsTotal: 512, wordsDialogue: 338
69.4318%   Overall
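The ratio and deviation figures above can be reproduced from the reported word counts. The sketch below is a minimal recomputation, not the benchmark's own evaluator: it takes wordsDialogue and wordsTotal as given, uses the 70% target from the scenario, and assumes the overall score is the unweighted mean of the two evaluator scores, which agrees with the reported 69.4318% to within rounding of the displayed values. How a 3.98% deviation maps to the 45.21% score is not stated on the page, so that step is not reproduced.

```python
TARGET_DIALOGUE_PCT = 70.0   # from the scenario: "Write 500 words with 70% dialogue"
WORDS_TOTAL = 512            # wordsTotal reported by the evaluator
WORDS_DIALOGUE = 338         # wordsDialogue reported by the evaluator


def dialogue_ratio_pct(dialogue_words: int, total_words: int) -> float:
    """Share of words that are dialogue, as a percentage."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return 100.0 * dialogue_words / total_words


ratio = dialogue_ratio_pct(WORDS_DIALOGUE, WORDS_TOTAL)
deviation = abs(TARGET_DIALOGUE_PCT - ratio)

# ASSUMPTION: overall = unweighted mean of the displayed evaluator scores.
evaluator_scores = [93.65, 45.21]
overall = sum(evaluator_scores) / len(evaluator_scores)

print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
print(f"Overall (assumed mean): {overall:.2f}%")
```

The two evaluators report different totals (509 and 512 words), so they presumably tokenize words slightly differently; the recomputation uses the ratio evaluator's own counts.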