NC Bench
Run: 2026-07-01-01KWE9PMG8N4GEW6PV6NFWSDM1
Run Details
Model: Claude Sonnet 5
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 15.6s
Cost: $0.010426
Provider: Amazon Bedrock
Evaluation summary
Score | Evaluator
60.28% | Matches word count
0.00% | Dialogue to Total Word Ratio
30.1376% | Overall
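The overall figure is consistent with an unweighted mean of the two evaluator scores. The page does not state how the overall score is aggregated, so the mean below is an assumption, and `overall_score` is a hypothetical helper name. The displayed 60.28% is rounded, which is why the sketch lands on 30.14% rather than exactly 30.1376%.

```python
def overall_score(evaluator_scores):
    """Unweighted mean of per-evaluator scores in percent (assumed aggregation)."""
    return sum(evaluator_scores) / len(evaluator_scores)


# Scores as displayed on the page, already rounded to two decimals.
scores = [60.28, 0.00]
overall = overall_score(scores)

# Agrees with the reported 30.1376% to within display rounding.
print(f"{overall:.2f}%")
```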
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The rain hammered against the window of the van, and Bill traced a greasy finger across the blueprint spread over his knees—a hand-drawn map of the Whitmore residence, complete with a large red circle around the backyard where, in six hours, forty seven-year-olds would descend upon a bounce house and a mountain of wrapped presents. "I'm just saying," Molly said, not looking up from her binoculars, "there's something deeply wrong with us." "There's a lot wrong with us," Bill said. "Be specific." "We're planning to rob a child's birthday party." "We're planning to rob the *presents* at a child's birthday party. The child will be fine. The child will not even remember this happened by Tuesday." "That's not the point." "Then what's the point?" Molly lowered the binoculars and looked at him. "The point, Bill, is that somewhere down the line, we became the kind of people who case a six-year-old's birthday party like it's a bank." "We've cased actual banks. This is significantly less dangerous." "Banks don't have clowns." "This party doesn't have a clown. I checked. It's a magician. Card tricks, that sort of thing." "Oh, well, that changes everything," Molly said. "Suddenly I feel much better about our life choices." Bill folded the blueprint and tossed it onto the dashboard. "Look, the Whitmores have forty guests. Forty guests means forty presents, minimum, most of them expensive because rich people can't just buy a kid a coloring book, they buy him a drone. We slip in during the magic show, everyone's watching some guy pull a rabbit out of a hat, we load the presents into the laundry bag, we're gone in four minutes." "And if someone sees us?" "We're the caterers." "We don't have catering uniforms." "I bought two polo shirts that say 'Whitmore Event Staff' this morning." Molly stared at him. "You had those printed?" "There's a place by the gas station. Twenty minutes, no questions asked." 
"That's either the most impressive or the most concerning thing you've told me all week." "Can it be both?" "It can be both." She sighed and pulled her hood up. "Fine. Four minutes. In and out. No talking to children." "Obviously no talking to children." "No taking the birthday boy's present specifically. That's a line." "I wasn't going to—" "I saw you circle his gift on the list, Bill." "It was a *drone*, Molly. A really nice drone." "We are not stealing the birthday boy's drone." "Fine," Bill grumbled, starting the engine. "But if it's just sitting there unattended, unsupervised, practically begging—" "Bill." "I'm not saying we take it. I'm saying the universe would understand." "The universe," Molly said, "already thinks we're going to hell. Let's not give it extra reasons." Bill grinned and pulled the van out into the rain, the wipers thumping steadily as the Whitmore house came into view, balloons already tied to the mailbox, oblivious to what was coming.
created_at: "2026-07-01T07:36:08.98Z" (7/1/2026, 7:36:08 AM)
model: "anthropic/claude-sonnet-5-20260630"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1460
moderation_latency: (null)
generation_time: 15580
tokens_prompt: 77
tokens_completion: 781
native_tokens_prompt: 93
native_tokens_completion: 1024
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "end_turn"
service_tier: "default"
usage: 0.010426
router: (null)
provider_responses:
  0:
    endpoint_id: "37b16f45-eeae-45f3-9b85-2137f7c6d61b"
    id: "msg_01UMHJr5qRgekQimSVnWV1jK"
    is_byok: false
    latency: 1460
    model_permaslug: "anthropic/claude-sonnet-5-20260630"
    provider_name: "Amazon Bedrock"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.17.0; linux; x64))"
http_referer: (null)
request_id: "req-1782891368-AvZ1qP5stH1aq83SQ3cq"
session_id: (null)
preset_id: (null)
api_type: "completions"
id: "gen-1782891368-bswcdCRfQ0VaMEgTtrri"
upstream_id: "msg_01UMHJr5qRgekQimSVnWV1jK"
total_cost: 0.010426
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
response_cache_source_id: (null)
data_region: "global"
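The timing and token fields above can be cross-checked against the run details. This is a minimal sketch that assumes `generation_time` and `latency` are in milliseconds; the page does not state the unit, but 15580 matches the reported duration of 15.6s under that reading. `tokens_per_second` is a hypothetical helper, not part of any API.

```python
def tokens_per_second(completion_tokens: int, generation_time_ms: int) -> float:
    """Average output rate, assuming the time is given in milliseconds."""
    return completion_tokens / (generation_time_ms / 1000.0)


# Values copied from the metadata above.
native_tokens_completion = 1024
generation_time = 15580  # assumed to be milliseconds

duration_s = generation_time / 1000.0
rate = tokens_per_second(native_tokens_completion, generation_time)

print(f"duration: {duration_s:.1f}s")
print(f"throughput: {rate:.1f} native tokens/s")
```

Under the millisecond assumption this gives a duration of 15.6s, matching the run details, and roughly 65.7 native completion tokens per second.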
Evaluation details
Result | Evaluator | Details | Meta Data
60.28% | Matches word count | n/a | neededClean: false, words: 485
0.00% | Dialogue to Total Word Ratio | Ratio: 71.63%, Deviation: 21.63% | neededClean: false, wordsTotal: 490, wordsDialogue: 351
30.1376% | Overall
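The reported ratio and deviation follow directly from the two word counts: 351 dialogue words out of 490 total, measured against the 50% target in the scenario. The sketch below reproduces that arithmetic and shows one plausible way to count dialogue words, by treating everything between straight double quotes as dialogue. The counting rule is an assumption: the evaluator's actual tokenizer is not shown on this page, and the two evaluators evidently count differently (485 versus 490 words). How the 21.63% deviation maps to a 0.00% score is also not stated, so it is not modeled here. `count_dialogue` and `ratio_and_deviation` are hypothetical names.

```python
import re


def count_dialogue(text):
    """Return (dialogue_words, total_words) using whitespace-separated words.

    Assumption: dialogue is any span enclosed in straight double quotes.
    """
    words_total = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    words_dialogue = sum(len(span.split()) for span in quoted_spans)
    return words_dialogue, words_total


def ratio_and_deviation(words_dialogue, words_total, target=0.5):
    """Dialogue share of all words and its absolute distance from the target."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


# Counts taken from the evaluator metadata above.
ratio, deviation = ratio_and_deviation(351, 490)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 71.63%, Deviation: 21.63%"
```

For example, `count_dialogue('"Hello there," she said. "Go."')` returns `(3, 5)`: three quoted words out of five in total.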