NC Bench
Run: 2026-07-01-01KWE94EQTPR0P9KT6RYWNPFK4
Run Details
Model:
Claude Sonnet 5 (Reasoning, Low)
Test:
Dialogue tags
Scenario:
Write 500 words with 70% dialogue
Duration:
19.9s
Cost:
$0.011146
Provider:
Amazon Bedrock
Evaluation summary
Evaluator: Matches word count
Score: 51.93%
Evaluator: Dialogue to Total Word Ratio
Score: 0.10%
Overall: 26.0123%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The back room of Bill's Pawn & Loan smelled like mothballs and desperation, which Molly figured was appropriate given their current career trajectory. "A kid's birthday party," she said, staring at the blueprint Bill had drawn in crayon. "We've robbed banks, Bill. Actual banks. With vaults." "Banks got security guards. This is a six-year-old's birthday party at a bounce house facility called Jumpin' Jimmy's. Worst case scenario, we get tackled by a golden retriever." "That's not comforting, that's concerning. When did we start planning heists around bounce houses?" "Since the market for stolen bearer bonds dried up. Look, presents, Molly. Fifty kids, fifty presents, all wrapped and unattended in a designated 'gift zone' while the parents drink boxed wine and pretend to enjoy watching children scream for three hours." "You've done recon." "I've done recon." Molly picked up the crayon drawing, turning it sideways like that might help. It didn't. "This says 'candy cannon' with three question marks." "There's a candy cannon, Molly. I don't know what it does. It's load-bearing to the plan somehow, I just haven't figured out how yet." "Great. Solid foundation." She set it down. "What's our cover?" "Balloon artists." "We don't know how to make balloon animals." "Nobody knows how to make balloon animals, Molly, that's the beauty of it. You just twist until it looks like a dog and cry a little when it doesn't." "I'm not crying at a children's party." "You cried at the Fitzgerald heist when the alarm went off." "That was different, I stubbed my toe on a marble bust worth eleven thousand dollars, there were extenuating circumstances." Bill sighed, rubbing his temples like she was the unreasonable one here. "Fine. No crying required. We show up, we're 'Bonzo and Twisty,' we make bad balloon animals as a distraction, and while every kid in a ten-foot radius is watching us fail to construct a giraffe, I slip into the gift zone and load presents into the diaper bag." 
"You're carrying a diaper bag." "It's got a lot of room, Molly. Practical design. Whoever invented that thing was thinking about heists." "They were thinking about diapers, Bill." "Same energy." She dragged a hand down her face. "And if a parent notices fifty presents just, evaporating from a table?" "That's why we take them gradually. One or two at a time. By the time cake gets cut, we're gone, they're short a few Paw Patrol sets, and everybody just assumes some kid's older sibling was a jerk." "That's your whole plan. Blame a fictional jerk sibling." "There's always a jerk sibling, Molly. Statistically. It's a solid alibi." She looked at him for a long moment, then at the crayon map, then back at him. "We're really doing this." "Bonzo and Twisty ride again." "I want it on record that I think the candy cannon is going to be our undoing." "Duly noted. Now, practice your giraffe."
created_at
"2026-07-01T07:26:13.255Z"
7/1/2026, 7:26:13 AM
model
"anthropic/claude-sonnet-5-20260630"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
4651
moderation_latency
(null)
generation_time
19865
tokens_prompt
77
tokens_completion
769
native_tokens_prompt
93
native_tokens_completion
1096
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"end_turn"
service_tier
"default"
usage
0.011146
router
(null)
provider_responses
0
endpoint_id
"37b16f45-eeae-45f3-9b85-2137f7c6d61b"
id
"msg_01KYnJr3h4vsxfA4imbDLAM7"
is_byok
false
latency
3355
model_permaslug
"anthropic/claude-sonnet-5-20260630"
provider_name
"Amazon Bedrock"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.17.0; linux; x64))"
http_referer
(null)
request_id
"req-1782890773-e09JyXW1tXqS5Z6Qfe7r"
session_id
(null)
preset_id
(null)
api_type
"completions"
id
"gen-1782890773-UzZqAQvXDkm8kSbCzegk"
upstream_id
"msg_01KYnJr3h4vsxfA4imbDLAM7"
total_cost
0.011146
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Amazon Bedrock"
response_cache_source_id
(null)
data_region
"global"
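The timing fields in the metadata can be cross-checked against the Duration shown in Run Details. The sketch below assumes that generation_time is expressed in milliseconds (an assumption, though it agrees with the 19.9s Duration) and that native_tokens_completion is the right token count for a throughput estimate. The function names are hypothetical and not part of NC Bench.

```python
# Values copied from the run metadata on this page.
GENERATION_TIME_MS = 19865       # assumed to be milliseconds
NATIVE_TOKENS_COMPLETION = 1096  # provider-side completion token count


def to_seconds(milliseconds: int) -> float:
    """Convert a millisecond duration to seconds."""
    return milliseconds / 1000


def tokens_per_second(tokens: int, milliseconds: int) -> float:
    """Average completion throughput over the whole generation."""
    return tokens / to_seconds(milliseconds)


duration_s = to_seconds(GENERATION_TIME_MS)
tps = tokens_per_second(NATIVE_TOKENS_COMPLETION, GENERATION_TIME_MS)

print(f"Duration: {duration_s:.1f}s")        # Duration: 19.9s
print(f"Throughput: {tps:.1f} tokens/s")
```

This is an average over the full generation time, so it includes the time to first token and is only a rough figure.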
Evaluation details
Evaluator: Matches word count
Result: 51.93%
Details: n/a
Meta Data: neededClean = false, words = 484
Evaluator: Dialogue to Total Word Ratio
Result: 0.10%
Details: Ratio: 81.76%, Deviation: 11.76%
Meta Data: neededClean = false, wordsTotal = 488, wordsDialogue = 399
Overall: 26.0123%
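The ratio and deviation reported for the Dialogue to Total Word Ratio evaluator can be reproduced from the wordsDialogue and wordsTotal values, taking the 70% target from the scenario. The sketch below does that arithmetic. It also includes a hypothetical counter, `count_dialogue_words`, that treats anything inside straight double quotes as dialogue. That counter is an assumption for illustration only: the page does not say how the evaluator tokenizes or detects dialogue, nor how the deviation is mapped to the 0.10% score.

```python
import re

# Values copied from the evaluator metadata on this page.
WORDS_TOTAL = 488
WORDS_DIALOGUE = 399
TARGET_RATIO = 0.70  # from the scenario: "70% dialogue"


def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Share of all words that fall inside dialogue."""
    return words_dialogue / words_total


def count_dialogue_words(text: str) -> tuple[int, int]:
    """Hypothetical counter: text inside straight double quotes is dialogue.

    Returns (dialogue_words, total_words), splitting on whitespace.
    """
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(span.split()) for span in quoted_spans)
    total = len(text.split())
    return dialogue, total


ratio = dialogue_ratio(WORDS_DIALOGUE, WORDS_TOTAL)
deviation = abs(ratio - TARGET_RATIO)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# Ratio: 81.76%, Deviation: 11.76%

sample = '"We rob it," she said. "Tonight."'
print(count_dialogue_words(sample))  # (4, 6)
```

The deviation here is the absolute difference in percentage points between the measured ratio and the target, which matches the figures in the details row.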