NC Bench
Run: 2026-07-01-01KWEADKEC8CKV60SGH8JXDTBG
Run Details
Model:
Claude Sonnet 5 (Reasoning)
Test:
Dialogue tags
Scenario:
Write 500 words with 50% dialogue
Duration:
16.7s
Cost:
$0.011896
Provider:
Amazon Bedrock
Evaluation summary
Score      Evaluator
90.48%     Matches word count
0.00%      Dialogue to Total Word Ratio
45.2419%   Overall
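The overall figure is consistent with an unweighted mean of the two evaluator scores. The page does not state the aggregation rule, so the averaging below is an assumption, and `overall_score` is a hypothetical helper name used only for illustration.

```python
def overall_score(evaluator_scores: list[float]) -> float:
    """Unweighted mean of evaluator scores, in percent (assumed aggregation rule)."""
    if not evaluator_scores:
        raise ValueError("at least one evaluator score is required")
    return sum(evaluator_scores) / len(evaluator_scores)


# Scores as displayed in the summary: word count, then dialogue ratio.
overall = overall_score([90.48, 0.00])
print(f"{overall:.2f}%")  # 45.24%
```

The reported 45.2419% differs slightly from 45.24%, which suggests the word-count score is stored with more precision than the two decimals shown.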
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
# The Great Birthday Heist Molly spread the blueprints across the kitchen table, knocking over a coffee mug in the process. Bill caught it just before it soaked the plans. "Careful with the schematics," Molly snapped. "They're floor plans for a six-year-old's birthday party, Molly. Not the Louvre." "Every heist deserves respect, Bill. That's rule number one." Bill sighed and pulled up a chair. "Fine. Walk me through it." "The party starts at two. Bouncy castle goes up at noon. By one-thirty, there'll be a table absolutely loaded with presents—we're talking Nintendo Switches, remote control cars, probably some overpriced Lego set that costs more than my rent." "And we just... take them?" "We take them," Molly confirmed, tapping the blueprint with authority. "In and out in ninety seconds. Nobody even notices until the cake comes out." Bill leaned back, arms crossed. "You realize this is a birthday party for children. There will be parents. Possibly a clown." "Clowns are the least of our problems." "What's the actual plan, Molly? Not the dramatic version. The real one." Molly grinned, pulling out a folded costume from a duffel bag. "We dress as party entertainers. I'll be a magician. You'll be my lovely assistant." "Absolutely not." "It's a top hat and a cape, Bill, not a tutu." "I refuse to be anyone's lovely assistant." "Fine, you're the magician. I'll wear the cape." Bill grabbed the costume, inspecting it with visible disdain. "This smells like mothballs." "It's vintage. Adds authenticity." "It adds a rash, probably." "Focus, Bill. Once we're inside, I create a distraction—some card tricks, maybe pull a rabbit from a hat—" "Do we have a rabbit?" "We have a ferret. Close enough." Bill stared at her. "A ferret is not close enough to a rabbit." "Kids won't know the difference. They're six. They think dinosaurs still exist." "Dinosaurs did exist, Molly." "Not the point!" She smacked the table. 
"While everyone's distracted by the ferret—rabbit, whatever—you grab the present table and wheel it out the back." "Wheel it out? On what, a cart?" "I borrowed a shopping cart from the grocery store." "You stole a shopping cart to steal birthday presents." "When you say it like that, it sounds bad." "It sounds exactly like what it is." Molly ignored him, rolling up the blueprints with satisfaction. "This is foolproof, Bill. Foolproof." "Last week's plan was foolproof too. We ended up in a kiddie pool." "That was a tactical retreat." "You cried." "I was laughing so hard tears came out! Completely different thing." Bill rubbed his temples, already regretting every choice that led him here. "And if a parent notices two grown adults stealing an entire table of presents?" Molly shrugged, tossing him the cape. "Then we run. Really, really fast." "That's the backup plan? Running?" "It's a classic for a reason, Bill." He sighed, pulling the cape over his shoulders. "For the record, I hate this plan." "You hate every plan until it works." "They never work." Molly smiled, grabbing the ferret's cage. "There's a first time for everything."
created_at
"2026-07-01T07:48:41.559Z"
7/1/2026, 7:48:41 AM
model
"anthropic/claude-sonnet-5-20260630"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1603
moderation_latency
(null)
generation_time
16626
tokens_prompt
77
tokens_completion
848
native_tokens_prompt
93
native_tokens_completion
1171
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"end_turn"
service_tier
"default"
usage
0.011896
router
(null)
provider_responses
0
endpoint_id
"37b16f45-eeae-45f3-9b85-2137f7c6d61b"
id
"msg_01Mq6yvbZVLf53KLhbU1BeQv"
is_byok
false
latency
1601
model_permaslug
"anthropic/claude-sonnet-5-20260630"
provider_name
"Amazon Bedrock"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.17.0; linux; x64))"
http_referer
(null)
request_id
"req-1782892121-CF1VFP6Swsdkie7keHXi"
session_id
(null)
preset_id
(null)
api_type
"completions"
id
"gen-1782892121-9FsxA5Ush35koC7LEtSy"
upstream_id
"msg_01Mq6yvbZVLf53KLhbU1BeQv"
total_cost
0.011896
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Amazon Bedrock"
response_cache_source_id
(null)
data_region
"global"
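As a rough reading of the metadata above, output throughput can be estimated from `native_tokens_completion` and `generation_time`. This sketch assumes `generation_time` is in milliseconds (16626 is close to the 16.7s duration shown in the run details) and that `native_tokens_completion` counts the provider's output tokens; `tokens_per_second` is a hypothetical helper, not part of any NC Bench API.

```python
def tokens_per_second(completion_tokens: int, generation_time_ms: int) -> float:
    """Output tokens per second, assuming generation time is given in milliseconds."""
    if generation_time_ms <= 0:
        raise ValueError("generation_time_ms must be positive")
    return completion_tokens / (generation_time_ms / 1000)


# Values copied from the metadata: native_tokens_completion and generation_time.
throughput = tokens_per_second(1171, 16626)
print(f"{throughput:.1f} tokens/s")
```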
Evaluation details
Result     Evaluator                      Details                            Meta Data
90.48%     Matches word count             n/a                                neededClean: false; words: 510
0.00%      Dialogue to Total Word Ratio   Ratio: 76.61%, Deviation: 26.61%   neededClean: false; wordsTotal: 513; wordsDialogue: 393
45.2419%   Overall
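The reported ratio and deviation follow directly from the word counts: 393 dialogue words out of 513 total, compared against the 50% target named in the scenario. The sketch below reproduces that arithmetic and also shows one plausible way to count dialogue words. The evaluator's actual counting method is not shown on this page, so treating text inside straight double quotes as dialogue is an assumption, and both function names are hypothetical.

```python
import re


def count_dialogue_words(text: str) -> tuple[int, int]:
    """Return (dialogue_words, total_words).

    Assumption: dialogue is any text enclosed in straight double quotes,
    and words are whitespace-separated tokens.
    """
    total_words = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    return dialogue_words, total_words


def ratio_and_deviation(dialogue_words: int, total_words: int,
                        target_ratio: float = 0.5) -> tuple[float, float]:
    """Dialogue share of all words, and its absolute distance from the target."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    ratio = dialogue_words / total_words
    return ratio, abs(ratio - target_ratio)


# Counts taken from the evaluator metadata (wordsDialogue, wordsTotal).
ratio, deviation = ratio_and_deviation(393, 513)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# Ratio: 76.61%, Deviation: 26.61%
```

Running `count_dialogue_words` on the model response may not return exactly 393 and 513, since the evaluator's tokenization rules are unknown.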