NC Bench
Run: 2026-07-01-01KWEAA7DFN89117ZJS14T2NES
Run Details
Model: Claude Sonnet 5 (Reasoning, Low)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 18.1s
Cost: $0.011036
Provider: Amazon Bedrock
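The duration and cost above can be combined with the token counts recorded in the run metadata further down (generation_time 18025, native_tokens_prompt 93, native_tokens_completion 1085) to get a rough throughput and blended cost figure. The following is a minimal sketch, assuming generation_time is in milliseconds (it matches the 18.1s duration) and that total_cost is in US dollars; the function names are hypothetical and not part of NC Bench.

```python
# Values copied from this run's metadata.
GENERATION_TIME_MS = 18025        # assumed to be milliseconds
NATIVE_TOKENS_PROMPT = 93
NATIVE_TOKENS_COMPLETION = 1085
TOTAL_COST_USD = 0.011036         # assumed to be US dollars


def tokens_per_second(tokens: int, elapsed_ms: int) -> float:
    """Completion tokens generated per second of generation time."""
    return tokens / (elapsed_ms / 1000)


def cost_per_thousand_tokens(cost_usd: float, tokens: int) -> float:
    """Blended cost per 1,000 tokens (prompt and completion together)."""
    return cost_usd / tokens * 1000


throughput = tokens_per_second(NATIVE_TOKENS_COMPLETION, GENERATION_TIME_MS)
blended = cost_per_thousand_tokens(
    TOTAL_COST_USD, NATIVE_TOKENS_PROMPT + NATIVE_TOKENS_COMPLETION
)

print(f"Throughput: {throughput:.1f} tokens/s")
print(f"Blended cost: ${blended:.5f} per 1k tokens")
```

The blended figure mixes prompt and completion tokens, which providers usually price differently, so it is only useful for comparing runs of the same scenario.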
Evaluation summary
Score | Evaluator | Details Link
86.38% | Matches word count | (details)
0.00% | Dialogue to Total Word Ratio | (details)
43.1902% | Overall |
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The warehouse smelled like motor oil and stale coffee, which suited Molly just fine. She spread the blueprints across the hood of the getaway car—a minivan with "Lisa's Bounce House Rentals" magnetized to the side—and jabbed a finger at the floor plan.
"So the presents are here, in the living room, guarded by a six-year-old's grandmother," Molly said.
Bill leaned over the hood, squinting. "Grandma's the muscle? That's your intel?"
"She's got a hip replacement and a wooden spoon. Don't underestimate her."
"I've robbed casinos, Molly. I think I can handle Nana."
"You said that about the church bake sale, and Sister Agnes broke your nose with a rolling pin."
Bill touched his nose reflexively. "That woman was possessed."
"She was seventy-four and protecting a raffle basket."
He waved this away. "Focus. Walk me through the plan again."
Molly tapped the blueprint. "We go in as the entertainment. I'm the balloon animal lady, you're the magician."
"I don't know any magic tricks."
"You don't need real magic, Bill. You need a top hat and confidence."
"What if a kid asks me to pull a rabbit out of it?"
"Tell them the rabbit's shy. Kids believe anything if you say it seriously enough."
Bill straightened his fake mustache, which was already peeling at one corner. "And once we're in?"
"You keep the kids distracted with terrible card tricks while I load the presents into the party favor bags."
"Party favor bags don't fit a bicycle, Molly."
"Which is why we brought the wagon disguised as a piñata."
Bill stared at her. "We're stuffing a stolen bicycle into a piñata."
"It's not stolen yet. And yes."
"This is the dumbest heist we've ever pulled."
"Worse than the wedding cake job?"
"That cake had actual gold leaf in it. This is a seven-year-old's birthday party."
"Turning eight," Molly corrected. "And there's a PlayStation 5 on that gift table, Bill. Some kid's rich uncle went all out."
Bill's eyes lit up despite himself. "The new one? With the—"
"With everything. Focus."
"Right. Sorry." He cracked his knuckles, suddenly serious. "What's our exit strategy?"
"We say we have another party to get to. Works every time."
"And if someone recognizes us from the news? We did rob a bank last Tuesday."
"In a bounce house rental van? Nobody's going to suspect two children's entertainers of also being wanted felons."
"You'd be surprised what people suspect these days."
Molly rolled up the blueprints and shoved them under her arm. "Just smile, do your terrible card tricks, and let me handle the heavy lifting. Metaphorically. The bicycle's actually pretty heavy."
"What if Grandma catches you?"
"Then I'll offer her a balloon poodle as a peace treaty."
Bill grabbed his top hat from the passenger seat, turning it over in his hands like it might bite him. "For the record, I still think this is beneath us."
"It's a PS5, Bill."
"...I'll get the piñata."
created_at
"2026-07-01T07:46:50.939Z"
7/1/2026, 7:46:50 AM
model
"anthropic/claude-sonnet-5-20260630"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1560
moderation_latency
(null)
generation_time
18025
tokens_prompt
77
tokens_completion
784
native_tokens_prompt
93
native_tokens_completion
1085
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"end_turn"
service_tier
"default"
usage
0.011036
router
(null)
provider_responses
0
endpoint_id
"37b16f45-eeae-45f3-9b85-2137f7c6d61b"
id
"msg_01Ef3zniu7fAVi8V5tfKFMNF"
is_byok
false
latency
1560
model_permaslug
"anthropic/claude-sonnet-5-20260630"
provider_name
"Amazon Bedrock"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.17.0; linux; x64))"
http_referer
(null)
request_id
"req-1782892010-YH1TWNAdrwQzxkrG3S9A"
session_id
(null)
preset_id
(null)
api_type
"completions"
id
"gen-1782892010-YnUDhdaS0iVcBlYbsaIU"
upstream_id
"msg_01Ef3zniu7fAVi8V5tfKFMNF"
total_cost
0.011036
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Amazon Bedrock"
response_cache_source_id
(null)
data_region
"global"
Evaluation details
Result | Evaluator | Details | Meta Data
86.38% | Matches word count | n/a | neededClean: false; words: 489
0.00% | Dialogue to Total Word Ratio | Ratio: 75.30%, Deviation: 25.30% | neededClean: false; wordsTotal: 494; wordsDialogue: 372
43.1902% | Overall | |
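The reported ratio, deviation, and overall score can be reproduced from the counts above. The sketch below is an illustration and not the NC Bench implementation. It assumes the deviation is the absolute distance from the 50% target in the scenario and that the overall score is the unweighted mean of the two evaluator scores, which is consistent with the reported numbers. The mapping from a 25.30% deviation to a 0.00% score, and the formula behind the 86.38% word-count score, are not stated on this page and are not modelled here. The function names are hypothetical.

```python
def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Fraction of all words that fall inside dialogue."""
    return words_dialogue / words_total


def deviation_from_target(ratio: float, target: float = 0.5) -> float:
    """Absolute distance from the target ratio (50% in this scenario)."""
    return abs(ratio - target)


def overall_score(scores: list[float]) -> float:
    """Unweighted mean of evaluator scores (assumed aggregation rule)."""
    return sum(scores) / len(scores)


# Counts from the "Dialogue to Total Word Ratio" evaluator metadata.
ratio = dialogue_ratio(words_dialogue=372, words_total=494)
deviation = deviation_from_target(ratio)

print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# Ratio: 75.30%, Deviation: 25.30%

# Evaluator scores as displayed (rounded to two decimals on the page).
print(f"Overall: {overall_score([86.38, 0.0]):.2f}%")
# Overall: 43.19%
```

The two evaluators count total words differently (489 versus 494), so they presumably tokenize the response in slightly different ways. The displayed overall of 43.1902% suggests the unrounded word-count score is about 86.3804%.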