NC Bench
Run: 2026-07-01-01KWE9DR6A6FBYKG4427Y1H84Q
Run Details
Model:
Claude Sonnet 5 (Reasoning)
Test:
Dialogue tags
Scenario:
Write 500 words with 50% dialogue
Duration:
19.3s
Cost:
$0.012096
Provider:
Amazon Bedrock
Evaluation summary
Score | Evaluator | Details Link
100.00% | Matches word count | (details)
60.42% | Dialogue to Total Word Ratio | (details)
80.2083% | Overall |
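The Overall figure appears consistent with an unweighted mean of the two evaluator scores. This is an inference from the displayed numbers, not a documented NC Bench formula. A minimal Python sketch, where `overall_score` is a hypothetical helper and the ratio score is taken at its displayed precision of 60.42%:

```python
def overall_score(scores):
    """Unweighted mean of evaluator scores, in percent.

    Assumption: the aggregation is a plain average. The source page
    does not state how the Overall score is computed.
    """
    if not scores:
        raise ValueError("at least one evaluator score is required")
    return sum(scores) / len(scores)


# Scores as displayed in the evaluation summary, in percent.
displayed = [100.00, 60.42]
print(f"{overall_score(displayed):.2f}%")  # prints 80.21%
```

The page shows 80.2083% rather than 80.21%, which suggests the unrounded ratio score is closer to 60.4167% and that 60.42% is a display rounding.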
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The van reeked of stale coffee and desperation, which Bill insisted was the smell of professionalism. Molly disagreed, mostly because she was the one who'd spilled the coffee three days ago and never cleaned it up. She sat cross-legged on a milk crate, studying a hand-drawn map of a suburban backyard like it was the Louvre floor plan. "Bill," she said, tapping the paper. "This says 'bouncy castle' right next to 'guard dog.' Which one are we more worried about?" "Depends," Bill said, adjusting a fake mustache that had already peeled off twice. "Is the dog inflatable?" "It's a Pomeranian named Duchess." "Then I fear nothing." Molly rubbed her temples. This was supposed to be simple. A seven-year-old's birthday party, forty presents, half of them still in shopping bags because the parents were too rich to bother wrapping. Easiest job of their careers, if Bill would just focus. "The plan," she said, "is we go in as entertainers. I'm a face painter. You're a clown." "I refuse to be a clown." "You already own the shoes." "Those are orthopedic." "They're floppy, Bill. They squeak when you walk." He crossed his arms, wounded. "That's a medical condition." Molly ignored him and unrolled a second sheet, this one covered in arrows and a suspicious amount of glitter. "Once we're inside, I distract the kids with balloon animals. You slip into the gift tent and load the presents into the trash bags marked 'recycling.' Nobody questions recycling." "And if someone asks why the clown is stealing gifts?" "You say it's a magic trick." "That's not a trick, that's a felony." "Same thing, different lighting." Bill sighed, the long, theatrical sigh of a man who had done many crimes but somehow always ended up the one wearing face paint. "What if there's a magician already booked? Real ones get territorial." "Then you improvise. Do the thing where you make a coin disappear." "I don't know that trick." "Then make a *present* disappear. Practice makes perfect." 
Molly reached into the glittery bag and pulled out a rainbow wig, tossing it at him. He caught it against his chest like it had personally insulted him. "I look ridiculous," he muttered, pulling it on anyway. "You look delightful. Very trustworthy. No parent suspects a man in a rainbow wig." "They should. That's exactly the kind of man they should suspect." "Bill, focus. Forty presents. Twenty minutes. We're in, we're out, we're rich in Lego sets and scented markers." "We are adults." "We are *entrepreneurs*." He looked at the map again, at the little stick figure labeled DUCHESS with a menacing scribble beside it. "Fine. But if that dog so much as sneezes near me, I'm out." "Deal." Molly zipped her balloon kit shut with finality. "Now go squeak into character, clown." Bill grumbled the whole way to the van door, shoes squeaking like a tiny orchestra of regret, muttering something about unionizing thieves for better working conditions. Molly just smiled, already counting presents she hadn't stolen yet.
created_at
"2026-07-01T07:31:17.843Z"
7/1/2026, 7:31:17 AM
model
"anthropic/claude-sonnet-5-20260630"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
3396
moderation_latency
(null)
generation_time
19225
tokens_prompt
77
tokens_completion
869
native_tokens_prompt
93
native_tokens_completion
1191
native_tokens_completion_images
(null)
native_tokens_reasoning
69
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"end_turn"
service_tier
"default"
usage
0.012096
router
(null)
provider_responses
0
endpoint_id
"37b16f45-eeae-45f3-9b85-2137f7c6d61b"
id
"msg_01DYdUqrs3NR6LNMxQpJrF3R"
is_byok
false
latency
1839
model_permaslug
"anthropic/claude-sonnet-5-20260630"
provider_name
"Amazon Bedrock"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.17.0; linux; x64))"
http_referer
(null)
request_id
"req-1782891077-rl5OD2OVMPuLGPmv9uxb"
session_id
(null)
preset_id
(null)
api_type
"completions"
id
"gen-1782891077-vvByBpRd5B6LWuWAB8hP"
upstream_id
"msg_01DYdUqrs3NR6LNMxQpJrF3R"
total_cost
0.012096
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Amazon Bedrock"
response_cache_source_id
(null)
data_region
"global"
Evaluation details
Result | Evaluator | Details | Meta Data
100.00% | Matches word count | n/a | neededClean: false, words: 500
60.42% | Dialogue to Total Word Ratio | Ratio: 46.83%, Deviation: 3.17% | neededClean: false, wordsTotal: 504, wordsDialogue: 236
80.2083% | Overall | |
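The reported ratio and deviation follow from the metadata counts: 236 dialogue words out of 504 total, measured against the 50% target in the scenario. The sketch below reproduces that arithmetic in Python. The function names are hypothetical, and `count_dialogue_words` assumes dialogue means whitespace-separated words inside straight double quotes, which may not match the evaluator's actual rule. The mapping from a 3.17% deviation to the 60.42% score is not shown on the page, so it is not attempted here.

```python
import re


def dialogue_ratio(words_dialogue, words_total):
    """Fraction of all words that are dialogue."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return words_dialogue / words_total


def count_dialogue_words(text):
    """Count words inside straight double quotes.

    Assumption: this quoting rule is illustrative only; the real
    evaluator's tokenization is not described in the source.
    """
    quoted_spans = re.findall(r'"([^"]*)"', text)
    return sum(len(span.split()) for span in quoted_spans)


# Counts taken from the evaluation metadata.
ratio = dialogue_ratio(236, 504)
deviation = abs(0.50 - ratio)  # distance from the 50% target
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints Ratio: 46.83%, Deviation: 3.17%
```

Note that the two evaluators report different totals for the same response (500 words versus 504), so they presumably count words differently. A reimplementation would need to match each evaluator's own counting rule to reproduce the scores exactly.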