NC Bench
Run: 2026-07-01-01KWE9XZSRG3SXK891QQ0Y210N
Run Details
Model:
Claude Sonnet 5
Test:
Dialogue tags
Scenario:
Write 500 words with 50% dialogue
Duration:
15.8s
Cost:
$0.010466
Provider:
Amazon Bedrock
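As a rough cross-check of the Duration and Cost figures above, the sketch below recomputes them from the native token counts and generation_time listed in the metadata further down this page. The per-million-token rates are hypothetical: they are not stated anywhere on this page and were chosen only because they happen to reproduce the reported total. Treating generation_time as milliseconds is likewise an assumption, as are the function names.

```python
# Hypothetical rates in USD per million tokens. The page does not state
# pricing; these values are assumptions that reproduce the reported total.
INPUT_USD_PER_MTOK = 2.00
OUTPUT_USD_PER_MTOK = 10.00


def estimate_cost(prompt_tokens: int, completion_tokens: int,
                  input_rate: float = INPUT_USD_PER_MTOK,
                  output_rate: float = OUTPUT_USD_PER_MTOK) -> float:
    """Estimate request cost in USD from token counts and per-million rates."""
    return (prompt_tokens * input_rate + completion_tokens * output_rate) / 1_000_000


def tokens_per_second(completion_tokens: int, generation_time_ms: int) -> float:
    """Average output throughput, assuming generation_time is in milliseconds."""
    return completion_tokens / (generation_time_ms / 1000)


# native_tokens_prompt = 93, native_tokens_completion = 1028, generation_time = 15740
print(round(estimate_cost(93, 1028), 6))         # 0.010466
print(round(tokens_per_second(1028, 15740), 1))  # 65.3
```

If the real rates differ, only the two constants need to change; the arithmetic stays the same.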
Evaluation summary
Score      Evaluator
68.10%     Matches word count
0.20%      Dialogue to Total Word Ratio
34.1518%   Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The rain hammered against the windshield as Molly traced her finger along a hand-drawn map spread across the dashboard. Bill sat beside her, chewing on a stick of licorice like it owed him money. "So let me get this straight," Bill said, not looking up from his candy. "We're robbing a six-year-old's birthday party." "We're not robbing the six-year-old," Molly said. "We're robbing the presents. There's a difference." "Is there?" "Bill. There is a mountain of unopened gifts sitting in that house. Bicycles. Video game consoles. I heard the kid's uncle got him an actual drone." "A drone." Bill sat up a little straighter. "For a six-year-old?" "Rich people don't think, they just buy." "Fine. So what's the plan? We can't just walk in the front door dressed like this." He gestured at his black turtleneck, which made him look less like a party guest and more like a man who'd wandered off the set of a heist movie, which, in fairness, he had. "We're not walking in the front door. We're going in as the entertainment." Bill blinked. "The entertainment." "I called the party planner yesterday, pretending to be from Balloon Kingdom. Told her their clown canceled. She was thrilled. Apparently the actual clown got food poisoning." "You poisoned a clown?" "I did not poison the clown. The universe poisoned the clown. I just took advantage of it." Bill considered this with the seriousness of a man weighing a great moral dilemma, then shrugged and popped another piece of licorice in his mouth. "Okay. So I'm the clown." "You're the clown. I'm your assistant. While you're twisting balloon animals and humiliating yourself in front of forty screaming children, I slip into the gift room and load everything into the utility bag." "Why do I have to be the clown?" "Because I can't do a balloon dog to save my life, and you took that mime class in college." "That was one semester." "You still remember how to walk against the wind, Bill. That's a skill. Use it." 
He sighed, the long, put-upon sigh of a man who had made poor life choices and was now facing the consequences of one particularly poor one. "And if someone notices the presents are gone?" "They won't. Not until the party's over. Sugar-high kids and day-drinking parents don't count inventory." "You've thought this through." "I think everything through. That's why I'm the brains." "You're the brains. I'm the clown. Feels like an uneven split." "You get thirty percent." "Thirty—Molly, that drone is worth more than my car." "Then don't crash your car." She folded the map, tucked it into her jacket, and reached into the back seat for the red rubber nose sitting on top of a folded polka-dot costume. She held it out to him like a peace offering, or possibly a grenade. Bill stared at it a long moment before taking it. "If this goes wrong," he said, sliding it onto his nose, "I'm blaming you at the trial." "There won't be a trial," Molly said, starting the car. "There will be a drone."
created_at
"2026-07-01T07:40:09.925Z"
7/1/2026, 7:40:09 AM
model
"anthropic/claude-sonnet-5-20260630"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1686
moderation_latency
(null)
generation_time
15740
tokens_prompt
77
tokens_completion
791
native_tokens_prompt
93
native_tokens_completion
1028
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"end_turn"
service_tier
"default"
usage
0.010466
router
(null)
provider_responses
0
endpoint_id
"37b16f45-eeae-45f3-9b85-2137f7c6d61b"
id
"msg_0112DQoDewLX42tqtNaM8Peb"
is_byok
false
latency
1686
model_permaslug
"anthropic/claude-sonnet-5-20260630"
provider_name
"Amazon Bedrock"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.17.0; linux; x64))"
http_referer
(null)
request_id
"req-1782891609-4hMTp0a4J7JGgc2vxKNu"
session_id
(null)
preset_id
(null)
api_type
"completions"
id
"gen-1782891609-4yUwLsngERwOabxdQFUM"
upstream_id
"msg_0112DQoDewLX42tqtNaM8Peb"
total_cost
0.010466
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Amazon Bedrock"
response_cache_source_id
(null)
data_region
"global"
Evaluation details
Result     Evaluator                      Details                            Meta Data
68.10%     Matches word count             n/a                                neededClean: false, words: 514
0.20%      Dialogue to Total Word Ratio   Ratio: 61.14%, Deviation: 11.14%   neededClean: false, wordsTotal: 525, wordsDialogue: 321
34.1518%   Overall
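The ratio and deviation reported for the Dialogue to Total Word Ratio evaluator can be reproduced from its wordsDialogue and wordsTotal values. The sketch below is an illustration only: the evaluator's actual tokenization and scoring function are not shown on this page, so the quote-matching word counter, the function names, and the 50% target default (taken from the scenario) are assumptions. The Overall figure is also consistent with an unweighted mean of the two evaluator scores, although the page does not say how it is computed.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio).

    Assumes dialogue is whatever sits between straight double quotes and
    that words are whitespace-separated; the real evaluator may differ.
    """
    total = len(text.split())
    quoted = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(q.split()) for q in quoted)
    ratio = dialogue / total if total else 0.0
    return dialogue, total, ratio


def deviation_from_target(ratio: float, target: float = 0.5) -> float:
    """Absolute distance between the measured ratio and the target ratio."""
    return abs(ratio - target)


# Figures reported on this page: wordsDialogue = 321, wordsTotal = 525.
ratio = 321 / 525
print(f"{ratio:.2%}")                         # 61.14%
print(f"{deviation_from_target(ratio):.2%}")  # 11.14%

# Unweighted mean of the two evaluator scores (an assumed aggregation).
print(f"{(68.10 + 0.20) / 2:.2f}")            # 34.15
```

For example, `dialogue_ratio('"Hi there," she said.')` counts 2 dialogue words out of 4 total under these assumptions, giving a ratio of 0.5.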