NC Bench
Run: 2026-07-01-01KWEA6VW85HJMCMTY6ETKATWT
Run Details
Model: Claude Sonnet 5 (Reasoning, Low)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 19.1s
Cost: $0.012116
Provider: Amazon Bedrock
Evaluation summary
Matches word count: 43.38%
Dialogue to Total Word Ratio: 9.23%
Overall: 26.3065%
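The overall score is close to the unweighted mean of the two evaluator scores. The page does not state the aggregation rule, so the averaging below is an assumption, and the small gap from the reported 26.3065% may come from rounding in the displayed component scores. A minimal Python sketch with hypothetical variable names:

```python
# Evaluator scores as displayed in the summary (percent).
component_scores = {
    "Matches word count": 43.38,
    "Dialogue to Total Word Ratio": 9.23,
}

# Assumption: the overall score is a plain, unweighted mean of the evaluators.
overall = sum(component_scores.values()) / len(component_scores)

REPORTED_OVERALL = 26.3065  # value shown on the page
print(f"mean of displayed scores: {overall:.3f}%")
print(f"difference from reported: {abs(overall - REPORTED_OVERALL):.4f} points")
```

If the site weights evaluators differently, only the `overall` line would need to change.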
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The back room of Bill's apartment smelled like stale pizza and desperation, which felt about right for what they were about to do. "So let me get this straight," Molly said, tapping a photograph of a birthday cake shaped like a dinosaur. "You want to rob a six-year-old's birthday party." "Not rob," Bill said, offended. "Liberate. Those presents are gonna get destroyed by nine a.m. anyway. Wrapping paper everywhere, kids screaming, some little monster crying because he got socks. We're doing everyone a favor." "A favor. Right. And how exactly do two grown adults just waltz into a kid's birthday party without every parent in a fifty-foot radius calling the cops?" Bill grinned and pulled out two costumes from a duffel bag. One was a slightly moth-eaten dinosaur suit. The other was a clown costume with a wig so red it looked radioactive. Molly stared at them like they'd personally insulted her mother. "Absolutely not." "Come on, Moll, it's perfect! Kids love clowns and dinosaurs!" "Kids are also *terrified* of clowns and dinosaurs, Bill. Have you never seen a horror movie?" "That's exactly why nobody will question us! They'll just think we're the world's worst party entertainment!" Molly rubbed her temples like she was trying to physically massage away the migraine forming there. "Okay. Fine. Let's say I put on the dinosaur suit—which, by the way, smells like it died in 1987—how do we actually get the presents out without thirty kids and their parents noticing an entire dinosaur waddling off with a mountain of gift-wrapped boxes?" "Simple." Bill held up a finger like he was about to reveal the secret of the universe. "Piñata distraction." "A piñata distraction." "You smash the piñata, candy goes everywhere, every kid in that backyard turns into a feral raccoon fighting over Skittles. Total chaos. Nobody's watching the present table. That's when I make my move." "You. Mr. Clown. Casually strolling out with an armful of presents." "Exactly." "In front of forty parents." 
"They'll be distracted by the candy stampede!" "Bill. They will not be so distracted by a *piñata* that they don't notice a full-grown man in a clown suit stealing birthday gifts." Bill considered this for a moment, scratching his chin. "Okay, new plan. What if the dinosaur steals the presents instead?" "Why would that be less suspicious?" "Because dinosaurs are extinct, Molly. Nobody expects a dinosaur to commit grand theft." Molly stared at him for a long, silent moment, the kind of silence that usually preceded either violence or hysterical laughter. She chose the latter, collapsing onto the couch, wheezing. "We are going to prison," she finally said, wiping her eyes. "We're going to prison because of a dinosaur costume and a piñata." "Or," Bill said, holding up the clown wig triumphantly, "we're going to be rich, and mildly traumatize an entire neighborhood of children. Either way, it's a win." Molly grabbed the dinosaur suit, sighing the sigh of a woman who had clearly given up trying to have a normal life a long time ago. "Fine. But if I get recognized, I'm blaming you." "Deal."
created_at: "2026-07-01T07:45:00.821Z" (7/1/2026, 7:45:00 AM)
model: "anthropic/claude-sonnet-5-20260630"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 4411
moderation_latency: (null)
generation_time: 19107
tokens_prompt: 77
tokens_completion: 884
native_tokens_prompt: 93
native_tokens_completion: 1193
native_tokens_completion_images: (null)
native_tokens_reasoning: 56
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "end_turn"
service_tier: "default"
usage: 0.012116
router: (null)
provider_responses:
0:
endpoint_id: "37b16f45-eeae-45f3-9b85-2137f7c6d61b"
id: "msg_018d7fJ6drd5DanR8edRXxMt"
is_byok: false
latency: 2952
model_permaslug: "anthropic/claude-sonnet-5-20260630"
provider_name: "Amazon Bedrock"
status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.17.0; linux; x64))"
http_referer: (null)
request_id: "req-1782891900-XnivwWdeZYtJFNvBYz8C"
session_id: (null)
preset_id: (null)
api_type: "completions"
id: "gen-1782891900-YP3ffrm7DnTO7PFghyKx"
upstream_id: "msg_018d7fJ6drd5DanR8edRXxMt"
total_cost: 0.012116
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
response_cache_source_id: (null)
data_region: "global"
Evaluation details
Matches word count: 43.38%
Details: n/a
Meta Data: neededClean: false, words: 517

Dialogue to Total Word Ratio: 9.23%
Details: Ratio: 63.10%, Deviation: 6.90%
Meta Data: neededClean: false, wordsTotal: 523, wordsDialogue: 330

Overall: 26.3065%
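The reported ratio and deviation can be reproduced from the two word counts, taking the 70% target from the scenario. The Python sketch below is illustrative only: `dialogue_stats` and `count_dialogue_words` are hypothetical names, and the quote-matching rule is an assumption about how dialogue words might be counted, not the evaluator's actual implementation. How the deviation maps to the 9.23% score is not stated on the page and is not modelled here.

```python
import re

TARGET_RATIO = 70.0  # percent, from the scenario "Write 500 words with 70% dialogue"

def dialogue_stats(words_dialogue: int, words_total: int,
                   target: float = TARGET_RATIO) -> tuple[float, float]:
    """Return (dialogue ratio in percent, absolute deviation from target in points)."""
    ratio = 100.0 * words_dialogue / words_total
    return ratio, abs(target - ratio)

def count_dialogue_words(text: str) -> tuple[int, int]:
    """Hypothetical counter: words inside straight double quotes vs. all words.

    Assumes whitespace-separated words and balanced straight quotes.
    """
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(span.split()) for span in quoted_spans)
    total = len(text.split())
    return dialogue, total

# Counts taken from the evaluator metadata (wordsDialogue, wordsTotal).
ratio, deviation = dialogue_stats(330, 523)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
# prints "Ratio: 63.10%, Deviation: 6.90%"
```

Note that the two evaluators report different totals (517 and 523 words), which suggests they tokenize the response slightly differently; the sketch uses the ratio evaluator's own total.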