NC Bench
Run: 2026-02-24-01KJ8TSAZV7NPY1B224WP0MPC7
Run Details
Model: Claude Opus 4.6 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 25.0s
Cost: $0.020085
Provider: Amazon Bedrock
Evaluation summary
Matches word count: 99.74%
Dialogue to Total Word Ratio: 0.00%
Overall: 49.8741%
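The overall figure is consistent with an unweighted mean of the two evaluator scores. The page does not state how the scores are aggregated, so the rule below is an assumption, and `overall` is a hypothetical helper name. A minimal sketch:

```python
# Evaluator scores as displayed on this page (percent, rounded to two decimals).
scores = {
    "Matches word count": 99.74,
    "Dialogue to Total Word Ratio": 0.00,
}


def overall(scores: dict[str, float]) -> float:
    """Unweighted mean of the evaluator scores.

    Assumption: the benchmark averages evaluators with equal weight.
    The page does not document the aggregation rule.
    """
    return sum(scores.values()) / len(scores)


print(f"Overall: {overall(scores):.2f}%")
```

The mean of the rounded inputs agrees with the displayed 49.8741% to two decimal places; the remaining digits presumably come from the unrounded word-count score, which the page does not show.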
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the hand-drawn map across the hood of the van, smoothing out the crayon marks with her palm. She had lifted it from the birthday boy's backpack two days earlier, and it was, without question, the worst piece of intelligence she had ever worked from. "Is that a dog or a security guard?" Bill asked, pointing at a brown blob near what appeared to be the front door. "It's a pony," Molly said. "They hired a pony for rides. Which actually works in our favor because every kid at that party is going to be in the backyard losing their minds over it." Bill rubbed his chin and nodded slowly, as though she had just outlined the invasion of Normandy. He was wearing all black, which would have been more appropriate if the job were happening at midnight instead of two in the afternoon. "So we go in the front while the pony creates a diversion," he said. "The pony isn't creating a diversion. The pony is just being a pony. We're exploiting the natural gravitational pull of a pony on children under ten." "Right. And the presents?" "They'll be on a table in the living room. Every parent drops them off inside because nobody wants cake frosting on wrapped gifts. I confirmed this with three separate mothers at school pickup. They thought I was making conversation." "You're a genius," Bill said. "I know. Now the tricky part is the grandmother. She stays inside during parties because the noise gives her a headache. She parks herself on the couch with a glass of wine and watches game shows." Bill's eyes went wide. "There's a grandmother? You didn't mention a grandmother." "I'm mentioning her now. She's seventy-four, partially deaf, and on her second glass of merlot by one o'clock. She is not a threat." "Grandmothers are always a threat, Molly. My grandmother once chased a raccoon out of her garage with a tennis racket." "Was the raccoon trying to steal a pile of presents wrapped in Paw Patrol paper?" "No, but the principle stands." 
Molly folded the map and tucked it into her jacket. She checked her watch and then looked at Bill with the steady, humorless expression she always wore before a job, which was unfortunate because she had the face of a kindergarten teacher and it didn't land the way she wanted it to. "We walk in carrying a large gift bag. Nobody questions someone arriving at a children's party with a gift. We load the presents into the bag, walk out, and drive away. Total time inside, four minutes." "What's in the gift bag when we walk in?" "A smaller, folded gift bag. For overflow." Bill stared at her. "That's actually brilliant." "Stop sounding surprised. Now put on a normal shirt. You look like you're about to rob a bank." "We are robbing something." "We're robbing children, Bill. There's a dress code. Look approachable." He sighed and climbed into the van to change.
created_at: "2026-02-24T22:02:18.5Z" (2/24/2026, 10:02:18 PM)
model: "anthropic/claude-4.6-opus-20260205"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1581
moderation_latency: 188
generation_time: 24707
tokens_prompt: 77
tokens_completion: 820
native_tokens_prompt: 92
native_tokens_completion: 785
native_tokens_completion_images: (null)
native_tokens_reasoning: 73
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.020085
router: (null)
provider_responses:
  0:
    endpoint_id: "a8708fb9-08ea-43bb-a3b1-165c87ef09df"
    id: "df0758a5-8df4-4e38-886c-46a8c4846fbd"
    is_byok: false
    latency: 1486
    model_permaslug: "anthropic/claude-4.6-opus-20260205"
    provider_name: "Amazon Bedrock"
    status: 200
api_type: "completions"
id: "gen-1771970538-iEFc6bH1gz0WxbsaBByy"
upstream_id: "df0758a5-8df4-4e38-886c-46a8c4846fbd"
total_cost: 0.020085
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
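The timing and token fields above can be combined into a rough throughput figure. The sketch below assumes that `generation_time` and `latency` are in milliseconds (the page gives no units, though 24707 is close to the 25.0s duration shown under Run Details) and that `native_tokens_completion` is the right token count to divide by. The function name is hypothetical.

```python
# Values copied from the metadata on this page.
generation_time_ms = 24707        # assumed to be milliseconds
latency_ms = 1581                 # assumed to be milliseconds
native_tokens_completion = 785


def tokens_per_second(tokens: int, millis: int) -> float:
    """Average completion throughput over the whole generation window."""
    if millis <= 0:
        raise ValueError("millis must be positive")
    return tokens / (millis / 1000)


rate = tokens_per_second(native_tokens_completion, generation_time_ms)
print(f"{rate:.1f} completion tokens per second")
print(f"latency is {latency_ms / generation_time_ms:.1%} of generation time")
```

This is an average over the full request, so it says nothing about how evenly tokens arrived during streaming.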
Evaluation details
Matches word count: 99.74%
  Details: n/a
  neededClean: false
  words: 496
Dialogue to Total Word Ratio: 0.00%
  Details: Ratio: 64.26%, Deviation: 14.26%
  neededClean: false
  wordsTotal: 498
  wordsDialogue: 320
Overall: 49.8741%
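The reported ratio and deviation follow directly from the `wordsDialogue` and `wordsTotal` counts and the 50% target in the scenario. The sketch below reproduces that arithmetic. The `dialogue_words` helper is a hypothetical illustration of how dialogue might be counted (text between straight double quotes, words split on whitespace); the benchmark's actual counting rules and the mapping from deviation to a 0.00% score are not shown on this page.

```python
import re

TARGET_RATIO = 0.50  # from the scenario: "Write 500 words with 50% dialogue"


def count_words(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())


def dialogue_words(text: str) -> int:
    """Count words inside straight double quotes.

    Assumption: this quoting rule is illustrative only, not the
    evaluator's documented method.
    """
    return sum(count_words(span) for span in re.findall(r'"([^"]*)"', text))


def ratio_and_deviation(words_dialogue: int, words_total: int,
                        target: float = TARGET_RATIO) -> tuple[float, float]:
    """Return the dialogue ratio and its absolute distance from the target."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


# Counts taken from the evaluator metadata on this page.
ratio, deviation = ratio_and_deviation(320, 498)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 64.26%, Deviation: 14.26%"
```

Note that the two evaluators report slightly different totals for the same response (496 versus 498 words), which suggests they tokenize differently; a reimplementation would need to match each evaluator's rule separately.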