NC Bench
Run: 2026-02-24-01KJ8VDY6E943Y4P8FP81WVWNF
Run Details
Model: Claude Opus 4.6 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 23.5s
Cost: $0.021185
Provider: Amazon Bedrock
Evaluation summary
Score | Evaluator
99.74% | Matches word count
11.33% | Dialogue to Total Word Ratio
55.5356% | Overall
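The Overall figure is consistent with an unweighted mean of the two evaluator scores. The sketch below checks that arithmetic. The averaging rule is an assumption inferred from the numbers shown, not something this page states, and `overall_score` is a hypothetical helper name.

```python
def overall_score(scores):
    """Unweighted mean of evaluator scores (assumed aggregation rule)."""
    return sum(scores) / len(scores)


# Evaluator scores as displayed in the summary (already rounded to 2 decimals).
displayed_scores = [99.74, 11.33]
reported_overall = 55.5356

mean_of_displayed = overall_score(displayed_scores)

# The displayed scores are rounded, so only approximate agreement is expected.
gap = abs(mean_of_displayed - reported_overall)
print(f"mean of displayed scores: {mean_of_displayed:.4f}")
print(f"within rounding distance of reported overall: {gap < 0.01}")
```

The small gap between the two values would be explained by the evaluator scores carrying more decimal places than the page displays.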
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the hand-drawn map across the hood of the van, smoothing out a crayon smudge with her thumb. Bill leaned in, squinting.
"So the target is here," Molly said, tapping a rectangle labeled BOUNCY CASTLE in wobbly letters. "Mrs. Henderson's backyard. The party kicks off at two. We go in at two fifteen."
"Why two fifteen?" Bill asked.
"Because that's when the magician arrives. Every kid in that yard is going to be staring at a man pulling scarves out of his sleeves. Nobody's watching the gift table."
"And you know this how?"
"I've been conducting surveillance for three days, Bill. I sat in a tree with binoculars. I watched the Henderson woman carry in a sheet cake the size of a coffin. I have committed to this."
Bill scratched the back of his neck. "I just want to say, for the record, that robbing a children's birthday party feels like a new low for us."
"Lower than the time we stole that fundraiser thermometer from the church lawn?"
"That was different. Nobody cried."
"Nobody's going to cry this time either. Kids are resilient. They bounce back. That's what the bouncy castle is for."
"That doesn't even make sense."
"It doesn't have to make sense, Bill. It has to make money. Do you know what parents spend on gifts these days? We're talking high-end electronics, brand-name sneakers, gift cards with actual balances on them. This is a wealthy neighborhood. Little Aiden Henderson is turning eight, and his parents drive matching Range Rovers."
"His name is Aiden?"
"Of course his name is Aiden."
Bill studied the map again. "What's this here, the big circle?"
"That's the above-ground pool. Stay away from it. The filtration system sounds like a jet engine. It'll cover noise for us, but if you fall in you'll blow the whole operation. And you can't swim."
"I can swim."
"Bill, I watched you panic in a puddle."
"It was a deep puddle."
Molly pulled two balaclavas from her jacket and handed one to Bill. They were bright purple with yellow stars on them.
"What are these?" he said, holding his at arm's length.
"Disguises. I got them from the party supply store. We blend in with the theme."
"The theme is outer space?"
"The theme is magic and wonder, Bill. Just put it on."
He stretched the balaclava over his head and looked at his reflection in the van's side mirror. He resembled a sad constellation. "I look ridiculous."
"You look like a professional. Now, I go over the fence on the east side. You come through the garage. We converge on the gift table, load everything into these." She produced two large sacks, also covered in stars. "We're out in ninety seconds. Any questions?"
"What if the magician sees us?"
"Then we walk fast and don't make eye contact. He's being paid twelve dollars an hour. He's not going to be a hero."
Bill sighed, adjusted his balaclava, and started the van.
created_at: "2026-02-24T22:13:33.528Z" (2/24/2026, 10:13:33 PM)
model: "anthropic/claude-4.6-opus-20260205"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1396
moderation_latency: 223
generation_time: 23209
tokens_prompt: 77
tokens_completion: 831
native_tokens_prompt: 92
native_tokens_completion: 829
native_tokens_completion_images: (null)
native_tokens_reasoning: 73
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.021185
router: (null)
provider_responses: 0
endpoint_id: "a8708fb9-08ea-43bb-a3b1-165c87ef09df"
id: "9153005e-e88e-4a49-a515-b604b1097e41"
is_byok: false
latency: 1356
model_permaslug: "anthropic/claude-4.6-opus-20260205"
provider_name: "Amazon Bedrock"
status: 200
api_type: "completions"
id: "gen-1771971213-F30PlAfVGOgrGbgB6rRc"
upstream_id: "9153005e-e88e-4a49-a515-b604b1097e41"
total_cost: 0.021185
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
Evaluation details
Result | Evaluator | Details | Meta Data
99.74% | Matches word count | n/a | neededClean: false, words: 496
11.33% | Dialogue to Total Word Ratio | Ratio: 76.60%, Deviation: 6.60% | neededClean: false, wordsTotal: 500, wordsDialogue: 383
55.5356% | Overall
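The ratio and deviation reported for the Dialogue to Total Word Ratio evaluator can be reproduced from its metadata (wordsDialogue 383, wordsTotal 500) and the 70% target in the scenario. The sketch below does that, and also shows one possible way to count dialogue words in a passage. Counting words inside straight double quotes and treating the deviation as an absolute difference are both assumptions; the benchmark's actual counting rules are not shown on this page, and all function names are hypothetical.

```python
import re

TARGET_RATIO = 0.70  # from the scenario: "70% dialogue"


def count_dialogue_words(text):
    """Count whitespace-separated words inside straight double quotes.

    Assumption: dialogue is delimited by plain " characters.
    """
    quoted_spans = re.findall(r'"([^"]*)"', text)
    return sum(len(span.split()) for span in quoted_spans)


def dialogue_ratio(words_dialogue, words_total):
    """Share of all words that are dialogue words."""
    return words_dialogue / words_total


def deviation_from_target(ratio, target=TARGET_RATIO):
    """Absolute gap between the measured ratio and the target (assumed unsigned)."""
    return abs(ratio - target)


# Small illustration of the counting step on two lines from the response.
sample = 'Bill sighed. "I can swim." "It was a deep puddle."'
sample_dialogue = count_dialogue_words(sample)  # 8 quoted words
sample_total = len(sample.split())              # 10 words overall

# Reproduce the reported figures from the evaluator metadata.
reported_ratio = dialogue_ratio(383, 500)
reported_deviation = deviation_from_target(reported_ratio)
print(f"Ratio: {reported_ratio:.2%}, Deviation: {reported_deviation:.2%}")
# prints "Ratio: 76.60%, Deviation: 6.60%"
```

How a 6.60% deviation maps to the 11.33% score is not stated on this page, so the sketch stops at the deviation. The two evaluators also report different totals (words 496 versus wordsTotal 500), which suggests they tokenize the response differently.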