NC Bench
Run: 2026-02-24-01KJ8TN930JFZV1NKEJ3DG6FDT
Run Details
Model: Claude Opus 4.6 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 3m 33s
Cost: $0.583060
Provider: Amazon Bedrock
Evaluation summary
Score | Evaluator
100.00% | Matches word count
98.78% | Dialogue to Total Word Ratio
99.3911% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the hand-drawn blueprint across the hood of their rusted Buick, pinning the corners with a half-eaten burrito and three loose cigarettes. The blueprint, rendered in purple crayon on the back of a Denny's placemat, depicted the Hendersons' backyard in meticulous detail. There was the bouncy castle on the east flank. The folding table of juice boxes along the southern perimeter. And there, marked with a crude skull and crossbones, the present table. "Thirty-seven kids," Molly said, tapping the skull. "That means thirty-seven presents, minimum. Some of those little overachievers will bring two. We're looking at Legos, action figures, maybe a bike from the grandparents. Conservative street value, four hundred dollars." Bill leaned over the map, squinting. He was a large man who looked like someone had inflated a regular man with a bicycle pump and forgotten to stop. His balaclava was already pulled on, though the job wasn't until Saturday and it was currently Tuesday afternoon in a Wendy's parking lot. "Four hundred bucks," Bill said. "Molly, we spent three hundred on surveillance equipment. We'll clear a hundred dollars. That's fifty each." "You have to spend money to steal money, Bill. That's the first rule of thieving. Also, I'm taking sixty percent." It was not the first rule of thieving. Bill knew this because he had attended a weekend seminar on professional burglary in Tucson, and the first rule had been about comfortable footwear. But he had learned long ago not to argue with Molly while she was in planning mode. Her left eye did a terrible thing. Molly produced a second document from her jacket pocket, typed and laminated, because she was nothing if not thorough. The timeline broke into four phases. Phase one: infiltration, disguised as a clown duo called Bonkers and Sprinkles. Phase two: distraction, during which Bill would perform a balloon animal routine of sufficient complexity to captivate every child and parent. 
Phase three: extraction, with Molly loading gifts into a modified Radio Flyer wagon and hauling them through a gap in the fence. Phase four simply read: drive like hell. "I'm not being Sprinkles," Bill said. "You're absolutely being Sprinkles." "Why can't I be Bonkers?" "Because Bonkers is the brains. Bonkers runs the exit strategy. Sprinkles makes balloon dogs and falls down a lot. It's physical comedy, Bill. You were born for this." Bill stared at the placemat and considered the choices that had delivered him to this moment. He had robbed a diamond exchange in Antwerp. He had driven a speedboat through Venice with fourteen million in bearer bonds duct-taped to his chest. Now he was arguing about clown names in a parking lot, planning to steal birthday presents from a child named Kayden. "Fine," he said. "But I want the bigger nose. And I'm not doing the unicorn balloon. Last time I nearly passed out." "Deal. But if any kid asks for a unicorn, you smile and you twist." They shook on it, solemnly, as though they were men entering war.
created_at: "2026-02-24T22:00:05.483Z" (2/24/2026, 10:00:05 PM)
model: "anthropic/claude-4.6-opus-20260205"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1465
moderation_latency: 255
generation_time: 213057
tokens_prompt: 77
tokens_completion: 3068
native_tokens_prompt: 92
native_tokens_completion: 23304
native_tokens_completion_images: (null)
native_tokens_reasoning: 2286
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.58306
router: (null)
provider_responses:
  0:
    endpoint_id: "a8708fb9-08ea-43bb-a3b1-165c87ef09df"
    id: "19dcc015-a7f9-4538-89da-a52e95addeb4"
    is_byok: false
    latency: 1431
    model_permaslug: "anthropic/claude-4.6-opus-20260205"
    provider_name: "Amazon Bedrock"
    status: 200
api_type: "completions"
id: "gen-1771970405-rNnHJhQkHBB4fFurhLiV"
upstream_id: "19dcc015-a7f9-4538-89da-a52e95addeb4"
total_cost: 0.58306
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
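The Duration reported under Run Details (3m 33s) appears to correspond to the generation_time value above. The following is a minimal sketch of that conversion, assuming generation_time is expressed in milliseconds (the page does not state the unit). The helper name format_duration is hypothetical and is not part of NC Bench.

```python
def format_duration(milliseconds: int) -> str:
    """Render a millisecond count as 'Xm Ys', rounded to the nearest second."""
    total_seconds = round(milliseconds / 1000)
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes}m {seconds}s"


# Assumption: generation_time from the metadata is in milliseconds.
generation_time = 213057
duration = format_duration(generation_time)
print(duration)
```

Under that assumption the result matches the displayed duration, which suggests the reported duration excludes the separate latency figure.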
Evaluation details
Result | Evaluator | Details | Meta Data
100.00% | Matches word count | n/a | neededClean: false, words: 500
98.78% | Dialogue to Total Word Ratio | Ratio: 30.50%, Deviation: 0.50% | neededClean: false, wordsTotal: 505, wordsDialogue: 154
99.3911% | Overall
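The ratio and deviation figures can be reproduced from the reported word counts. The sketch below assumes the ratio is wordsDialogue divided by wordsTotal, that the deviation is the absolute distance in percentage points from the 30% target, and that the overall score is the plain mean of the two evaluator scores. None of these formulas is stated on the page, and the mapping from deviation to the 98.78% score is not reproduced here because it is not documented.

```python
TARGET_RATIO_PERCENT = 30.0  # from the scenario: "30% dialogue"


def dialogue_ratio_percent(words_dialogue: int, words_total: int) -> float:
    """Share of dialogue words in the total word count, as a percentage."""
    return 100.0 * words_dialogue / words_total


# Values taken from the evaluator metadata above.
ratio = dialogue_ratio_percent(154, 505)
deviation = abs(ratio - TARGET_RATIO_PERCENT)

# Assumption: the overall score is the unweighted mean of the displayed scores.
overall = (100.00 + 98.78) / 2

print(f"ratio={ratio:.2f}% deviation={deviation:.2f}% overall={overall:.2f}%")
```

The rounded ratio and deviation agree with the reported 30.50% and 0.50%. The mean of the displayed scores agrees with the reported overall of 99.3911% to two decimal places; the extra digits presumably come from the unrounded ratio score. The two evaluators also report different totals (500 and 505 words), so they appear to count words differently.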