NC Bench
Run: 2026-02-24-01KJ8WJH3WZ2WS0S24WWMYSSNJ
Run Details
Model: Claude Opus 4.6 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 23.1s
Cost: $0.022360
Provider: Amazon Bedrock
Evaluation summary
Score | Evaluator | Details Link
43.38% | Matches word count | (details)
0.08% | Dialogue to Total Word Ratio | (details)
21.7284% | Overall
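The overall figure appears consistent with an unweighted mean of the two evaluator scores. The page does not state how the overall score is aggregated, so the sketch below is an assumption, and `overall_score` is a hypothetical helper name. The small gap between the computed value and the reported 21.7284% would be explained by the displayed per-evaluator scores being rounded to two decimals.

```python
def overall_score(scores):
    """Unweighted mean of evaluator scores (assumed aggregation, not confirmed by the page)."""
    return sum(scores) / len(scores)

# Displayed (rounded) evaluator scores from the summary, in percent.
displayed_scores = [43.38, 0.08]

print(f"{overall_score(displayed_scores):.2f}%")
```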
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the hand-drawn map across the hood of their dented minivan. It was covered in crayon, because she had stolen it from a child, which she felt demonstrated excellent commitment to the theme. "Alright, Bill, listen up. The party starts at two. That gives us a forty-five minute window while every kid in the neighborhood is distracted by a clown named Mr. Bonkers."
"I hate clowns," Bill said.
"Everyone hates clowns. That's not the point. The point is that the presents will be stacked on a folding table in the garage. I counted fourteen kids on the invite list. Fourteen kids means fourteen presents, Bill. Minimum."
"What if some of them are cheap parents who just throw a ten dollar bill in a card?"
"Then we take the ten dollar bills, Bill. We're not philosophers. We're thieves."
Bill scratched the back of his neck and squinted at the map. "Why is there a dragon drawn next to the garage?"
"That's the family dog. A golden retriever named Sergeant Pickles."
"You're telling me we have to get past a dog named Sergeant Pickles."
"He's old. He's fat. He sleeps nineteen hours a day. I've done surveillance."
"You did surveillance on a dog?"
"Three days of it. I sat in this van with binoculars and a notebook. Sergeant Pickles is not a threat. He is, at best, an inconvenience."
Bill folded his arms. "And what about the parents? The mom looked kind of intense when I drove past. She had one of those big planners with color-coded tabs."
"Color-coded tab moms are the most predictable moms, Bill. She will be inside managing the cake situation and making sure nobody has an allergic reaction to face paint. She will not be guarding the garage."
"What's our exit strategy?"
"We load the presents into the van, drive four blocks to the storage unit, and sort through everything. Whatever we don't want, we donate."
"We donate stolen children's birthday presents?"
"We're criminals, Bill, not monsters. Some of those gifts are going to be terrible. Nobody needs another set of bath crayons."
"I liked bath crayons as a kid."
"And look how you turned out."
Bill considered this for a long moment and apparently found no worthy rebuttal. "Fine. What do I wear?"
"I got us matching polos and khaki shorts. We're going in disguised as party dads. You carry a paper plate at all times and look vaguely exhausted. Nobody questions a man holding a paper plate at a children's party."
"That's actually kind of brilliant."
"I know it is. I've been planning this for two weeks. I even bought us name tags. You're Steve. I'm Karen."
"Why do you get to be Karen?"
"Because Karen walks into any room and immediately takes charge and nobody dares stop her. Now get in the van, Steve."
Bill climbed into the passenger seat and buckled his seatbelt, because he was a criminal who still believed in automotive safety. "This is either going to be genius," he said, "or the most embarrassing arrest of our careers."
"Both," Molly said, turning the key. "Definitely both."
created_at: "2026-02-24T22:33:32.548Z" (2/24/2026, 10:33:32 PM)
model: "anthropic/claude-4.6-opus-20260205"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1546
moderation_latency: 209
generation_time: 22814
tokens_prompt: 77
tokens_completion: 898
native_tokens_prompt: 92
native_tokens_completion: 876
native_tokens_completion_images: (null)
native_tokens_reasoning: 104
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.02236
router: (null)
provider_responses:
0:
endpoint_id: "a8708fb9-08ea-43bb-a3b1-165c87ef09df"
id: "ffbf3322-8803-423f-a70a-ab56af36aab2"
is_byok: false
latency: 1521
model_permaslug: "anthropic/claude-4.6-opus-20260205"
provider_name: "Amazon Bedrock"
status: 200
api_type: "completions"
id: "gen-1771972412-EE4SYTA2fDFEZZdwWlzK"
upstream_id: "ffbf3322-8803-423f-a70a-ab56af36aab2"
total_cost: 0.02236
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
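As a rough cross-check of the reported total_cost, the sketch below prices the native token counts at per-million-token rates. The rates of $5 (input) and $25 (output) are assumptions chosen because they reproduce the reported figure; they are not stated anywhere in this run record. The helper name `estimate_cost` is hypothetical, and the sketch also assumes reasoning tokens are already included in native_tokens_completion.

```python
# Assumed prices in USD per million tokens; NOT taken from the run record.
ASSUMED_INPUT_USD_PER_MTOK = 5.00
ASSUMED_OUTPUT_USD_PER_MTOK = 25.00

def estimate_cost(prompt_tokens, completion_tokens,
                  in_rate=ASSUMED_INPUT_USD_PER_MTOK,
                  out_rate=ASSUMED_OUTPUT_USD_PER_MTOK):
    """Estimate request cost in USD from token counts and per-million-token rates."""
    return (prompt_tokens * in_rate + completion_tokens * out_rate) / 1_000_000

# native_tokens_prompt and native_tokens_completion from the metadata above.
cost = estimate_cost(92, 876)
print(f"${cost:.6f}")
```

Under these assumed rates the estimate equals the reported total_cost of 0.02236. Using tokens_prompt and tokens_completion (77 and 898) instead would give a different figure.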
Evaluation details
Result | Evaluator | Details | Meta Data
43.38% | Matches word count | n/a | neededClean: false, words: 517
0.08% | Dialogue to Total Word Ratio | Ratio: 81.96%, Deviation: 11.96% | neededClean: false, wordsTotal: 521, wordsDialogue: 427
21.7284% | Overall
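The ratio and deviation reported for the dialogue evaluator can be reproduced from the word counts in its metadata and the 70% target in the scenario. This is a minimal sketch of that arithmetic only; the function name `dialogue_ratio` is hypothetical, and the sketch does not attempt to reproduce how the evaluator maps the deviation to its 0.08% score, which the page does not describe.

```python
TARGET_RATIO = 0.70  # target share of dialogue words, from the scenario

def dialogue_ratio(words_dialogue, words_total):
    """Fraction of all words that fall inside dialogue."""
    return words_dialogue / words_total

# wordsDialogue and wordsTotal from the evaluator metadata.
ratio = dialogue_ratio(427, 521)
deviation = ratio - TARGET_RATIO

print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
```

The deviation here is the signed difference in percentage points between the measured ratio and the target, so a positive value means the response contained more dialogue than requested.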