NC Bench
Run: 2026-02-24-01KJ8VWQXQ73T1GTC9789FMFBR
Run Details
Model:
Claude Opus 4.6 (Reasoning)
Test:
Dialogue tags
Scenario:
Write 500 words with 70% dialogue
Duration:
22.0s
Cost:
$0.020735
Provider:
Amazon Bedrock
Evaluation summary
Score | Evaluator | Details Link
95.99% | Matches word count | (details)
76.95% | Dialogue to Total Word Ratio | (details)
86.4685% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the hand-drawn map across the hood of the van, smoothing out a wrinkle that cut right through what appeared to be a bouncy castle. Bill leaned over her shoulder, chewing a toothpick with the intensity of a man reviewing battlefield coordinates.
"Alright, the party starts at two," Molly said, tapping the map. "That gives us a narrow window between the arrival of the presents and the arrival of the children. We go in at one forty-five."
"One forty-five," Bill repeated. "During broad daylight. At a house full of balloons. In a residential neighborhood where every parent on the street has the situational awareness of a Secret Service agent."
"That's why we have disguises."
"Molly, last time you said that, I ended up dressed as a nun at a dog show."
"And nobody suspected you."
"A dog bit me."
"That's not suspicion. That's affection." She pulled two folded costumes from the back of the van and held one up.
Bill stared at it for a long time. "That's a clown suit."
"Entertainment-based infiltration. We show up looking like the hired act, walk right through the front door, and locate the present table. According to my source, it'll be in the dining room, east side of the house, under a banner that says Happy Birthday Caden."
"Who is your source?"
"Facebook. The mother posted the entire floor plan in her party planning group. She also posted that she ordered a two-tier dinosaur cake, which is irrelevant but I respect her commitment."
Bill held the clown suit against his body and frowned. "This is a medium."
"You'll stretch it out."
"I'm six foot three, Molly."
"Then you'll be a tall clown. Tall clowns are funnier. Now listen, once we're inside, you distract whatever adults are setting up. Do some juggling, make a balloon animal, I don't care. I'll load the presents into these." She produced two enormous Santa sacks from behind the passenger seat.
"Santa sacks. In July."
"Nobody questions a sack. It's a universal carrying device. Now, estimated haul based on the guest list is between twenty and twenty-five wrapped presents. We're looking at Legos, action figures, probably a couple of gift cards from the lazy parents. Total street value, maybe eight hundred dollars."
"We're risking arrest for eight hundred dollars worth of children's toys."
"We're risking arrest for the principle of a clean job. Also I already promised my nephew a Lego set and I am not paying retail."
Bill rubbed his face with both hands and looked up at the sky as though expecting divine intervention. None came. A bird defecated on the van's windshield with what felt like editorial commentary.
"Fine," he said. "But I'm not making balloon animals."
"You practiced all week."
"The giraffe keeps looking like something inappropriate."
Molly zipped up her clown suit and grinned. "Then make dogs. Everyone loves dogs."
"That nun didn't."
"Forget the nun, Bill. Focus. We've got a birthday party to ruin."
created_at
"2026-02-24T22:21:38.671Z"
2/24/2026, 10:21:38 PM
model
"anthropic/claude-4.6-opus-20260205"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1429
moderation_latency
151
generation_time
21735
tokens_prompt
77
tokens_completion
839
native_tokens_prompt
92
native_tokens_completion
811
native_tokens_completion_images
(null)
native_tokens_reasoning
62
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.020735
router
(null)
provider_responses
0
endpoint_id
"a8708fb9-08ea-43bb-a3b1-165c87ef09df"
id
"52c77f3a-bd10-435f-8873-c639156b687f"
is_byok
false
latency
1394
model_permaslug
"anthropic/claude-4.6-opus-20260205"
provider_name
"Amazon Bedrock"
status
200
api_type
"completions"
id
"gen-1771971698-h55sE2SDle6qZRMHa9c4"
upstream_id
"52c77f3a-bd10-435f-8873-c639156b687f"
total_cost
0.020735
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Amazon Bedrock"
Evaluation details
Result | Evaluator | Details | Meta Data
95.99% | Matches word count | n/a | neededClean: false, words: 492
76.95% | Dialogue to Total Word Ratio | Ratio: 72.29%, Deviation: 2.29% | neededClean: false, wordsTotal: 498, wordsDialogue: 360
86.4685% | Overall
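The reported ratio and deviation can be reproduced from the metadata with simple arithmetic. The sketch below is an illustration under stated assumptions, not NC Bench's evaluator: the helper names (`count_words`, `dialogue_words`, `ratio_and_deviation`) are hypothetical, dialogue is assumed to be any text inside straight double quotes, and words are assumed to be whitespace-separated. The two evaluators report different totals (492 and 498), so the benchmark's actual word-splitting rule may differ. The mapping from a 2.29% deviation to a 76.95% score is not stated on this page and is not reproduced here. The overall figure looks consistent with an unweighted mean of the two evaluator scores, which is also an assumption.

```python
import re


def count_words(text: str) -> int:
    """Count whitespace-separated words (assumed splitting rule)."""
    return len(text.split())


def dialogue_words(text: str) -> int:
    """Count words inside straight double quotes (assumed dialogue rule)."""
    return sum(count_words(span) for span in re.findall(r'"([^"]*)"', text))


def ratio_and_deviation(words_dialogue: int, words_total: int,
                        target_pct: float = 70.0) -> tuple[float, float]:
    """Return the dialogue share in percent and its distance from the target."""
    ratio = 100.0 * words_dialogue / words_total
    return ratio, abs(ratio - target_pct)


# Values taken from the Meta Data column above.
ratio, deviation = ratio_and_deviation(words_dialogue=360, words_total=498)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")

# Assumed aggregation: unweighted mean of the two displayed scores.
overall = (95.99 + 76.95) / 2
print(f"Overall (from rounded scores): {overall:.2f}%")

# Tiny usage example of the hypothetical counters on an excerpt.
sample = '"A dog bit me." "That\'s not suspicion. That\'s affection." She pulled'
print(dialogue_words(sample), count_words(sample))
```

The mean of the rounded scores is 86.47%, which agrees with the reported 86.4685% up to rounding of the displayed per-evaluator scores.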