NC Bench
Run: 2026-02-24-01KJ8TZ0VM2T910AF4BCKMVEWZ
Run Details
Model: Claude Opus 4.6 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 41.4s
Cost: $0.075585
Provider: Amazon Bedrock
Evaluation summary
Score | Evaluator | Details Link
0.00% | Matches word count | (details)
95.03% | Dialogue to Total Word Ratio | (details)
47.5132% | Overall |
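The Overall figure appears to be an unweighted mean of the two evaluator scores. The page does not state the aggregation rule, so this is an assumption. The sketch below checks that the reported 47.5132% is consistent with averaging the displayed scores, allowing for the fact that the displayed 95.03% is rounded to two decimals. The function name `unweighted_mean` is hypothetical.

```python
def unweighted_mean(scores):
    """Average a list of percentage scores with equal weight."""
    return sum(scores) / len(scores)


# Scores as displayed in the evaluation summary (rounded to two decimals).
displayed_scores = [0.00, 95.03]
reported_overall = 47.5132

estimate = unweighted_mean(displayed_scores)

# Each displayed score can be off by at most 0.005 due to rounding,
# so the mean of the displayed scores can be off by at most 0.005 too.
# ASSUMPTION: equal weighting; the page does not document the rule.
consistent = abs(estimate - reported_overall) <= 0.005

print(f"mean of displayed scores: {estimate:.4f}%")
print(f"consistent with reported overall: {consistent}")
```

If the evaluators were weighted differently, the same check would fail, which makes it a quick sanity test when comparing runs.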
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the hand-drawn map across the hood of their dented minivan, smoothing out the crayon smudges with her thumb. The blueprint, if you could call it that, had been lifted from the birthday boy's backpack during a daring reconnaissance mission at the local playground. "Right," she said, tapping the map with authority. "The party starts at two. We go in at two-fifteen, during the chaos of musical chairs." Bill squinted at the map and turned it upside down. "Is that a dog or a security guard?" "That's a bouncy castle, Bill." "Of course it is." He nodded slowly, as though this revelation clarified everything. It clarified nothing. "And we're sure about this? Robbing a seven-year-old's birthday party?" "We're not robbing a seven-year-old. We're robbing approximately thirty seven-year-olds. There's a significant difference in terms of haul." Molly pulled a notebook from her jacket pocket and flipped it open. "Based on my research, the average gift at an upper-middle-class birthday party is valued at forty-five dollars. Multiply that by thirty kids, and we're looking at over thirteen hundred dollars in merchandise." "Merchandise," Bill repeated. "You mean Legos and slime kits." "Legos have incredible resale value. Do your homework." Bill leaned against the van and crossed his arms. The van groaned under even this modest pressure. He stared across the suburban street at the Henderson house, where a woman in yoga pants was already tying balloons to the mailbox. "What about the parents?" he asked. "There's going to be parents everywhere." Molly smiled the way she always smiled before explaining the part of the plan she was most proud of. "That's the beauty of it. Parents at children's parties are completely checked out. They're on their phones. They're drinking bad coffee from paper cups. They're dead behind the eyes, Bill. We could wheel a dolly through the living room and they'd assume we were with the entertainment." 
"Speaking of which," Bill said, reaching into the back of the van. He held up two matching purple shirts with the words BALLOON BONANZA stitched across the chest in wobbly iron-on letters. Molly had spent an entire evening crafting those shirts. She was not a skilled seamstress, but she was a skilled liar, and confidence covered a multitude of crooked lettering. "We walk in as the balloon artists," she said. "We set up in the corner near the gift table. I twist a few poodles, you keep lookout. At two-thirty, when they wheel out the cake, every single eyeball in that house turns toward the kitchen. That gives us a ninety-second window to load the presents into our bags and slip out the back door." Bill pulled the purple shirt over his head. It was at least two sizes too small, stretching across his barrel chest like a sausage casing. "One problem," he said. "I don't know how to make balloon animals." Molly stared at him. "You had one job this week, Bill." "I learned the sword. Kids love swords." She pinched the bridge of her nose, breathed deeply, and accepted her fate. "Fine. Swords it is. But if a single child asks you for a giraffe, we're blown." "No giraffes," Bill agreed solemnly. "I'll tell them giraffes aren't real." Molly folded the map, tucked it into her pocket, and climbed into the van. The engine coughed to life like a lifelong smoker on a winter morning. They had two hours, a questionable plan, and absolutely no balloon giraffes. It would have to be enough.
created_at: "2026-02-24T22:05:24.741Z" (2/24/2026, 10:05:24 PM)
model: "anthropic/claude-4.6-opus-20260205"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1747
moderation_latency: 263
generation_time: 40927
tokens_prompt: 77
tokens_completion: 1912
native_tokens_prompt: 92
native_tokens_completion: 3005
native_tokens_completion_images: (null)
native_tokens_reasoning: 1013
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.075585
router: (null)
provider_responses
0
endpoint_id: "a8708fb9-08ea-43bb-a3b1-165c87ef09df"
id: "1982df3b-97ed-4f7e-ae09-1ed8f3341c19"
is_byok: false
latency: 1715
model_permaslug: "anthropic/claude-4.6-opus-20260205"
provider_name: "Amazon Bedrock"
status: 200
api_type: "completions"
id: "gen-1771970724-pICZbiGjF3DTUd8jhiuQ"
upstream_id: "1982df3b-97ed-4f7e-ae09-1ed8f3341c19"
total_cost: 0.075585
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 580
95.03% | Dialogue to Total Word Ratio | Ratio: 48.99%, Deviation: 1.01% | neededClean: false, wordsTotal: 594, wordsDialogue: 291
47.5132% | Overall | |
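The ratio and deviation reported for the Dialogue to Total Word Ratio evaluator follow directly from its metadata counts: 291 dialogue words out of 594 total, measured against the 50% target from the scenario. The sketch below reproduces that arithmetic. The function name `dialogue_ratio` is hypothetical, and the sketch does not attempt to reproduce the 95.03% score, since the page does not say how deviation maps to a score. The two evaluators also report different totals (580 versus 594), which suggests they count words differently; the sketch uses the ratio evaluator's own total.

```python
def dialogue_ratio(words_dialogue, words_total, target=0.5):
    """Return (ratio, deviation) of dialogue words against a target share.

    ASSUMPTION: deviation is the absolute difference between the measured
    ratio and the target; this matches the reported numbers but is not
    documented on the page.
    """
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    ratio = words_dialogue / words_total
    deviation = abs(ratio - target)
    return ratio, deviation


# Counts taken from the evaluator's metadata for this run.
ratio, deviation = dialogue_ratio(words_dialogue=291, words_total=594)

summary = f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}"
print(summary)
```

Formatting both values to two decimal places gives the same string the evaluator shows in its Details column.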