NC Bench
Run: 2026-02-24-01KJ8TZA593PYHFFM0FWXEG4AY
Run Details
Model: Claude Opus 4.6 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 22.9s
Cost: $0.022010
Provider: Amazon Bedrock
Evaluation summary
Score | Evaluator
68.10% | Matches word count
0.00% | Dialogue to Total Word Ratio
34.0524% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the hand-drawn map across the kitchen table, smoothing out a wrinkle that bisected the bouncy castle. Bill leaned forward, squinting through his reading glasses.
"So the target is a seven-year-old's birthday party," Molly said. "Princess theme. Thirty-two confirmed guests. Present table is here, next to the face-painting station."
"A princess theme," Bill repeated. "That's good. That's real good. Rich parents love a princess theme. We're talking high-end gifts. LEGOs. Maybe even a bicycle."
"Maybe even two bicycles."
"God, I love this job."
Molly tapped the corner of the map. "Entry point is the backyard gate. Dad leaves it unlocked because he's too busy pretending he knows how to operate the smoker."
"Classic dad move. What about the mom?"
"The mom is our biggest threat. Sandra Whitfield. She organized a color-coded spreadsheet for the seating arrangements at a six-year-old's party, Bill. A spreadsheet. This woman does not miss details."
Bill whistled low through his teeth. "So how do we neutralize Sandra?"
"We don't have to. At exactly two-fifteen, the magician arrives. Sandra will spend twenty minutes arguing with him about whether his rabbit has been properly vaccinated. That's our window."
"Twenty minutes feels tight."
"It's plenty. You go in through the gate. I'll already be inside."
"Wait, already inside how?"
Molly paused and folded her arms. "I RSVP'd."
"You RSVP'd to a child's birthday party."
"As the mother of little Tristan."
"Who the hell is Tristan?"
"Tristan doesn't exist, Bill. That's the beauty of it. There are thirty-two kids. Nobody knows all thirty-two kids. I show up, say Tristan is shy and playing behind the azaleas, and nobody asks questions."
Bill rubbed his chin, visibly impressed. "That is genuinely diabolical. What do I do once I'm through the gate?"
"You'll be dressed as a balloon artist."
"Oh, come on."
"You said you wanted a more active role this time."
"I meant like a lookout. Maybe a getaway driver. Not a balloon artist."
"Can you make a balloon dog?"
"No, Molly, I cannot make a balloon dog."
"You've got three days to learn. YouTube it. While you're twisting balloons and keeping the kids distracted on the west lawn, I load the presents into the wagon."
"What wagon?"
"I'm bringing a little red wagon. Every mom at a kids' party has a little red wagon. It's practically camouflage."
Bill stared at her for a long moment. "And then we just walk out the front with a wagon full of stolen birthday presents."
"We walk out the back. Through the gate. Into the van."
"What if a kid sees us?"
"Bill, they're six and seven years old and they've been eating cake for forty-five minutes. They can barely see their own hands."
He chewed his lip, then slowly nodded. "This is either the smartest plan we've ever had or we're going to prison over a LEGO set."
"Probably both," Molly said, rolling up the map. "Wear comfortable shoes."
created_at: "2026-02-24T22:05:34.261Z" (2/24/2026, 10:05:34 PM)
model: "anthropic/claude-4.6-opus-20260205"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1574
moderation_latency: 151
generation_time: 22695
tokens_prompt: 77
tokens_completion: 852
native_tokens_prompt: 92
native_tokens_completion: 862
native_tokens_completion_images: (null)
native_tokens_reasoning: 66
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.02201
router: (null)
provider_responses:
  0:
    endpoint_id: "a8708fb9-08ea-43bb-a3b1-165c87ef09df"
    id: "692e0003-c243-4e62-b038-260de831e5b7"
    is_byok: false
    latency: 1544
    model_permaslug: "anthropic/claude-4.6-opus-20260205"
    provider_name: "Amazon Bedrock"
    status: 200
api_type: "completions"
id: "gen-1771970734-LgmCMyZtzxDBWaJtOVJH"
upstream_id: "692e0003-c243-4e62-b038-260de831e5b7"
total_cost: 0.02201
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
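The run metadata can be turned into a couple of derived figures. The sketch below is illustrative only: it assumes `generation_time` is in milliseconds (which would be consistent with the 22.9s duration shown in the run details), and the helper names `tokens_per_second` and `blended_cost_per_1k_tokens` are hypothetical, not part of NC Bench or any provider API. The blended cost simply divides the total cost by all native tokens, so it is not a per-direction price.

```python
# Values copied from the run metadata; units are an assumption (milliseconds).
meta = {
    "generation_time": 22695,        # assumed ms
    "native_tokens_prompt": 92,
    "native_tokens_completion": 862,
    "total_cost": 0.02201,           # USD
}

def tokens_per_second(completion_tokens: int, generation_time_ms: int) -> float:
    """Completion tokens generated per second of generation time."""
    return completion_tokens / (generation_time_ms / 1000.0)

def blended_cost_per_1k_tokens(total_cost: float, prompt_tokens: int,
                               completion_tokens: int) -> float:
    """Total cost spread evenly over prompt and completion tokens, per 1,000."""
    return 1000.0 * total_cost / (prompt_tokens + completion_tokens)

tps = tokens_per_second(meta["native_tokens_completion"], meta["generation_time"])
cost_1k = blended_cost_per_1k_tokens(
    meta["total_cost"],
    meta["native_tokens_prompt"],
    meta["native_tokens_completion"],
)
print(f"throughput: {tps:.1f} tokens/s")
print(f"blended cost: ${cost_1k:.4f} per 1k tokens")
```

Whether `native_tokens_completion` includes the 66 reasoning tokens is not stated in the metadata, so the throughput figure should be read as approximate.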
Evaluation details
Result | Evaluator | Details | Meta Data
68.10% | Matches word count | n/a | neededClean: false; words: 486
0.00% | Dialogue to Total Word Ratio | Ratio: 84.57%, Deviation: 14.57% | neededClean: false; wordsTotal: 499; wordsDialogue: 422
34.0524% | Overall
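The dialogue figures above can be reproduced from the reported counts. The sketch below recomputes the ratio and the deviation from the 70% target, and checks that the overall score is consistent with an unweighted mean of the two evaluator scores. That averaging rule is an inference from the numbers, not something the page states. The `count_dialogue_words` helper is hypothetical: it counts words inside straight double quotes, which may differ from how the actual evaluator tokenizes (the two evaluators report 486 and 499 total words for the same response).

```python
import re

def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Dialogue words as a percentage of all words."""
    return 100.0 * words_dialogue / words_total

def deviation(ratio_pct: float, target_pct: float = 70.0) -> float:
    """Absolute distance from the target ratio, in percentage points."""
    return abs(ratio_pct - target_pct)

def count_dialogue_words(text: str) -> tuple[int, int]:
    """Hypothetical counter: (words inside straight double quotes, all words)."""
    quoted = re.findall(r'"([^"]*)"', text)
    words_dialogue = sum(len(q.split()) for q in quoted)
    words_total = len(text.split())
    return words_dialogue, words_total

# Counts taken from the evaluation details.
ratio = dialogue_ratio(422, 499)
dev = deviation(ratio)
print(f"Ratio: {ratio:.2f}%, Deviation: {dev:.2f}%")  # Ratio: 84.57%, Deviation: 14.57%

# Assumption: overall = unweighted mean of the evaluator scores.
overall = (68.10 + 0.00) / 2
print(f"Overall (from rounded scores): {overall:.2f}%")

sample = 'Bill sighed. "Oh, come on." Molly grinned.'
print(count_dialogue_words(sample))  # (3, 7)
```

The mean of the rounded scores is 34.05%, which matches the reported 34.0524% to within rounding of the word-count score.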