NC Bench
Run: 2026-02-24-01KJ8VNAV7JWT49M1YMF1HHGBN
Run Details
Model: Claude Opus 4.6 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 20.9s
Cost: $0.021110
Provider: Amazon Bedrock
Evaluation summary
Score       Evaluator
100.00%     Matches word count
0.23%       Dialogue to Total Word Ratio
50.1131%    Overall
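The page does not say how the Overall score is derived from the two evaluator scores. A minimal sketch, assuming it is an unweighted mean (an assumption, not something the report confirms), checks whether the displayed figures are consistent with that reading. The variable names are illustrative only.

```python
from statistics import mean

# Per-evaluator scores as displayed in the summary (percent, rounded for display).
evaluator_scores = {
    "Matches word count": 100.00,
    "Dialogue to Total Word Ratio": 0.23,
}
reported_overall = 50.1131

# ASSUMPTION: Overall is the unweighted mean of the evaluator scores.
estimated_overall = mean(list(evaluator_scores.values()))

# 0.23 is shown rounded to two decimals (up to 0.005 off), so the mean of
# two scores can differ from the reported value by up to 0.0025.
rounding_tolerance = 0.005 / 2
is_consistent = abs(estimated_overall - reported_overall) <= rounding_tolerance
```

Under this assumption the displayed scores agree with the reported Overall value to within display rounding. That makes the mean a plausible reading, but it does not prove the site uses it.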
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the hand-drawn map across the hood of the van, smoothing the crayon wrinkles with her palm. Someone had drawn a dinosaur in the corner. She turned the map sideways and squinted.
"Okay, so the bouncy castle is here," she said, tapping a purple blob. "That's our primary obstacle."
Bill leaned over her shoulder and frowned. "That's a dog."
"It's not a dog, Bill. I did recon. I sat on a bench across the street for two hours with binoculars."
"That must have looked completely normal."
"The point is," Molly continued, ignoring him, "the presents are being kept in the garage. The mother started stacking them there around noon. We're talking a significant haul. I counted at least thirty wrapped boxes."
Bill pulled a celery stick from his jacket pocket and bit into it thoughtfully. He always ate celery when he was thinking. It was one of the many things about him that Molly found deeply irritating. "What kind of security are we dealing with?" he asked.
"One father. He's on grill duty, so he'll be distracted by hot dogs. One mother, but she's running the show, so she'll be everywhere. That's a problem. We also have a grandmother who appears to be stationary near the gift table, but I believe she's asleep."
"What about the kids?"
Molly hesitated. She pulled a second sheet of paper from her pocket. This one had been folded into something that might have once been an airplane. "There are fourteen children confirmed. Ages five through eight. They are chaotic, unpredictable, and hopped up on sheet cake. We do not engage."
Bill crunched his celery. "I once saw a five-year-old bite through a leather wallet."
"Exactly. We avoid them entirely. Now here's the plan. At precisely three fifteen, the magician arrives."
"There's a magician?"
"There's always a magician, Bill. Every child within a fifty-foot radius will be drawn to the backyard. That gives us a window of approximately twelve minutes to access the garage through the side door, load the presents into the van, and disappear."
Bill stared at the map for a long time. A bird landed on the van's side mirror and watched them both with what felt like judgment. "Molly, these are children's presents," he said slowly. "We're going to end up with a van full of Legos and craft kits."
"Last year the Henderson kid got a PlayStation."
Bill straightened up immediately. "What time did you say the magician arrives?"
"Three fifteen. Sharp."
"And the grandmother. You're sure she's asleep?"
"She hasn't moved in four hours, Bill. She's either asleep or something much worse, and either way she's not a factor."
Bill tossed the celery stick into a bush and cracked his knuckles with the gravity of a man preparing for the job of a lifetime. Molly folded the map and tucked it inside her jacket.
"One more thing," Bill said. "If a kid sees us, what's the exit strategy?"
Molly opened the van door and smiled. "We run."
created_at: "2026-02-24T22:17:35.856Z" (2/24/2026, 10:17:35 PM)
model: "anthropic/claude-4.6-opus-20260205"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1360
moderation_latency: 143
generation_time: 20664
tokens_prompt: 77
tokens_completion: 844
native_tokens_prompt: 92
native_tokens_completion: 826
native_tokens_completion_images: (null)
native_tokens_reasoning: 75
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.02111
router: (null)
provider_responses:
  0:
    endpoint_id: "a8708fb9-08ea-43bb-a3b1-165c87ef09df"
    id: "f122f3c2-e63e-4287-be8a-ddbf2df61ec4"
    is_byok: false
    latency: 1324
    model_permaslug: "anthropic/claude-4.6-opus-20260205"
    provider_name: "Amazon Bedrock"
    status: 200
api_type: "completions"
id: "gen-1771971455-lgS1JLGayecLLRyfVFu1"
upstream_id: "f122f3c2-e63e-4287-be8a-ddbf2df61ec4"
total_cost: 0.02111
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
Evaluation details
Evaluator: Matches word count
Result: 100.00%
Details: n/a
Meta Data: neededClean: false, words: 499
Evaluator: Dialogue to Total Word Ratio
Result: 0.23%
Details: Ratio: 61.03%, Deviation: 11.03%
Meta Data: neededClean: false, wordsTotal: 503, wordsDialogue: 307
Overall: 50.1131%
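The ratio and deviation reported for the dialogue evaluator follow directly from its metadata: 307 dialogue words out of 503 total words, measured against the 50% target in the scenario. The evaluator's actual implementation is not shown on this page, so what follows is only a sketch under stated assumptions: dialogue is any text inside straight double quotes, and words are whitespace-separated tokens. The function names `dialogue_ratio` and `deviation_from_target` are hypothetical.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for a prose passage.

    ASSUMPTION: dialogue is text enclosed in straight double quotes and a
    word is any whitespace-separated token. The real evaluator may differ.
    """
    total_words = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


def deviation_from_target(ratio: float, target: float = 0.5) -> float:
    """Absolute distance between the measured ratio and the target ratio."""
    return abs(ratio - target)


# Reproduce the reported figures from the evaluator metadata.
reported_ratio = 307 / 503
reported_deviation = deviation_from_target(reported_ratio)
print(f"Ratio: {reported_ratio:.2%}, Deviation: {reported_deviation:.2%}")

# Small usage example on a made-up sentence.
sample_counts = dialogue_ratio('She said, "Run now," and left.')
```

The two evaluators report different totals (499 words versus 503), which suggests they tokenize the response differently. A simple whitespace split like the one above will not necessarily match either count.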