NC Bench
Model
Claude Opus 4.6
Test
Dialogue tags
Scenario
Write 500 words with 70% dialogue
Run ID
2026-02-08-01KGYZVRNQM5HFNVXV86FYXJ0Y
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the hand-drawn map across the sticky diner table and tapped her finger on a crudely sketched rectangle labeled BOUNCE HOUSE in red crayon. "Alright, Bill, listen up. The party starts at two. That gives us a forty-five minute window while the clown is doing balloon animals in the backyard. Every kid, every parent, every sticky little gremlin will be out there watching a grown man twist a dachshund out of latex."
Bill squinted at the map. "Why is everything drawn in crayon?"
"Because I stole it from the kid's desk. Focus, Bill."
"I'm focused. I just think it's weird that you broke into a child's bedroom to steal a map you then drew yourself in his own crayon."
"Reconnaissance is reconnaissance. Now, the presents will be stacked on a folding table in the dining room. I counted at least fifteen RSVPs on the fridge."
"You were in the kitchen too?"
"I was thorough. Do you want to hear the plan or do you want to audit my methods?"
Bill leaned back and folded his arms. "Fine. Walk me through it."
"You go in through the side gate dressed as a pizza delivery guy. I already ordered six large pepperonis to arrive at two fifteen, so you'll blend right in."
"Where do I get a pizza uniform?"
"You don't need a uniform. Nobody questions a man holding pizza. It's one of the fundamental laws of the universe."
"That's not a law."
"Name one time you've ever questioned a man holding pizza."
Bill opened his mouth, then closed it. "Okay. Continue."
"You walk the pizzas into the kitchen. While everyone's distracted fighting over slices, I come in through the garage. I load the presents into the trunk. We rendezvous at the getaway car and we're out before anyone notices."
"What getaway car?"
"Your mother's Subaru."
"Absolutely not. Last time we used my mother's car she found a ski mask in the glove compartment and I had to tell her I'd joined a winter sports league."
"Does she believe you're athletic?"
"She does not, which is why she's still asking questions. We're using your car."
"My car doesn't have a trunk. It's a hatchback."
"Then we fold down the seats."
"And where do I sit?"
"On the presents, Bill. You sit on the presents like a dragon on a hoard. That's literally your only job in the second half of this operation."
Bill stared at the crayon map for a long moment. A waitress refilled their coffees without making eye contact, as though she had already decided this was not her problem. "What kind of presents are we even talking about?" Bill finally asked. "These are eight-year-olds. It's going to be Legos and gift cards to Target."
"Exactly. Do you know what Legos go for on the secondary market? We're looking at a five-hundred-dollar haul minimum."
"We're monsters."
"We're entrepreneurs. Now eat your pancakes. We've got a party to crash."
created_at
"2026-02-08T16:03:14.342663+00:00"
Sunday, February 8th, 2026 at 4:03:14 PM GMT+00:00
model
"anthropic/claude-4.6-opus-20260205"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1248
moderation_latency
182
generation_time
18609
tokens_prompt
61
tokens_completion
636
native_tokens_prompt
75
native_tokens_completion
725
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.0185
router
(null)
provider_responses
0
id
"edeec9c9-7332-4f24-8c04-b8b44f40e7d6"
status
200
is_byok
false
latency
1244
endpoint_id
"a8708fb9-08ea-43bb-a3b1-165c87ef09df"
provider_name
"Amazon Bedrock"
model_permaslug
"anthropic/claude-4.6-opus-20260205"
api_type
"completions"
id
"gen-1770566574-R9eZiMHjDg9s0dZDMCfw"
upstream_id
"edeec9c9-7332-4f24-8c04-b8b44f40e7d6"
total_cost
0.0185
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Amazon Bedrock"
Evaluation details
Result | Evaluator | Details | Meta Data
99.3769% | Matches word count | n/a | neededClean: false; words: 495
0.0039% | Dialogue to Total Word Ratio | Ratio: 84.24%, Deviation: 14.24% | neededClean: false; wordsTotal: 495; wordsDialogue: 417
49.6904%
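The reported ratio and deviation follow from the word counts: 417 dialogue words out of 495 total is 84.24%, which is 14.24 percentage points above the 70% target in the scenario. The sketch below reproduces that arithmetic and shows one plausible way to obtain the counts from raw text. It is not NC Bench's evaluator: the function names, the whitespace word split, and the rule that dialogue is whatever sits between straight double quotes are all assumptions made for illustration, and it does not attempt to reproduce how the percentage in the Result column is scored.

```python
import re


def ratio_from_counts(words_dialogue: int, words_total: int, target: float = 0.70):
    """Return (ratio, deviation) for the given word counts.

    `target` defaults to the 70% named in the scenario. The deviation is the
    absolute gap between the measured ratio and that target.
    """
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


def count_dialogue_words(text: str):
    """Return (words_dialogue, words_total) for a prose scene.

    Assumption: words are whitespace-separated tokens, and dialogue is any
    span enclosed in straight double quotes. The real evaluator may tokenize
    differently, so its counts need not match this function's.
    """
    quoted_spans = re.findall(r'"(.*?)"', text, flags=re.DOTALL)
    words_dialogue = sum(len(span.split()) for span in quoted_spans)
    words_total = len(text.split())
    return words_dialogue, words_total


# Reproduce the figures reported in the Details column.
ratio, deviation = ratio_from_counts(417, 495)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
```

Running `count_dialogue_words` on the model response and passing the result to `ratio_from_counts` gives a quick sanity check of a run, with the caveat that a different tokenization rule can shift the counts by a few words.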