NC Bench
Model
Claude Opus 4.6
Test
Dialogue tags
Scenario
Write 500 words with 70% dialogue
Run ID
2026-02-08-01KGYY5FQ92B646XB28H3QGE0Q
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the hand-drawn map across the hood of the minivan, smoothing out the crayon marks with her palm. She'd stolen it right off the refrigerator during her reconnaissance mission, which had really just been dropping off a casserole and pretending to be from the neighborhood welcome committee. "Okay, the party starts at two," she said, tapping the crude drawing of a house with a dinosaur on the roof. "That gives us a window of exactly ninety minutes before the little monsters tear into the gift table." Bill squinted at the map. "Is that a dog or a security system?" "That's the birthday boy's self-portrait. Focus, Bill." "I am focused. I'm just saying, the last job you planned, we ended up locked in a Chuck E. Cheese bathroom for four hours." "That was a learning experience, and you know it." Molly pulled out a pair of walkie-talkies from her fanny pack. "Now, here's the plan. You go in as the entertainment." "Absolutely not." "You're going in as the clown, Bill." "I told you after the incident in Tucson, I am never putting on face paint again. That stuff doesn't come off. I looked like a sad mime for three weeks." "Fine, not a clown. A magician then. You've got the cape from when you went through your vampire phase." "It was not a phase. It was an aesthetic choice during a difficult period in my life." "Bill, do you want the presents or not?" He crossed his arms and leaned against the van. "How many kids are we talking?" "The invitation said the whole second-grade class. So roughly twenty-two miniature humans hopped up on cake and fruit punch." "Dear God." "Which is exactly why the distraction has to be big. You do the magic show in the backyard. I slip in through the side door, load up the wagon — yes, I'm bringing a little red wagon because it's thematically appropriate and also very practical — and I wheel the whole haul right out the front while every parent is filming their kid watching you pull a rabbit out of a hat." "I don't have a rabbit." 
"I bought you a rabbit." "When did you buy me a rabbit?" "His name is Gerald, and he's in the glove compartment." Bill opened the glove compartment. A small brown rabbit blinked at him, sitting on top of the vehicle registration. "Molly, there is a rabbit in the glove compartment." "I just said that. Keep up." "What if one of the kids recognizes me? Kids are perceptive. They're like tiny, sticky detectives." "You'll be wearing the cape and a top hat. You'll be unrecognizable." "What kind of presents are we even expecting here? These are seven-year-olds. We're going to end up with a wagon full of Legos and gift cards to Target." Molly smiled the way she always did right before Bill regretted everything. "Billy, do you have any idea what Legos go for on the secondary market?" He looked at Gerald. Gerald looked at him. "Get me the cape."
created_at
"2026-02-08T15:33:36.093351+00:00"
Sunday, February 8th, 2026 at 3:33:36 PM GMT+00:00
model
"anthropic/claude-4.6-opus-20260205"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1514
moderation_latency
157
generation_time
19386
tokens_prompt
61
tokens_completion
641
native_tokens_prompt
75
native_tokens_completion
736
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.018775
router
(null)
provider_responses
0
id
"917f0f3b-cd87-4a06-8658-a42208a26ea1"
status
200
is_byok
false
latency
1514
endpoint_id
"a8708fb9-08ea-43bb-a3b1-165c87ef09df"
provider_name
"Amazon Bedrock"
model_permaslug
"anthropic/claude-4.6-opus-20260205"
api_type
"completions"
id
"gen-1770564796-dIgDAwmhY6DHZBO7VLBX"
upstream_id
"917f0f3b-cd87-4a06-8658-a42208a26ea1"
total_cost
0.018775
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Amazon Bedrock"
Evaluation details
Result | Evaluator | Details | Meta Data
81.2727% | Matches word count | n/a | neededClean: false; words: 512
31.5294% | Dialogue to Total Word Ratio | Ratio: 74.80%, Deviation: 4.80% | neededClean: false; wordsTotal: 512; wordsDialogue: 383
56.4010%
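The dialogue figures above can be checked with a little arithmetic. The sketch below is not NC Bench's evaluator. It assumes that dialogue means any text between straight double quotes and that words are whitespace-separated tokens; the function name `dialogue_ratio` is hypothetical. It then recomputes the reported ratio and deviation from the page's own counts (383 of 512 words, against the 70% target in the scenario) and shows that the final 56.4010% is consistent with the mean of the two evaluator results. How the page derives the individual 81.2727% and 31.5294% scores is not stated, so that part is not reproduced.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio).

    Assumption: dialogue is whatever sits between pairs of straight
    double quotes, and a word is a whitespace-separated token.
    """
    total_words = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


# Small illustration on a fragment in the style of the scene.
sample = '"Focus, Bill." Molly tapped the map. "I am focused."'
sample_counts = dialogue_ratio(sample)

# Recompute the page's reported figures from its own counts.
reported_ratio = 383 / 512               # wordsDialogue / wordsTotal
deviation = reported_ratio - 0.70        # target is 70% dialogue
mean_score = (81.2727 + 31.5294) / 2     # mean of the two evaluator results

print(sample_counts)
print(f"ratio={reported_ratio:.2%} deviation={deviation:.2%} mean={mean_score:.4f}")
```

A real evaluator would likely also need to handle curly quotes and quotations nested inside narration, which this sketch ignores.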