NC Bench
Model: Claude Opus 4.6
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Run ID: 2026-02-08-01KGYYBP41K6P6GWTYN8HR5NVT
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
AI
Molly spread the hand-drawn map across the sticky table at Denny's, knocking over a syrup caddy in the process. "Alright, listen up. The party starts at two. We've got a narrow window between pass-the-parcel and the cake to make our move." Bill squinted at the map, rotating it twice before holding it upside down. "Is that a swimming pool or a dog?" "It's the bouncy castle, Bill. Focus. The bouncy castle is our biggest obstacle. It's positioned right between the back gate and the present table." "Why don't we just go around it?" "Because Mrs. Henderson's rose garden is on one side and she's got thorns that could puncture a tire. I did recon yesterday dressed as a mailwoman." "You don't look anything like a mailwoman." "And yet nobody questioned me. That's how good I am. Now, the presents will be stacked on a folding table near the patio. I counted at least fifteen kids on the Evite." "You hacked into an Evite?" "I didn't hack it, Bill. It was a public link on Facebook. Theresa Gladwell has zero sense of digital privacy. Anyway, fifteen kids means fifteen presents minimum. Some overachievers will bring two. We're looking at maybe twenty gifts." "What do you think we're pulling here? Value-wise?" "Could be huge. These are suburban parents. We're talking high-end Lego sets, maybe a couple of iPads if the grandparents came through. I saw one kid's dad pull up in a Tesla last week. That family alone could account for a hundred-dollar haul." Bill leaned back and crossed his arms. "A hundred dollars, Molly. We spent eighty on disguises." "The disguises are an investment. We'll reuse them." "I'm not wearing this clown suit again. The shoes are killing me and the wig smells like someone died in it." "Someone probably did. I got it from a estate sale. But the clown angle is genius. Nobody suspects the entertainment." "Everyone suspects clowns. Clowns are inherently suspicious." "Not at a children's party. At a children's party, a clown is invisible. 
You're wallpaper with a red nose." "What's your costume again?" "Princess Elsa." "You look nothing like Princess Elsa." "And you look nothing like a clown, and yet here we are. Now, I'll distract the kids with a sing-along while you load the presents into the laundry bags. I've timed the chorus of Let It Go and it gives you exactly three minutes and twelve seconds." "What if a kid sees me?" "Tell them you're Santa's helper doing early pickup. Kids are idiots, Bill. Trusting, beautiful idiots." Bill rubbed his temples and stared at the ceiling. "Last month we robbed a jewelry store. An actual jewelry store. With lasers." "And we got caught, Bill. We got caught because jewelry stores have security. You know what a children's party has? One overwhelmed mom and a labradoodle. This is the perfect crime." "This is the saddest crime." "Sad and perfect aren't mutually exclusive. Now put on your nose. We rehearse in ten."
created_at: "2026-02-08T15:37:03.359481+00:00" (Sunday, February 8th, 2026 at 3:37:03 PM GMT+00:00)
model: "anthropic/claude-4.6-opus-20260205"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1577
moderation_latency: (null)
generation_time: 23732
tokens_prompt: 61
tokens_completion: 657
native_tokens_prompt: 75
native_tokens_completion: 767
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.01955
router: (null)
provider_responses:
  0:
    id: "msg_01MK4F3fdXuth4DAychwTTiF"
    status: 200
    is_byok: false
    latency: 1577
    endpoint_id: "c926c048-563e-44b0-8434-95965bee924c"
    provider_name: "Anthropic"
    model_permaslug: "anthropic/claude-4.6-opus-20260205"
api_type: "completions"
id: "gen-1770564999-XgkoqFWFtwa6Iopy2fnn"
upstream_id: "msg_01MK4F3fdXuth4DAychwTTiF"
total_cost: 0.01955
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Anthropic"
Evaluation details
| Result | Evaluator | Details | Meta Data |
| --- | --- | --- | --- |
| 99.9190% | Matches word count | n/a | neededClean: false; words: 503 |
| 0.0000% | Dialogue to Total Word Ratio | Ratio: 90.26%, Deviation: 20.26% | neededClean: false; wordsTotal: 503; wordsDialogue: 454 |
49.9595%
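The dialogue figures above can be checked with a short calculation. The sketch below is not NC Bench's evaluator, whose tokenization and scoring rules are not shown on this page. It assumes that dialogue is any text between straight double quotes, that words are whitespace-separated tokens, and that the 70% target comes from the scenario description. The function name `dialogue_ratio` and the `sample` string are hypothetical. The final line only checks that the closing 49.9595% figure is consistent with an unweighted mean of the two evaluator results, which is an inference rather than a documented rule.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for a passage.

    Assumption: dialogue is the text between straight double quotes and
    words are whitespace-separated tokens. The real evaluator may differ.
    """
    total = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue / total if total else 0.0
    return dialogue, total, ratio


# Hypothetical two-speaker sample, only to exercise the function.
sample = 'Bill sighed. "This is the saddest crime." Molly grinned. "Sad and perfect."'
print(dialogue_ratio(sample))

# Figures reported in the evaluation table.
WORDS_DIALOGUE = 454
WORDS_TOTAL = 503
TARGET_PCT = 70.0  # from the scenario "Write 500 words with 70% dialogue"

ratio_pct = round(100 * WORDS_DIALOGUE / WORDS_TOTAL, 2)
deviation_pct = round(abs(100 * WORDS_DIALOGUE / WORDS_TOTAL - TARGET_PCT), 2)
print(ratio_pct, deviation_pct)  # 90.26 20.26, matching the Details column

# Inference: the last figure on the page equals the mean of the two results.
overall_pct = (99.9190 + 0.0) / 2
print(overall_pct)
```

Under these assumptions the reported ratio and deviation are reproduced exactly from `wordsDialogue` and `wordsTotal`. The response overshoots the dialogue target by about 20 percentage points, which is what the 0.0000% result reflects.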