NC Bench
Model
Claude Opus 4.6
Test
Dialogue tags
Scenario
Write 500 words with 50% dialogue
Run ID
2026-02-08-01KGYYZWWH6BG7WZP93DET3AWM
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the hand-drawn map across the hood of the van, using a half-eaten corn dog and a stapler to pin down the corners. The crayon markings were meticulous, if somewhat artistic. She had used glitter glue for the entry points. "The party starts at two," she said, tapping a purple star near what appeared to be a bouncy castle. "That gives us exactly one hour before the sugar crash hits and the parents start loading kids into minivans." Bill squinted at the map, turning his head sideways. "Is that a dog or a security guard?" "That's Mrs. Henderson. She runs the HOA. Basically the same thing." "And what's this rectangle with teeth?" "The bouncy castle, Bill. Focus." Bill straightened up and crossed his arms. He was wearing all black, which would have been more impressive if it weren't ninety degrees outside and his balaclava wasn't rolled up on his head like a beanie. Sweat trickled down his temple and into his ear. "Walk me through the present situation again," he said. Molly pulled out a folded piece of notebook paper from her back pocket. She had intercepted the party invitation from a mailbox three streets over, which she maintained was reconnaissance and not a federal crime. "The birthday boy is turning seven. His name is Caleb. Based on the guest list, we're looking at approximately twenty-two gifts, conservatively wrapped. Caleb's parents are loaded, Bill. We're talking the good stuff. LEGOs. Remote control cars. Maybe even one of those tablets with the rubber cases." "I do need a new tablet," Bill admitted. "Exactly. Now here's the plan. You go in through the back gate dressed as the entertainment." Bill's face fell. "What entertainment?" "I hired a clown for the party three weeks ago and then canceled him this morning." "Molly, I'm not going in as a clown." "You're going in as a clown." "I have a documented fear of clowns." "Which means you'll be very convincing. Method acting, Bill." 
He rubbed his face with both hands and let out a long breath that sounded like a balloon deflating, which Molly felt was already very on theme. "Fine. So I distract the kids. Then what?" "Then I back the van up to the gift table during the chaos, load everything in, and we're gone before anyone notices." "What about Mrs. Henderson?" Molly smiled and reached into the van, producing a Tupperware container. She popped the lid to reveal a dozen brownies arranged in a neat grid. "Loaded with espresso. One bite and she'll be reorganizing her entire garage instead of watching the yard." Bill picked up a brownie, sniffed it, and put it back. "This is either the dumbest thing we've ever done or the greatest." "Those two things have never been mutually exclusive for us." He couldn't argue with that. He never could. He pulled the balaclava down over his face, realized he couldn't breathe, and pulled it back up. "Get the clown nose," he said. "Let's do this."
created_at
"2026-02-08T15:48:03.519906+00:00"
Sunday, February 8th, 2026 at 3:48:03 PM GMT+00:00
model
"anthropic/claude-4.6-opus-20260205"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1706
moderation_latency
(null)
generation_time
21159
tokens_prompt
61
tokens_completion
647
native_tokens_prompt
75
native_tokens_completion
736
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.018775
router
(null)
provider_responses
0
id
"msg_019otniN4D9TwSvvvLhudrxT"
status
200
is_byok
false
latency
1706
endpoint_id
"c926c048-563e-44b0-8434-95965bee924c"
provider_name
"Anthropic"
model_permaslug
"anthropic/claude-4.6-opus-20260205"
api_type
"completions"
id
"gen-1770565661-cf6vFIYuOg4pR4bmhgC0"
upstream_id
"msg_019otniN4D9TwSvvvLhudrxT"
total_cost
0.018775
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Anthropic"
Evaluation details
Result | Evaluator | Details | Meta Data
99.9840% | Matches word count | n/a | neededClean: false; words: 502
88.0749% | Dialogue to Total Word Ratio | Ratio: 51.59%, Deviation: 1.59% | neededClean: false; wordsTotal: 502; wordsDialogue: 259
94.0295%
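The "Dialogue to Total Word Ratio" details can be reproduced from the reported counts. The sketch below is illustrative only: the function names are hypothetical, the 50% target is taken from the scenario text, and counting dialogue as whitespace-separated words inside straight double quotes is an assumption about how such an evaluator might work, not NC Bench's actual implementation. How the ratio and deviation map to the percentage score in the Result column is not stated on this page, so it is not modelled here.

```python
import re


def count_words(text):
    """Count whitespace-separated words (assumed tokenization)."""
    return len(text.split())


def count_dialogue_words(text):
    """Count words inside straight double quotes (assumed dialogue rule)."""
    spans = re.findall(r'"([^"]*)"', text)
    return sum(count_words(span) for span in spans)


def dialogue_ratio(words_dialogue, words_total):
    """Dialogue share of the total word count, as a percentage."""
    return 100 * words_dialogue / words_total


def deviation(ratio_pct, target_pct=50.0):
    """Absolute distance from the target ratio, in percentage points."""
    return abs(ratio_pct - target_pct)


# Counts reported for this run: wordsDialogue = 259, wordsTotal = 502.
ratio = dialogue_ratio(259, 502)
summary = f"Ratio: {ratio:.2f}%, Deviation: {deviation(ratio):.2f}%"
print(summary)  # Ratio: 51.59%, Deviation: 1.59%

# Small usage example on a short excerpt from the response.
sample = 'Bill sighed. "Get the clown nose," he said. "Let\'s do this."'
sample_counts = (count_dialogue_words(sample), count_words(sample))
print(sample_counts)  # (7, 11)
```

With the reported counts, 259 / 502 gives 51.59%, which is 1.59 percentage points above the 50% target and matches the Details value for this evaluator.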