NC Bench
Model: Claude Opus 4.6
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Run ID: 2026-02-08-01KGZ055878FJ2FMV46SXRH563
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the hand-drawn map across the sticky diner table, knocking over the ketchup bottle for the third time. Bill caught it without looking up from his phone.

"Put that thing away," Molly hissed. "We need to focus. The party starts at two. That gives us exactly one hour to get in, grab the presents, and get out before the clown arrives."

"There's a clown?" Bill set his phone down. "Nobody said anything about a clown."

"Of course there's a clown, Bill. It's a seven-year-old's birthday party. There's always a clown."

"I don't work with clowns, Molly. You know that. Not after Reno."

"Reno was different. That clown was an undercover cop."

"They're all undercover cops as far as I'm concerned. Those big shoes, those little cars. It's surveillance, Molly. Wake up."

She rubbed her temples and pointed to a square she'd drawn in blue crayon. "This is the bounce house. It'll be set up along the east side of the backyard. That's our entry point."

"We're going through a bounce house?"

"Over it. We scale the bounce house, drop down behind the gift table, load everything into the bags, and exit through the side gate."

Bill squinted at the map. "Why is everything drawn in crayon?"

"It was all I had. My pen exploded in the car."

"And what's this brown blob?"

"That's you."

"I look like a potato."

"Stay focused. Now, the mother is the real threat. Karen Henderson. She runs the PTA like a paramilitary operation. If she spots us, it's over. She will not hesitate to call the authorities and post about it on three separate Facebook groups."

Bill shuddered. "What about the kids?"

"What about them?"

"Won't they, I don't know, notice two adults sprinting across their party with armfuls of presents?"

"That's where the distraction comes in. I hired a second clown."

"You just said there's already a clown!"

"Right. So when two clowns show up, chaos. They'll argue over who's the real clown. The kids will lose their minds. Parents will be filming it for social media. Nobody's watching the gift table."

Bill leaned back and crossed his arms. "That's either the dumbest plan I've ever heard or the most brilliant."

"It's both. That's why it works."

"And what are we expecting, present-wise? Are we talking good stuff or are we talking socks and educational toys?"

"The Henderson kid asked for a PS5, three different Lego sets, and a drone. His parents are loaded and so are their friends."

"A drone," Bill whispered reverently.

"Keep your head in the game. We grab, we go, we split everything fifty-fifty at the storage unit."

"What about the cake?"

"What about it?"

"Can we take the cake too?"

Molly stared at him for a long moment, then slowly smiled. "Bill, this is why I keep you around." She folded the crayon map and slid it into her jacket. They had fifty-three minutes and a bounce house to climb.
created_at: "2026-02-08T16:08:22.479909+00:00" (Sunday, February 8th, 2026 at 4:08:22 PM GMT+00:00)
model: "anthropic/claude-4.6-opus-20260205"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1231
moderation_latency: 155
generation_time: 19300
tokens_prompt: 61
tokens_completion: 660
native_tokens_prompt: 75
native_tokens_completion: 759
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.01935
router: (null)
provider_responses:
  0:
    id: "f88d2367-b695-4fa4-8933-48ef675a7ca0"
    status: 200
    is_byok: false
    latency: 1231
    endpoint_id: "a8708fb9-08ea-43bb-a3b1-165c87ef09df"
    provider_name: "Amazon Bedrock"
    model_permaslug: "anthropic/claude-4.6-opus-20260205"
    api_type: "completions"
id: "gen-1770566882-kozGAxQJFBTA52oNibLF"
upstream_id: "f88d2367-b695-4fa4-8933-48ef675a7ca0"
total_cost: 0.01935
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
Evaluation details
Result | Evaluator | Details | Meta Data
99.9840% | Matches word count | n/a | neededClean: false; words: 498
0.7299% | Dialogue to Total Word Ratio | Ratio: 79.92%, Deviation: 9.92% | neededClean: false; wordsTotal: 498; wordsDialogue: 398
50.3570%
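The ratio and deviation reported above can be checked by hand. The sketch below is a minimal illustration, not the benchmark's actual evaluator. `dialogue_ratio` is a hypothetical helper that assumes dialogue is whatever sits between pairs of straight double quotes and that words are separated by whitespace, so its counts may differ from the evaluator's. The arithmetic at the end uses only figures shown on this page (398 dialogue words, 498 total words, and the 70% target from the scenario). Reading the final 50.3570% as the mean of the two results is an inference from the numbers, not something the page states.

```python
import re

TARGET_RATIO = 0.70  # from the scenario: "70% dialogue"


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Count words inside straight double quotes versus all words.

    Hypothetical helper: the real evaluator's tokenization is not shown
    in the source, so this is only one plausible way to count.
    """
    total_words = len(text.split())
    # Assumption: each pair of straight double quotes encloses one line of dialogue.
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


# Short excerpt from the response, used only to exercise the helper.
sample = (
    '"There\'s a clown?" Bill set his phone down. '
    '"Nobody said anything about a clown."'
)
sample_dialogue, sample_total, sample_ratio = dialogue_ratio(sample)

# Arithmetic on the figures reported by the evaluator.
reported_ratio = 398 / 498                       # wordsDialogue / wordsTotal
deviation_points = (reported_ratio - TARGET_RATIO) * 100
mean_result = (99.9840 + 0.7299) / 2             # inferred: mean of both results

print(f"sample: {sample_dialogue}/{sample_total} words of dialogue")
print(f"reported ratio: {reported_ratio:.2%}")
print(f"deviation: {deviation_points:.2f} points")
print(f"mean of results: {mean_result:.4f}%")
```

Running the helper on the full response text would show whether a quote-based count lands near the evaluator's 398. A mismatch would most likely come from how dialogue tags such as "Molly hissed" are attributed.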