NC Bench
Model
Claude Opus 4.6
Test
Dialogue tags
Scenario
Write 500 words with 30% dialogue
Run ID
2026-02-08-01KGYYAJ2YPEWBXJ9TKKYN3RTM
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the hand-drawn map across the sticky table of the Waffle House booth, smoothing out the crayon-marked edges with the reverence of a general surveying a battlefield. The map had been drawn by her seven-year-old nephew, which meant the swimming pool was roughly the size of Texas and the house itself looked like it had been designed by a drunk spider.
"The target is here," she said, tapping a lopsided rectangle labeled PRESINT TABLE in green crayon. "Right between the bounce house and the face-painting station."
Bill leaned forward, squinting through his reading glasses. He was a large man, the kind of large that made chairs nervous, and he had the weathered face of someone who had seen things, most of them disappointing. "That's a lot of kids, Molly. I don't like kids. They're unpredictable. Worse than dogs."
"They're six-year-olds, Bill. They're predictable in exactly one way. Sugar makes them insane, and insanity makes them blind to their surroundings. That's our window."
She pulled out a folded sheet of notebook paper and unfolded it with a crisp snap. It was the itinerary, obtained through three days of surveillance on a Facebook group called Oakdale Moms Who Brunch. The party was for a child named Braxton, whose mother had posted no fewer than forty-seven updates about the superhero theme, the custom cake, and the gift registry that included items no six-year-old could reasonably need, including a leather-bound journal.
"Cake gets served at two fifteen. Piñata goes up at two thirty. That gives us a thirteen-minute window where every child, parent, and hired Elsa impersonator will be clustered around the dessert table in the backyard."
"And the presents?"
"Front porch. Stacked on a folding table with a tablecloth that says Happy Birthday in a font that should be illegal. I've counted at least thirty-two confirmed RSVPs, which means thirty-two gifts, minimum. These Oakdale parents overcompensate. We're looking at high-value Lego sets, Bill. Possibly even a Nintendo Switch."
Bill rubbed his jaw slowly. He had once broken into a diamond exchange in Antwerp using nothing but a paperclip and an alarming amount of confidence, but something about this job unsettled him in a way he couldn't articulate. "What's our exit strategy?"
"We drive a white van marked with a Party Rentals logo. I had the magnetic signs made at the print shop on Fifth. We load the presents, smile at anyone who looks our way, and say we're transporting the gifts inside per the mother's request. Nobody questions a woman with a clipboard."
"You have a clipboard?"
"I have three clipboards, Bill. I'm a professional."
He stared at her for a long moment, then picked up his coffee and took a slow, defeated sip. The plan was absurd. It was beneath them. It was morally questionable in ways that their usual work somehow wasn't. He set the mug down. "I'm going to need a disguise."
"Obviously. I got you a Spider-Man shirt in triple XL."
"I hate you, Molly."
"You love me. Eat your waffle."
created_at
"2026-02-08T15:36:22.862322+00:00"
Sunday, February 8th, 2026 at 3:36:22 PM GMT+00:00
model
"anthropic/claude-4.6-opus-20260205"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1204
moderation_latency
155
generation_time
19696
tokens_prompt
61
tokens_completion
647
native_tokens_prompt
75
native_tokens_completion
761
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.0194
router
(null)
provider_responses
0
id
"17938a72-8eae-45eb-bcaf-8ee1ef52b25d"
status
200
is_byok
false
latency
1204
endpoint_id
"a8708fb9-08ea-43bb-a3b1-165c87ef09df"
provider_name
"Amazon Bedrock"
model_permaslug
"anthropic/claude-4.6-opus-20260205"
api_type
"completions"
id
"gen-1770564962-BREtlYRck02WxEG3DME8"
upstream_id
"17938a72-8eae-45eb-bcaf-8ee1ef52b25d"
total_cost
0.0194
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Amazon Bedrock"
Evaluation details
Result | Evaluator | Details | Meta Data
35.0022% | Matches word count | n/a | neededClean: false, words: 518
0.0001% | Dialogue to Total Word Ratio | Ratio: 46.72%, Deviation: 16.72% | neededClean: false, wordsTotal: 518, wordsDialogue: 242
17.5011%
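The ratio and deviation reported by the "Dialogue to Total Word Ratio" evaluator can be reproduced from the metadata (242 dialogue words out of 518 total, against the 30% target in the scenario). The sketch below shows one plausible way such a check could work. It is an assumption, not NC Bench's actual evaluator: the function names `dialogue_ratio` and `deviation` are hypothetical, words are counted by whitespace splitting, and dialogue is taken to be any text between double quotes. How the evaluator turns the deviation into the 0.0001% score is not stated on the page and is not modelled here.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for a prose passage.

    Assumption: dialogue is whatever sits between double quotes, and a
    word is any whitespace-separated token.
    """
    # Treat curly quotes the same as straight quotes.
    text = text.replace("\u201c", '"').replace("\u201d", '"')
    total = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue / total if total else 0.0
    return dialogue, total, ratio


def deviation(ratio: float, target: float = 0.30) -> float:
    """Absolute distance between the measured ratio and the target ratio."""
    return abs(ratio - target)


if __name__ == "__main__":
    # Figures taken from the evaluation metadata above.
    reported_ratio = 242 / 518
    print(f"Ratio: {reported_ratio:.2%}")
    print(f"Deviation: {deviation(reported_ratio):.2%}")
```

With the reported counts this gives a ratio of 46.72% and a deviation of 16.72%, matching the Details column. Running `dialogue_ratio` on the response text itself may not return exactly 242 and 518, since the evaluator's tokenization rules are unknown.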