NC Bench
Model: Claude Opus 4.6
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Run ID: 2026-02-08-01KGZ089JQE7ZR6J1H37PVB3KT
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the hand-drawn map across the sticky surface of the diner table, pushing aside a half-eaten plate of onion rings to make room. The blueprint, if you could call it that, had been sketched on the back of a Denny's placemat in purple crayon. She'd stolen the crayon from a toddler at the next booth, which she considered a warm-up exercise.
Bill leaned forward and squinted at the diagram, turning it sideways, then upside down, then back again. His brow furrowed with the intensity of a man trying to read ancient Sanskrit rather than a crude drawing of a suburban backyard.
"Okay, so the party starts at two," Molly said, tapping the center of the map where she'd drawn a wobbly rectangle labeled BOUNCY KASTLE. "The presents will be on a folding table near the back porch. Based on my surveillance, we're looking at approximately thirty-one gifts, mostly mid-range, probably a lot of Legos and gift cards."
"You surveilled a seven-year-old's birthday party?"
"I surveilled the Evite. The mother posted it publicly on Facebook like an absolute amateur."
Bill scratched the back of his neck. He'd pulled off a diamond heist in Antwerp, cracked a safe in Monaco while hanging upside down from a ventilation shaft, and once stolen a painting from a museum while the security guard watched and simply assumed he was an authorized mover. None of those jobs had made him feel quite as morally conflicted as this one.
"Molly, I gotta ask. Why are we robbing children?"
"We're not robbing children, Bill. We're robbing a table. The children will be twenty feet away, hopped up on sheet cake and orange soda, completely oblivious. It's a victimless crime."
"It's literally not. The victim is a child named Brandon."
Molly waved her hand dismissively. The fluorescent diner lights caught the faded tattoo on her wrist, a small skeleton key she'd gotten after her first job. She'd been nineteen then, stealing wallets from distracted tourists. Now she was forty-three and apparently descending further down the moral food chain.
"Brandon will be fine. His parents are loaded. They'll replace everything by Monday. Meanwhile, we flip the gifts online, clear maybe eight hundred bucks, and use that to fund the real job."
Bill considered this. The real job was a payroll vault in Midtown, and they did need seed money for equipment. He picked up an onion ring and chewed it thoughtfully, the grease mixing with his growing sense of shame.
"What about the clown?" he asked.
"What clown?"
"She hired a clown. It was on the Evite you apparently memorized. What if the clown sees us?"
Molly smiled the way she always smiled before a job, with the serene confidence of someone who had never once considered the possibility of failure.
"Nobody takes a clown seriously, Bill. That's the whole point of clowns."
Bill opened his mouth to argue, then closed it. She wasn't wrong. She was never wrong. That was the most annoying thing about Molly.
"Fine," he said. "But I'm not wearing the disguise."
"You're wearing the disguise."
created_at: "2026-02-08T16:10:09.398462+00:00" (Sunday, February 8th, 2026 at 4:10:09 PM GMT+00:00)
model: "anthropic/claude-4.6-opus-20260205"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1819
moderation_latency: (null)
generation_time: 23373
tokens_prompt: 61
tokens_completion: 659
native_tokens_prompt: 75
native_tokens_completion: 760
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.019375
router: (null)
provider_responses:
  0:
    id: "msg_01JkmspTr4S9ehFVbnV7zVEL"
    status: 200
    is_byok: false
    latency: 1805
    endpoint_id: "c926c048-563e-44b0-8434-95965bee924c"
    provider_name: "Anthropic"
    model_permaslug: "anthropic/claude-4.6-opus-20260205"
api_type: "completions"
id: "gen-1770566985-b463MjIr6LOFdWaCqHik"
upstream_id: "msg_01JkmspTr4S9ehFVbnV7zVEL"
total_cost: 0.019375
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Anthropic"
Evaluation details
Evaluator: Matches word count
Result: 27.1658%
Details: n/a
Meta Data: neededClean: false, words: 519
Evaluator: Dialogue to Total Word Ratio
Result: 8.6643%
Details: Ratio: 36.99%, Deviation: 6.99%
Meta Data: neededClean: false, wordsTotal: 519, wordsDialogue: 192
17.9151%
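The reported ratio and deviation can be reproduced from the word counts in the evaluation data. The sketch below is illustrative only and is not the benchmark's evaluator. It assumes that dialogue means text between straight double quotes and that words are whitespace-separated tokens, and the function names are hypothetical. It also checks one inference: the final 17.9151% is consistent with an unweighted mean of the two evaluator results. How each evaluator turns a deviation into its own percentage is not stated on the page, so that step is not modelled.

```python
import re

TARGET_RATIO = 0.30  # from the scenario: 30% dialogue


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (assumed tokenization)."""
    return len(text.split())


def dialogue_words(text: str) -> int:
    """Count words inside straight double quotes (assumed definition of dialogue)."""
    spans = re.findall(r'"([^"]*)"', text)
    return sum(count_words(span) for span in spans)


# Counts copied from the evaluation meta data.
words_total = 519
words_dialogue = 192

ratio = words_dialogue / words_total
deviation = abs(ratio - TARGET_RATIO)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# Inference, not documented: the final score looks like the mean of the two results.
mean_score = (27.1658 + 8.6643) / 2
print(f"Mean of evaluator results: {mean_score:.4f}%")
```

To apply the same counting to a full response, pass the text to `dialogue_words` and `count_words`. The totals may differ from the reported 192 and 519 if the evaluator tokenizes or cleans the text differently.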
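The cost and timing fields in the run metadata can be sanity-checked with a little arithmetic. The sketch below makes three assumptions the page does not state: that `generation_time` is in milliseconds, that billing uses the native token counts, and that the prices are 5 USD per million input tokens and 25 USD per million output tokens. Under those assumptions the estimate matches the reported `total_cost` of 0.019375.

```python
# Values copied from the run metadata.
native_tokens_prompt = 75
native_tokens_completion = 760
generation_time_ms = 23373  # assumption: milliseconds
total_cost = 0.019375

# Assumed prices in USD per million tokens; not stated in the source.
PRICE_IN_PER_MTOK = 5.00
PRICE_OUT_PER_MTOK = 25.00


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the request cost in USD under the assumed prices."""
    billed = prompt_tokens * PRICE_IN_PER_MTOK + completion_tokens * PRICE_OUT_PER_MTOK
    return billed / 1_000_000


def tokens_per_second(completion_tokens: int, elapsed_ms: float) -> float:
    """Average output throughput over the generation window."""
    return completion_tokens / (elapsed_ms / 1000)


estimated = estimate_cost(native_tokens_prompt, native_tokens_completion)
throughput = tokens_per_second(native_tokens_completion, generation_time_ms)

print(f"estimated cost: {estimated:.6f} USD (reported: {total_cost})")
print(f"throughput: {throughput:.1f} tokens/s")
```

The match on cost suggests the native counts (75 and 760) are the billed ones, not the normalized `tokens_prompt` and `tokens_completion` values, though the page does not confirm this.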