NC Bench
Model
Claude Opus 4.6
Test
Dialogue tags
Scenario
Write 500 words with 50% dialogue
Run ID
2026-02-08-01KGYYYMBTV3ZBZTFK77T0VKK2
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the hand-drawn map across the hood of the van, smoothing out the crayon-stained edges with her palm. The blueprint had cost them three juice boxes and a pack of fruit snacks, extracted from a surprisingly shrewd six-year-old informant named Timmy. "Okay, here's the situation," she said, tapping the map with a ballpoint pen. "The party kicks off at two. The presents will be stacked on a folding table near the bounce house. We've got a window of maybe twenty minutes during the piñata phase when every kid and parent will be in the backyard." Bill leaned against the van and crossed his arms. He was wearing a purple polo shirt with a cartoon dinosaur embroidered on the chest, which Molly had insisted was essential to blending in. "I still don't understand why we can't just walk in the front door," he said. "Because, Bill, Janet Henderson is running security at the sign-in table, and that woman has the memory of an elephant crossed with a detective. She knows every parent in a four-mile radius. If she doesn't recognize you, she'll ask questions." "So what's the play?" "You're going in as Chuckles." Bill stared at her for a long moment. His left eye twitched. "I'm not wearing the clown suit again, Molly." "You are absolutely wearing the clown suit again." "Last time I wore that thing, a toddler bit me. On the knee. I still have the mark." "That's because you broke character. You stopped honking the horn." "The horn was broken!" "A good clown improvises, Bill." He rubbed his face with both hands, the dinosaur on his shirt stretching with the motion. Molly could see him cycling through the five stages of grief in roughly four seconds, finally arriving at acceptance with a heavy sigh that fogged the van window. "Fine. I'm Chuckles. Then what?" Molly grinned and circled a section of the map where Timmy had drawn what appeared to be either a dragon or a golden retriever. "You keep them distracted with balloon animals in the backyard. I come in through the side gate dressed as a delivery person, load the presents into the dolly, and wheel everything out to the van. Simple, clean, elegant." "What about the dog?" Bill asked, pointing to the ambiguous creature on the map. "That's their dog, Biscuit. Timmy says he's friendly." "Timmy also said his sister was a vampire, so forgive me if I don't trust his intel completely." Molly folded the map and tucked it into her jacket pocket. The afternoon sun caught the sequins on the clown suit hanging in the van's back window, throwing tiny rainbows across Bill's miserable face. She placed a hand on his shoulder with the gravity of a general addressing her finest soldier. "Bill, there are thirty-two presents on that table. Thirty-two. That's enough to fence for rent and groceries for two months. All you have to do is make balloon animals and not get bitten." Bill reached for the clown suit. "If a kid bites me, I'm retiring."
created_at
"2026-02-08T15:47:19.599305+00:00"
Sunday, February 8th, 2026 at 3:47:19 PM GMT+00:00
model
"anthropic/claude-4.6-opus-20260205"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1478
moderation_latency
155
generation_time
18633
tokens_prompt
61
tokens_completion
644
native_tokens_prompt
75
native_tokens_completion
744
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.018975
router
(null)
provider_responses
0
id
"90920cf6-aa16-40ad-8fe8-0092462a9eb6"
status
200
is_byok
false
latency
1477
endpoint_id
"a8708fb9-08ea-43bb-a3b1-165c87ef09df"
provider_name
"Amazon Bedrock"
model_permaslug
"anthropic/claude-4.6-opus-20260205"
api_type
"completions"
id
"gen-1770565620-UpRmpFZA37cRcqnVLp5j"
upstream_id
"90920cf6-aa16-40ad-8fe8-0092462a9eb6"
total_cost
0.018975
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Amazon Bedrock"
Evaluation details
Result | Evaluator | Details | Meta Data
68.1022% | Matches word count | n/a | neededClean: false; words: 514
46.9065% | Dialogue to Total Word Ratio | Ratio: 53.89%, Deviation: 3.89% | neededClean: false; wordsTotal: 514; wordsDialogue: 277
57.5044%
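The dialogue figures above can be reproduced from the recorded counts. The sketch below is a rough illustration only, not NC Bench's evaluator: it assumes dialogue means the words inside straight double quotes, that words are whitespace-separated, and that the deviation is measured against the 50% target from the scenario. The function name `dialogue_stats` is hypothetical. The last lines check that the final 57.5044% is numerically consistent with an unweighted mean of the two evaluator results; the page does not state how that figure, or the two per-evaluator percentages, are actually computed.

```python
import re


def dialogue_stats(text: str, target: float = 0.5):
    """Return (total_words, dialogue_words, ratio, deviation) for a text.

    Assumption: dialogue is anything between straight double quotes and
    words are whitespace-separated. The real evaluator may differ.
    """
    total = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue / total if total else 0.0
    return total, dialogue, ratio, abs(ratio - target)


# Small usage example on one line of the scene.
sample = 'Bill sighed. "Fine. I\'m Chuckles. Then what?"'
total, dialogue, ratio, deviation = dialogue_stats(sample)

# Reproduce the reported ratio and deviation from the recorded counts.
words_total, words_dialogue = 514, 277
reported_ratio = words_dialogue / words_total   # about 0.5389
reported_deviation = abs(reported_ratio - 0.5)  # about 0.0389

# Consistency check (an observation, not a documented formula):
# the unweighted mean of the two evaluator results.
mean_score = (68.1022 + 46.9065) / 2
```

Applying `dialogue_stats` to the full response may not return exactly 514 and 277, since the benchmark's tokenization and quote handling are not documented on this page.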