NC Bench
Model
Claude Opus 4.6
Test
Dialogue tags
Scenario
Write 500 words with 50% dialogue
Run ID
2026-02-08-01KGYYQ0TQANRTZ0TS7EW185ND
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the hand-drawn map across the sticky table of the diner booth, pushing aside a ketchup bottle and what appeared to be a fossilized french fry. The map was detailed to an almost disturbing degree, complete with little crayon stick figures representing the enemy combatants. She tapped a finger on the large rectangle labeled PRESENT TABLE in aggressive red marker. "Alright, Bill, listen up. We've got a forty-five minute window between the magic show and the cake ceremony. That's our strike zone." Bill squinted at the map, turning it sideways, then upside down. His brow furrowed with the intensity of a man trying to defuse a bomb rather than planning to rob a seven-year-old's birthday party. "Why is there a dragon in the backyard?" "That's the bouncy castle. Focus, Bill." "Right, right." He pulled a pen from behind his ear and pointed it at a cluster of stick figures near what appeared to be a picnic table. "What about these hostiles?" "Those are the parents. Most of them will be wine-drunk by two o'clock. Linda Henderson always brings her boxed rosé, and trust me, those women hit it hard. They'll be clustered around the patio gossiping about whose husband got fat. They won't even notice us." Bill nodded slowly, the gears in his head turning with visible effort. He'd pulled off a few impressive jobs in his time, including the great office supply heist of 2019, but children made him nervous. They were unpredictable. They had sticky hands and no respect for personal boundaries. "What about the kids themselves? Those little monsters will be everywhere." "That's what the magic show is for. I hired Spectacular Steve. He does a thing with a rabbit that keeps them hypnotized for a solid twenty minutes. While every eye is on Steve and his rabbit, we back the van up to the side gate, load the presents, and disappear." "You hired a magician as a diversion for a heist? That's either genius or completely insane." "Why can't it be both?" 
Molly grinned, folding her arms with the satisfaction of a woman who had considered every angle. "I've been casing this party for two weeks, Bill. I know the layout of the Hendersons' backyard better than I know my own apartment." "That's not saying much. You once got lost going to your own bathroom." "That was one time, and I'd had surgery." She stabbed the map again. "Now, you'll be driving. I'll handle the physical extraction of the gifts. We should be looking at roughly thirty to forty presents. Mostly LEGO sets, if the Amazon delivery records I intercepted are accurate." Bill stared at her for a long moment, a mix of admiration and horror playing across his face. She'd intercepted delivery records. For a child's birthday party. "Molly, do you ever worry that we've made some questionable life choices?" "Every single day, Bill. Every single day." She folded the map with crisp, professional creases. "Now finish your coffee. We've got presents to steal."
created_at
"2026-02-08T15:43:09.717053+00:00"
Sunday, February 8th, 2026 at 3:43:09 PM GMT+00:00
model
"anthropic/claude-4.6-opus-20260205"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1142
moderation_latency
166
generation_time
18397
tokens_prompt
61
tokens_completion
641
native_tokens_prompt
75
native_tokens_completion
718
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.018325
router
(null)
provider_responses
0
id
"f87a22a9-6905-4781-a5ec-618d6412f090"
status
200
is_byok
false
latency
1141
endpoint_id
"a8708fb9-08ea-43bb-a3b1-165c87ef09df"
provider_name
"Amazon Bedrock"
model_permaslug
"anthropic/claude-4.6-opus-20260205"
api_type
"completions"
id
"gen-1770565370-cinwJk67DNlIS8OQCgub"
upstream_id
"f87a22a9-6905-4781-a5ec-618d6412f090"
total_cost
0.018325
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Amazon Bedrock"
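The timing and token fields above can be combined into a rough throughput figure. The following is a minimal sketch, not part of the benchmark itself. It assumes that generation_time is reported in milliseconds (the page does not state the unit), and the helper name tokens_per_second is hypothetical.

```python
def tokens_per_second(completion_tokens: int, generation_time_ms: int) -> float:
    """Return completion throughput in tokens per second.

    Assumption: generation_time_ms is in milliseconds; the source page
    does not state the unit of its generation_time field.
    """
    if generation_time_ms <= 0:
        raise ValueError("generation_time_ms must be positive")
    return completion_tokens / (generation_time_ms / 1000)


# Values copied from the metadata above.
native_tps = tokens_per_second(718, 18397)      # native_tokens_completion
normalized_tps = tokens_per_second(641, 18397)  # tokens_completion

print(f"native: {native_tps:.2f} tok/s, normalized: {normalized_tps:.2f} tok/s")
```

The two results differ only because the record carries two token counts (native and normalized) for the same completion.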
Evaluation details
Result | Evaluator | Details | Meta Data
99.3769% | Matches word count | n/a | neededClean: false; words: 505
27.9466% | Dialogue to Total Word Ratio | Ratio: 55.05%, Deviation: 5.05% | neededClean: false; wordsTotal: 505; wordsDialogue: 278
63.6618%
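The reported ratio and deviation follow from the word counts: 278 dialogue words out of 505 total is about 55.05%, which is 5.05 percentage points from the 50% target. The sketch below shows one way such counts could be produced. It is an illustration under assumptions, not the evaluator's actual method: it assumes words are whitespace-separated tokens and that dialogue is whatever sits between straight double quotes. The function name dialogue_ratio is hypothetical, and the mapping from deviation to the percentage score is not reproduced because the page does not state it.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Count dialogue words, total words, and their ratio.

    Assumptions (not confirmed by the source): words are split on
    whitespace, and dialogue is any span enclosed in straight double quotes.
    """
    total = len(text.split())
    spoken_spans = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(span.split()) for span in spoken_spans)
    ratio = dialogue / total if total else 0.0
    return dialogue, total, ratio


# Small usage example on a sentence from the scene.
sample = 'Molly grinned. "Why can\'t it be both?" Bill sighed.'
print(dialogue_ratio(sample))

# Reproducing the reported figures from the recorded counts.
ratio_pct = 278 / 505 * 100
deviation_pct = abs(ratio_pct - 50)
print(f"Ratio: {ratio_pct:.2f}%, Deviation: {deviation_pct:.2f}%")
```

A tokenizer that treats curly quotes, hyphenated words, or punctuation differently would yield slightly different counts, so small mismatches against the recorded values are to be expected.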