NC Bench
Model
Claude Opus 4.6
Test
Dialogue tags
Scenario
Write 500 words with 50% dialogue
Run ID
2026-02-08-01KGZ0AX0AJ65TA23YA2PK0A78
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the hand-drawn map across the hood of the minivan and anchored one corner with a juice box she'd swiped from a gas station. The crayon lines were barely legible, but they represented three days of surveillance and one very uncomfortable afternoon hiding inside a bouncy castle.
"Walk me through it again," Bill said, adjusting the fake mustache that made him look less like a father and more like a walrus having an allergic reaction.
"Alright, listen. The party starts at two. We've got a forty-five-minute window between the arrival of the presents and the arrival of the cake. Once that cake shows up, every kid in there is going to be swarming the table, and we lose our shot."
"What kind of cake?"
"What does it matter what kind of cake?"
"It matters because if it's an ice cream cake, they'll rush the table faster. Gives us less time. If it's one of those fondant nightmares shaped like a dinosaur, the kids will just stare at it and nobody will want the first slice. That buys us extra minutes."
Molly pinched the bridge of her nose. "It's a sheet cake from Costco. Can we focus?"
"A Costco cake. So we're dealing with a budget-conscious family. That means the presents won't be anything spectacular either. Are we sure this is worth the risk?"
"There are thirty-two children attending this party, Bill. Thirty-two. Even if half the parents cheap out, we're looking at solid mid-range toys, gift cards, and at least four duplicate LEGO sets that nobody will even miss."
Bill nodded slowly, the math apparently working in his favor for once. "Okay. Entry point?"
"Side gate by the garage. It's got one of those child-proof latches."
"Those things are impossible."
"You cracked a Moretti safe in under six minutes."
"The Moretti safe didn't require me to push down and pull up at the same time while squeezing two plastic tabs. Those latches were designed by sadists."
Molly reached into her pocket and produced a paper plate with a crudely drawn diagram of the backyard.
"Once we're inside, I'll pose as a party entertainer. I've got a bag of balloon animals ready to go."
"Can you actually make balloon animals?"
"I can make snakes."
"Snakes are just uninflated balloons, Molly."
"And children are easily impressed. While I'm distracting the crowd by the swing set, you take the hand truck around to the gift table, load everything into the bags, and wheel it out through the garage. The whole thing should take seven minutes."
Bill studied the paper plate, rotating it twice before he realized the ketchup stain was not, in fact, a tactical marking. He set it down and took a breath. "What about the parents?"
"They'll be on their phones. They're always on their phones."
"And the birthday kid?"
Molly hesitated for just a moment, then squared her shoulders. "Acceptable collateral. We'll send a card."
created_at
"2026-02-08T16:11:29.724799+00:00"
Sunday, February 8th, 2026 at 4:11:29 PM GMT+00:00
model
"anthropic/claude-4.6-opus-20260205"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1463
moderation_latency
155
generation_time
18413
tokens_prompt
61
tokens_completion
620
native_tokens_prompt
75
native_tokens_completion
710
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.018125
router
(null)
provider_responses
0
id
"b761b44e-c866-4183-b4f3-fe177c7cf2b1"
status
200
is_byok
false
latency
1462
endpoint_id
"a8708fb9-08ea-43bb-a3b1-165c87ef09df"
provider_name
"Amazon Bedrock"
model_permaslug
"anthropic/claude-4.6-opus-20260205"
api_type
"completions"
id
"gen-1770567070-NfLeFdwcZGv5nRT8uPLS"
upstream_id
"b761b44e-c866-4183-b4f3-fe177c7cf2b1"
total_cost
0.018125
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Amazon Bedrock"
Evaluation details
Result | Evaluator | Details | Meta Data
98.7124% | Matches word count | n/a | neededClean: false; words: 494
0.0000% | Dialogue to Total Word Ratio | Ratio: 70.04%, Deviation: 20.04% | neededClean: false; wordsTotal: 494; wordsDialogue: 346
49.3562%
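The ratio and deviation reported above follow directly from the two word counts, and the arithmetic can be checked with a short sketch. The function names below are hypothetical. The counting rule (whitespace-separated words, dialogue being anything between straight double quotes) is an assumption rather than the evaluator's documented method, and the mapping from deviation to the percentage score is not reproduced because it is not stated on this page.

```python
import re


def ratio_and_deviation(words_dialogue: int, words_total: int, target: float = 0.5):
    """Return (ratio, deviation from target) as fractions between 0 and 1."""
    ratio = words_dialogue / words_total if words_total else 0.0
    return ratio, abs(ratio - target)


def count_words(text: str):
    """Return (total_words, dialogue_words) for a scene.

    Assumption: a word is a whitespace-separated token, and dialogue is any
    run of text enclosed in straight double quotes. The benchmark's own
    tokenizer may differ.
    """
    total = len(text.split())
    spoken = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(segment.split()) for segment in spoken)
    return total, dialogue


if __name__ == "__main__":
    # Figures taken from the evaluation metadata: wordsDialogue and wordsTotal.
    ratio, deviation = ratio_and_deviation(346, 494)
    print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
```

With the reported counts of 346 dialogue words out of 494, the ratio is about 20 percentage points above the 50% requested in the scenario, which is consistent with the 0.0000% result on that evaluator.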