NC Bench
Model: Claude Opus 4.5
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Run ID: 2026-02-08-01KGZ01MGG94N6PF7Y9SBRWHFA
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
AI
Molly spread the hand-drawn map across the sticky diner table, nearly knocking over Bill's coffee in the process. "So the target is here," she said, jabbing her finger at a crayon-colored rectangle. "Little Timmy Henderson's seventh birthday party. Saturday at two."
Bill squinted at the map. "Did you draw this on a placemat?"
"Focus, Bill. This is the big one."
"It's a children's party, Molly."
"Exactly. Do you have any idea how many presents a seven-year-old gets these days? We're talking tablets, gaming consoles, those little robot dogs that flip. The haul could be massive."
Bill rubbed his temples. "I went to prison for eighteen months. Eighteen months, Molly. And you want me to risk my freedom for a pile of toys wrapped in Paw Patrol paper?"
"First of all, the paper is probably Minecraft themed. Timmy's demographic skews older. Second, nobody suspects party thieves. It's a victimless crime."
"The children are literally the victims."
"Children are resilient. Besides, his parents are loaded. They'll just buy more stuff. We're basically redistributing wealth."
"To ourselves."
"To ourselves, yes. That's how redistribution works, Bill."
A waitress refilled their coffees. They smiled politely until she left.
"Walk me through the plan," Bill sighed.
Molly's eyes lit up. "Okay, so the party starts at two. The presents will be piled on a table near the bounce house. That's our primary extraction point."
"You want me to fight through a bounce house full of second graders?"
"No, no. The bounce house is a distraction. Kids can't resist it. While they're bouncing, we slip in through the side gate, grab the presents, and load them into the van."
"What van?"
"The van I'm going to rent. Try to keep up."
"And nobody notices two adults stealing an enormous pile of wrapped gifts?"
"We'll be wearing disguises. I got us matching clown costumes."
Bill stared at her for a long moment. "Clown costumes."
"Nobody questions clowns at a birthday party, Bill. It's psychological camouflage."
"I'm not wearing a clown costume."
"Fine, you can be the balloon guy. Same principle."
"What about the parents? The other adults?"
"They'll be inside drinking wine and complaining about property taxes. Trust me, I've done surveillance. These people don't supervise anything."
"You've been watching children's parties?"
"For reconnaissance purposes only. Don't make it weird."
Bill looked at the placemat map again. Someone had drawn a dinosaur in the corner eating what appeared to be a birthday cake. "What's our exit strategy?"
"We walk calmly to the van with our arms full of presents. If anyone asks, we say we're taking them inside for the big reveal. People love a big reveal."
"This is insane."
"This is genius. There's a fine line."
"There really isn't."
Molly folded the map carefully, tucking it into her jacket. "Saturday at two. Wear comfortable shoes. Those little driveways are murder on the knees."
created_at: "2026-02-08T16:06:25.609826+00:00" (Sunday, February 8th, 2026 at 4:06:25 PM GMT+00:00)
model: "anthropic/claude-4.5-opus-20251124"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 2005
moderation_latency: 140
generation_time: 17814
tokens_prompt: 61
tokens_completion: 645
native_tokens_prompt: 75
native_tokens_completion: 758
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.019325
router: (null)
provider_responses:
  0:
    id: "bfb09618-6315-4d9a-9aba-8aa51dd80012"
    status: 200
    is_byok: false
    latency: 2004
    endpoint_id: "de32581c-2e11-4f54-8d29-e8a6e19d3038"
    provider_name: "Amazon Bedrock"
    model_permaslug: "anthropic/claude-4.5-opus-20251124"
    api_type: "completions"
id: "gen-1770566767-1tX8aWZPNeJC4DccPBUT"
upstream_id: "bfb09618-6315-4d9a-9aba-8aa51dd80012"
total_cost: 0.019325
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
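The timing and token fields above can be combined into a rough generation throughput figure. The sketch below assumes that generation_time is reported in milliseconds and that native_tokens_completion is the right token count to divide by; the page states neither, so treat both as assumptions. The helper name tokens_per_second is illustrative and not part of any API.

```python
def tokens_per_second(completion_tokens: int, generation_time_ms: int) -> float:
    """Approximate throughput, ASSUMING generation_time is in milliseconds."""
    if generation_time_ms <= 0:
        raise ValueError("generation_time_ms must be positive")
    return completion_tokens / (generation_time_ms / 1000)


# Values copied from the metadata above:
# native_tokens_completion = 758, generation_time = 17814
rate = tokens_per_second(758, 17814)
print(f"{rate:.1f} tokens/s")
```

If the unit assumption is wrong (for example, if the field were in seconds), the result would be off by a factor of 1000, so it is worth checking against the provider's documentation before relying on it.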
Evaluation details
Result | Evaluator | Details | Meta Data
20.1897% | Matches word count | n/a | neededClean: false; words: 480
0.6738% | Dialogue to Total Word Ratio | Ratio: 80.00%, Deviation: 10.00% | neededClean: false; wordsTotal: 480; wordsDialogue: 384
10.4317%
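The reported ratio and deviation follow directly from the word counts: 384 dialogue words out of 480 total is 80.00%, which is 10.00 percentage points from the 70% target in the scenario. The sketch below shows one way such counts could be computed. It is not the benchmark's evaluator: the function name dialogue_stats, the whitespace tokenization, and the rule that dialogue means text inside straight double quotes are all assumptions, and it makes no attempt to reproduce the percentage scores in the Result column, whose formula the page does not give.

```python
import re


def dialogue_stats(text: str, target_ratio: float = 0.70) -> dict:
    """Count total and quoted words, then compare the ratio to a target.

    ASSUMPTIONS (not taken from the benchmark): words are whitespace-separated
    tokens, and dialogue is any text between straight double quotes.
    """
    words_total = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    words_dialogue = sum(len(span.split()) for span in quoted_spans)
    ratio = words_dialogue / words_total if words_total else 0.0
    return {
        "wordsTotal": words_total,
        "wordsDialogue": words_dialogue,
        "ratio": ratio,
        "deviation": abs(ratio - target_ratio),
    }


# Arithmetic check using the counts reported in the table above.
reported_ratio = 384 / 480
reported_deviation = abs(reported_ratio - 0.70)
print(f"Ratio: {reported_ratio:.2%}, Deviation: {reported_deviation:.2%}")

# Small usage example on a made-up sentence.
sample = dialogue_stats('Molly said, "Focus, Bill." Bill sighed.')
print(sample["wordsDialogue"], "of", sample["wordsTotal"], "words are dialogue")
```

A different tokenizer or quote-handling rule (curly quotes, nested quotes, dialogue tags inside a quoted span) would shift the counts, so matching the evaluator's exact numbers on the scene above is not guaranteed.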