NC Bench
Model: Claude Opus 4.5
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Run ID: 2026-02-08-01KGYYD9ZM2ETCAJ6383HQSKHJ
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
AI
Molly spread the crayon-drawn map across the sticky diner table and tapped her finger on a wobbly circle labeled PREZENTS. Bill squinted at it. "Where did you get this?" "I paid a six-year-old informant two dollars and a fruit roll-up. Kid's got connections." "The spelling concerns me, Molly." "She's in kindergarten, Bill. Focus on the mission." Bill stirred his coffee and sighed. "Okay, walk me through it again. We're really doing this?" "Little Timmy Patterson's seventh birthday party. Saturday at two. His dad runs that hedge fund that foreclosed on my grandmother's bingo hall." "So this is revenge?" "This is justice served with a side of balloon animals." Bill studied the map again. "What's this rectangle by the door?" "Bouncy castle. That's our entry point." "We're going to bounce our way in?" "No, dummy. We use it as cover. Everyone watches the bouncy castle. It's like a magnet for eyeballs. Meanwhile, we slip through the back near the juice box station." "And the presents?" "Stacked on a folding table by the piñata. We grab, we bag, we vanish." Bill leaned back. "What about security?" "There's a golden retriever named Biscuit." "That's it?" "And maybe a clown." "I hate clowns." "Everyone hates clowns, Bill. That's why I packed pepper spray." "You're going to pepper spray a clown at a children's party?" "Only if he gets aggressive. You never know with clowns. They're unpredictable." Bill rubbed his temples. "What's our exit strategy?" "Through the garage. I've already scouted it. They keep it unlocked because Mr. Patterson is too cheap to fix the automatic opener." "And transportation?" "I borrowed my nephew's wagon." "A wagon? For a heist?" "It's a Radio Flyer, Bill. Classic. Inconspicuous. No one suspects the wagon." "I suspect the wagon." "Well, stop. It's a good wagon." Bill drummed his fingers on the table. "What kind of presents are we talking about here?" "Rich kid presents. We're looking at premium Legos, maybe a drone, definitely some video games. I heard someone's bringing a telescope." "A telescope?" "Little Timmy wants to be an astronaut. Too bad his dad wants to be a monster who ruins old ladies' bingo nights." "Your grandmother's really torn up about this, huh?" "She cried into her daubers for three weeks, Bill. Three weeks." "Okay, I'm in. But I want hazard pay for the clown situation." "You'll get an extra juice box." "Make it two." "Deal." Molly folded up the map and tucked it into her jacket. "Oh, and we're wearing disguises." "What kind of disguises?" "Parent casual. Khakis and exhaustion." "I can do exhaustion." "I know you can. That's why I picked you for this job." Bill finished his coffee. "Saturday at two?" "Saturday at two. Don't be late." "I'm never late." "You were late to the aquarium job." "The penguin exhibit closed early. That wasn't my fault." "Sure it wasn't."
created_at: "2026-02-08T15:37:50.529582+00:00" (Sunday, February 8th, 2026 at 3:37:50 PM GMT+00:00)
model: "anthropic/claude-4.5-opus-20251124"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1997
moderation_latency: 159
generation_time: 17164
tokens_prompt: 61
tokens_completion: 680
native_tokens_prompt: 75
native_tokens_completion: 813
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0207
router: (null)
provider_responses: 0
id: "e025f290-9b45-49a8-8c32-dbae64113119"
status: 200
is_byok: false
latency: 1997
endpoint_id: "de32581c-2e11-4f54-8d29-e8a6e19d3038"
provider_name: "Amazon Bedrock"
model_permaslug: "anthropic/claude-4.5-opus-20251124"
api_type: "completions"
id: "gen-1770565052-oHz4DwScxADbzxTk3Do0"
upstream_id: "e025f290-9b45-49a8-8c32-dbae64113119"
total_cost: 0.0207
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
Evaluation details
Result | Evaluator | Details | Meta Data
9.6081% | Matches word count | n/a | neededClean: false, words: 478
0.0001% | Dialogue to Total Word Ratio | Ratio: 86.40%, Deviation: 16.40% | neededClean: false, wordsTotal: 478, wordsDialogue: 413
4.8041%
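The ratio and deviation reported above can be reproduced from the two word counts. The sketch below is a minimal illustration, not the benchmark's own evaluator: it assumes words are whitespace-separated tokens, that dialogue is any text between straight double quotes, and that the target ratio is the 0.70 stated in the scenario. The function names (`count_words`, `dialogue_words`, `dialogue_ratio`) are hypothetical.

```python
import re


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (assumed word definition)."""
    return len(text.split())


def dialogue_words(text: str) -> int:
    """Count words inside straight double quotes (assumed dialogue rule)."""
    spans = re.findall(r'"([^"]*)"', text)
    return sum(count_words(span) for span in spans)


def dialogue_ratio(text: str, target: float = 0.70) -> tuple[float, float]:
    """Return (dialogue share of all words, absolute deviation from target)."""
    total = count_words(text)
    spoken = dialogue_words(text)
    ratio = spoken / total if total else 0.0
    return ratio, abs(ratio - target)


# Check the figures in the evaluation table directly from its counts.
reported_ratio = 413 / 478
reported_deviation = abs(reported_ratio - 0.70)
print(f"Ratio: {reported_ratio:.2%}, Deviation: {reported_deviation:.2%}")

# Apply the same rule to a short excerpt of the response.
sample = 'Bill squinted at it. "Where did you get this?" "I paid a kid two dollars."'
sample_ratio, sample_deviation = dialogue_ratio(sample)
```

With the table's counts (413 dialogue words out of 478) this prints a ratio of 86.40% and a deviation of 16.40%, matching the Details column. How the evaluator converts that deviation into the 0.0001% result is not shown on the page, so it is not modelled here.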