NC Bench
Model
Claude Sonnet 4.5
Test
Dialogue tags
Scenario
Write 500 words with 70% dialogue
Run ID
2026-02-08-01KGYZKYPKTZ76SS33W4313JQC
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
AI
Molly pressed her face against the fence slats, squinting at the bouncy castle. "I count twelve presents on that table. Maybe fifteen if those bags count." "The sparkly ones definitely count," Bill said, adjusting his binoculars. "Those are the expensive gifts. Rich parents always use sparkly bags." "How do you know that?" "My cousin does party planning. Trust me." Molly pulled back from the fence. "Okay, so what's our entry point? Front gate's too obvious." "We could go through the neighbor's yard and hop the back fence," Bill suggested. "In broad daylight? At three in the afternoon? Bill, there are like thirty witnesses in there." "Twenty-seven. I counted the kids." "Oh, well, twenty-seven witnesses is so much better." Bill lowered his binoculars. "You got a better idea?" "We wait until they do cake. Everyone crowds around, we slip in during the chaos, grab the haul, and bounce." "What if they see us?" "We dress as clowns." Bill stared at her. "Absolutely not." "Come on, it's perfect! We blend right in. What kid's party doesn't have clowns?" "Molly, I have a very severe fear of clown makeup. I've told you this." "You've never told me that." "I definitely have. Remember the circus job?" "That wasn't about clowns, that was about the tiger nearly eating your leg." "It was both!" Bill insisted. "The clown thing came up during our debrief at the hospital." Molly sighed. "Fine. No clowns. What about birthday mascots? We could say we're with the entertainment company." "Dressed as what?" "I don't know. What do kids like? Dinosaurs? Superheroes?" "I'm not wearing a dinosaur costume in July. I'll die of heatstroke before we even get to the presents." "So superhero then." "Do you even own a superhero costume?" "No, but my nephew does. I could borrow his Spider-Man outfit." Bill raised an eyebrow. "You're going to squeeze into a child's costume?" "It's very stretchy material, Bill. Very accommodating." "This is getting ridiculous. Why don't we just come back tonight?" 
"Because by tonight all those presents will be opened and gone," Molly said. "We need them wrapped and untracked. That's the whole point." "Right, right. The eBay angle." "Exactly. Unopened toys, premium prices, no serial numbers to trace." Bill peered through the fence again. "Hold on. Is that a pony?" "Where?" "By the shed. They have an actual pony at this party." Molly looked. "That's definitely a pony." "We could steal the pony instead." "Bill, we're not stealing a pony." "Why not? Pony's worth more than those presents combined." "Where exactly would we fence a stolen pony?" Bill paused. "I hadn't thought that through." "Can we please focus? Cake time is in fifteen minutes." "Fine. But I'm going on record saying the pony was a good idea."
created_at
"2026-02-08T15:58:57.713875+00:00"
Sunday, February 8th, 2026 at 3:58:57 PM GMT+00:00
model
"anthropic/claude-4.5-sonnet-20250929"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1644
moderation_latency
153
generation_time
18163
tokens_prompt
61
tokens_completion
645
native_tokens_prompt
75
native_tokens_completion
768
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.011745
router
(null)
provider_responses
0
id
"f60f547f-b7a8-4a17-b298-cee2994131eb"
status
200
is_byok
false
latency
1644
endpoint_id
"d8bee848-f020-454a-b2be-4a7ce868f99f"
provider_name
"Amazon Bedrock"
model_permaslug
"anthropic/claude-4.5-sonnet-20250929"
api_type
"completions"
id
"gen-1770566318-3E1uYwN4MuBL7rGAmFnX"
upstream_id
"f60f547f-b7a8-4a17-b298-cee2994131eb"
total_cost
0.011745
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Amazon Bedrock"
Evaluation details
Result | Evaluator | Details | Meta Data
0.0000% | Matches word count | n/a | neededClean: false; words: 460
0.0000% | Dialogue to Total Word Ratio | Ratio: 88.26%, Deviation: 18.26% | neededClean: false; wordsTotal: 460; wordsDialogue: 406
0.0000%
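The reported ratio and deviation follow directly from the word counts above: 406 dialogue words out of 460 total is 88.26%, which is 18.26 percentage points from the 70% target in the scenario. The sketch below reproduces that arithmetic and shows one plausible way to count dialogue words. The function names, whitespace tokenization, and the rule that "dialogue" means text between straight double quotes are assumptions for illustration; the source does not describe how NC Bench actually tokenizes or scores.

```python
import re


def ratio_and_deviation(words_dialogue: int, words_total: int,
                        target_ratio: float = 0.70) -> tuple[float, float]:
    """Return (ratio %, deviation from target in percentage points), rounded to 2 places."""
    ratio = words_dialogue / words_total if words_total else 0.0
    return round(ratio * 100, 2), round(abs(ratio - target_ratio) * 100, 2)


def count_dialogue_words(text: str) -> tuple[int, int]:
    """Count (dialogue words, total words).

    Assumption: words are whitespace-separated tokens, and dialogue is any
    text enclosed in straight double quotes. The real evaluator may differ.
    """
    total = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(span.split()) for span in quoted_spans)
    return dialogue, total


# Reproduce the figures reported for this run.
reported = ratio_and_deviation(406, 460)

# Small usage example on a short passage.
sample = 'Molly sighed. "Fine. No clowns." Bill nodded.'
sample_counts = count_dialogue_words(sample)
```

Under these assumptions, `reported` evaluates to `(88.26, 18.26)`, matching the Details column, and `sample_counts` is `(3, 7)`.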