NC Bench
Model
Claude Sonnet 4.5
Test
Dialogue tags
Scenario
Write 500 words with 30% dialogue
Run ID
2026-02-08-01KGYYP3PF5EEGAPTQYRC3WPX3
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the blueprints across the hood of their idling van, her finger tracing the layout of the suburban backyard. The sound of children's laughter and off-key singing drifted over the fence, punctuated by the occasional shriek of sugar-fueled excitement.
Bill peered over her shoulder, squinting at the hand-drawn map. His partner had somehow obtained detailed schematics of the Henderson family's property, complete with annotations like "probable bouncy castle location" and "cake table - avoid at all costs."
"You're absolutely sure about this?" Bill adjusted his fake mustache, which kept sliding down his sweaty upper lip.
"When have I ever steered us wrong?"
"Last month you had us rob a storage unit full of beanie babies. The month before that, we drove six hours to steal a truck of what you promised was caviar but turned out to be fish food."
Molly waved dismissively. The party was hitting peak chaos. Through the fence slats, she could see a clown making balloon animals while parents clustered near the beverage table, their attention delightfully elsewhere. The present table sat unguarded in the corner, a glorious mountain of wrapped boxes and gift bags.
"This is different. That kid's birthday list went viral. Every tech company sent promotional gifts. We're talking tablets, gaming systems, maybe even a drone."
"He's turning seven."
"Rich seven."
Bill wiped his forehead with a handkerchief already soaked through. The polyester clown suit Molly had insisted he wear was like a portable sauna. She'd claimed it would help him blend in, conveniently forgetting that the Hendersons had already hired entertainment.
The plan was straightforward, which meant it would inevitably become complicated. Molly would create a distraction at the front of the house while Bill scaled the fence, grabbed as many presents as possible, and retreated before anyone noticed. Simple. Clean. Probably disastrous.
"What's my distraction again?" Molly asked, already moving toward the fence line.
"You're supposed to be a singing telegram. You have the costume in the van."
"Right, right." She didn't sound confident.
Bill watched her rummage through the back of the van, emerging with what appeared to be a gorilla suit missing one arm. She'd definitely grabbed the wrong costume from the rental place. Again.
"Molly, that's not—"
"It'll be fine!" She was already pulling it on over her clothes, hopping awkwardly as she wrestled with the leg portions. The missing arm flopped pathetically against her side.
A particularly loud cheer erupted from the party. The piñata had apparently made its appearance. Their window was closing.
Bill took a deep breath, his balloon-animal-covered shoulders rising and falling. Somewhere in the universe, there had to be a version of him with a legitimate career and a 401k. That Bill probably didn't spend his Saturdays dressed as a clown, preparing to rob children.
"On three?" Molly's voice was muffled inside the gorilla head.
"On three."
They moved toward the fence.
created_at: "2026-02-08T15:42:37.823663+00:00" (Sunday, February 8th, 2026 at 3:42:37 PM GMT+00:00)
model: "anthropic/claude-4.5-sonnet-20250929"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1378
moderation_latency: 153
generation_time: 16036
tokens_prompt: 61
tokens_completion: 625
native_tokens_prompt: 75
native_tokens_completion: 711
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.01089
router: (null)
provider_responses:
  0:
    id: "309072fb-1bfa-4413-8b08-7e034eb8aa92"
    status: 200
    is_byok: false
    latency: 1377
    endpoint_id: "d8bee848-f020-454a-b2be-4a7ce868f99f"
    provider_name: "Amazon Bedrock"
    model_permaslug: "anthropic/claude-4.5-sonnet-20250929"
    api_type: "completions"
id: "gen-1770565340-mVIM41WGPd07tvlELEAb"
upstream_id: "309072fb-1bfa-4413-8b08-7e034eb8aa92"
total_cost: 0.01089
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
Evaluation details
Result: 51.9255%
Evaluator: Matches word count
Details: n/a
Meta Data: neededClean: false, words: 484

Result: 23.1044%
Evaluator: Dialogue to Total Word Ratio
Details: Ratio: 24.59%, Deviation: 5.41%
Meta Data: neededClean: false, wordsTotal: 484, wordsDialogue: 119
37.5149%
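The ratio and deviation reported for the dialogue evaluator can be reproduced from the two word counts in its metadata. The sketch below is only an illustration, not NC Bench's actual evaluator: the function names are hypothetical, words are assumed to be whitespace-separated tokens, and dialogue is assumed to be any text between straight double quotes (so quoted non-speech text, such as the map annotations in the scene, would also count under this assumption). It does not attempt to reproduce the percentage scores, whose formula is not shown on this page.

```python
import re

TARGET_RATIO = 0.30  # the scenario asks for 30% dialogue


def dialogue_stats(text):
    """Return (total_words, dialogue_words) for a passage.

    Assumption: a word is a whitespace-separated token, and dialogue
    is whatever sits between a pair of straight double quotes.
    """
    total_words = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    return total_words, dialogue_words


def ratio_and_deviation(total_words, dialogue_words, target=TARGET_RATIO):
    """Return (ratio, absolute deviation from the target ratio)."""
    ratio = dialogue_words / total_words if total_words else 0.0
    return ratio, abs(target - ratio)


# Counts taken from the evaluator metadata: wordsTotal and wordsDialogue.
ratio, deviation = ratio_and_deviation(484, 119)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 24.59%, Deviation: 5.41%"
```

With the reported counts (119 dialogue words out of 484), the computed ratio and deviation agree with the evaluator's Details column. Running `dialogue_stats` on the raw response may give different counts than the benchmark if its tokenization or quote handling differs from the assumptions above.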