NC Bench
Model: Claude Haiku 4.5
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Run ID: 2026-02-08-01KGYYMFTPYYYV73333JA0G766
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
# The Birthday Bandit Blueprint
Molly spread the hastily drawn floor plan across the diner table, knocking over Bill's coffee cup in the process. Neither of them seemed to notice the brown liquid creeping toward the napkin dispenser.
"This is insane," Bill muttered, adjusting his glasses to examine the blueprint more closely. "We're talking about a children's birthday party. At Chuck E. Cheese's."
"Exactly," Molly said, her eyes gleaming with the kind of intensity usually reserved for museum heists. "Nobody suspects birthday party bandits. We'll be invisible."
"We'll be arrested," Bill countered, though he was already sketching notes in the margins. "What's even in there? Video game tickets? Cardboard pizza boxes? A singing mechanical rat that stopped working in 2003?"
Molly tapped the blueprint with her index finger, leaving a small grease stain from the french fries she'd been eating. "Twenty-three wrapped presents, Bill. Premium ones. I scoped it out last weekend. Parents competing to buy the most expensive toys for little Timmy Henderson's seventh birthday party. We're talking remote control helicopters, limited edition action figures, probably a Nintendo Switch or two."
"How do you even know all this?" Bill leaned back in his chair, the metal legs scraping against the linoleum floor in a sound that made everyone in the diner wince.
"I told you I had a source. Timmy's older sister goes to my yoga class. She mentioned it." Molly smiled smugly. "I even found out the party runs from two to four on Saturday afternoon."
Bill studied the floor plan, which appeared to have been drawn on what looked like a napkin from another restaurant, then scanned and enlarged. The print quality was questionable at best. "So how exactly do we get past the security? That place has more cameras than a Las Vegas casino."
"We don't get past them," Molly said, standing up and beginning to pace. "We become invisible to them. You go in as a parent. Sit in the party room. Be inconspicuous."
"I look nothing like a parent," Bill protested. "I'm thirty-eight years old with no children and the fashion sense of a garage sale."
"Exactly," Molly said. "That makes you perfect. The forgettable type. While you're doing that, I'll be in the arcade area, working the gift table. I'll have the presents relocated to a duffel bag by two forty-five."
Bill rubbed his temples. "This is the worst plan we've ever attempted, and we once tried to steal a zamboni from a roller skating rink."
"That was actually going great until the manager came back early," Molly reminded him.
"So what could possibly go wrong here?"
"Nothing," Molly said confidently. "It's foolproof."
Bill looked at the napkin blueprint again, then at Molly's determined expression, and sighed deeply. He'd learned long ago that once Molly had set her mind to something, especially something absolutely ridiculous, resistance was entirely futile. "What time should I wear the fake mustache?"
created_at: "2026-02-08T15:41:36.218262+00:00" (Sunday, February 8th, 2026 at 3:41:36 PM GMT+00:00)
model: "anthropic/claude-4.5-haiku-20251001"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 580
moderation_latency: 252
generation_time: 7464
tokens_prompt: 61
tokens_completion: 633
native_tokens_prompt: 75
native_tokens_completion: 715
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.00365
router: (null)
provider_responses:
  0:
    id: "2cacf79a-a361-4323-8002-8b397e8b7d3d"
    status: 200
    is_byok: false
    latency: 580
    endpoint_id: "d000261c-5aeb-47d1-ac87-ad64dbe1cdb6"
    provider_name: "Amazon Bedrock"
    model_permaslug: "anthropic/claude-4.5-haiku-20251001"
    api_type: "completions"
id: "gen-1770565287-jvuVPZxJmc54aIWoyO2B"
upstream_id: "2cacf79a-a361-4323-8002-8b397e8b7d3d"
total_cost: 0.00365
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
Evaluation details
Result | Evaluator | Details | Meta Data
75.1556% | Matches word count | n/a | neededClean: false; words: 487
0.0000% | Dialogue to Total Word Ratio | Ratio: 56.67%, Deviation: 26.67% | neededClean: false; wordsTotal: 487; wordsDialogue: 276
37.5778%
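The ratio and deviation reported for the dialogue evaluator can be reproduced from the word counts in its Meta Data. The sketch below is illustrative only and is not NC Bench's evaluator code. The helper names (count_words, count_dialogue_words, dialogue_ratio) are hypothetical, and the definitions they use (whitespace-separated tokens as words, text inside straight double quotes as dialogue) are assumptions, so counts computed this way on the response text may differ from the reported 487 and 276. Treating the final 37.5778% as an unweighted mean of the two evaluator results is likewise an inference from the numbers, not something the page states.

```python
import re

TARGET_RATIO = 0.30  # from the scenario: "Write 500 words with 30% dialogue"


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (assumed definition of a word)."""
    return len(text.split())


def count_dialogue_words(text: str) -> int:
    """Count words inside straight double quotes (assumed definition of dialogue)."""
    spans = re.findall(r'"([^"]*)"', text)
    return sum(count_words(span) for span in spans)


def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Share of dialogue words in the whole text; 0.0 for an empty text."""
    return words_dialogue / words_total if words_total else 0.0


# Counts taken from the Meta Data of the dialogue evaluator above.
ratio = dialogue_ratio(276, 487)
deviation = abs(ratio - TARGET_RATIO)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# Inference only: the mean of the two evaluator results matches the final figure.
overall = (75.1556 + 0.0) / 2
print(f"Mean of evaluator results: {overall:.4f}%")

# Small usage example for the counting helpers on a made-up sentence.
sample = 'Molly grinned. "We go at two," she said. "Bring the bag."'
print(count_dialogue_words(sample), "of", count_words(sample), "words are dialogue")
```

With the reported counts, 276 / 487 is about 56.67%, which is 26.67 percentage points above the 30% target, in line with the Details column.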