NC Bench
Model
Claude Haiku 4.5
Test
Dialogue tags
Scenario
Write 500 words with 50% dialogue
Run ID
2026-02-08-01KGYYYE8JD6C22PF7ZJ6VEFM1
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
# The Birthday Caper

Molly spread the floor plan across the kitchen table, her finger tracing the layout of Riverside Community Center. Bill leaned in with a steaming cup of coffee, studying the blueprint like it was the Mona Lisa.

"So here's the thing," Molly said, tapping the main hall. "The birthday kid, Tommy, opens presents at exactly three o'clock. We need to be in and out in forty-five minutes."

"Forty-five minutes?" Bill whistled low. "That's ambitious. Last time we talked about a gift heist, you said we needed at least two hours."

"That was for a wedding. Children operate on a different schedule. Shorter attention spans, faster cake consumption." Molly pulled out a stack of photos showing the party setup from their reconnaissance mission. "Look, the presents are stacked in that corner by the pirate decorations."

Bill adjusted his glasses, squinting at the images. "Why do six-year-olds need twelve presents? When I was a kid, I got one gift and a handful of candy corn."

"Inflation, inflation, inflation. Parents spend serious money now." Molly circled the emergency exit in red pen. "We enter through here at 2:55. I'll distract the parents with a 'water main emergency' announcement. You slip in, grab the biggest boxes first, and load them into the van."

"Wait, what do you mean you'll announce it?" Bill frowned. "Are you going undercover as a city official again? Because last time, they almost arrested you for impersonating a plumber."

"That was one time, and I had a very convincing mustache." Molly grinned. "This time, I'm the worried parent. I'll rush in screaming about a gas leak in the bathrooms. Chaos ensues. We slip out in the confusion."

Bill stroked his chin thoughtfully. "What about the gift tags? Don't we need to know which presents are the expensive ones? I don't want to steal a twelve-dollar action figure when there's a limited-edition gaming console somewhere."

"That's why I photocopied the gift registry." Molly produced another document with a flourish. "Nintendo Switch, a hoverboard, and some sort of robot dog. Those three alone are worth five hundred bucks."

"A hoverboard at six years old?" Bill shook his head in disbelief. "His parents are insane. What happened to building character through deprivation?"

Molly rolled up the floor plan. "Times have changed. Besides, we're not keeping them. Once we fence these at the discount warehouse on Maple Street, we'll have enough for our next big job."

Bill stood up, draining his coffee. "Alright, I'm in. But if anyone asks later, I want it on record that I objected to the gas leak story. It's too theatrical."

"Noted," Molly said, already checking her watch. "We've got three days to prepare. Go buy yourself a nondescript parent costume."

Bill grabbed his jacket and paused at the door. "This better work, Molly. I'm starting to think we picked the wrong career path."
created_at
"2026-02-08T15:47:04.870615+00:00"
Sunday, February 8th, 2026 at 3:47:04 PM GMT+00:00
model
"anthropic/claude-4.5-haiku-20251001"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1086
moderation_latency
2393
generation_time
7912
tokens_prompt
61
tokens_completion
642
native_tokens_prompt
75
native_tokens_completion
720
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.003675
router
(null)
provider_responses
0
id
"586eebea-0184-4a9b-a9fc-ad377b78ad6d"
status
200
is_byok
false
latency
1086
endpoint_id
"d000261c-5aeb-47d1-ac87-ad64dbe1cdb6"
provider_name
"Amazon Bedrock"
model_permaslug
"anthropic/claude-4.5-haiku-20251001"
api_type
"completions"
id
"gen-1770565613-c82tEyDhkJQsJejweP9H"
upstream_id
"586eebea-0184-4a9b-a9fc-ad377b78ad6d"
total_cost
0.003675
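The reported cost can be cross-checked against the native token counts. The sketch below assumes per-token list prices of 1 USD per million prompt tokens and 5 USD per million completion tokens; those prices are not stated on this page, and `estimate_cost` is a hypothetical helper, not part of any API shown here.

```python
# Assumed prices in USD per million tokens (not given in the run metadata).
PRICE_PROMPT_PER_MTOK = 1.00
PRICE_COMPLETION_PER_MTOK = 5.00


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate request cost in USD from native token counts."""
    total_micro_usd = (
        prompt_tokens * PRICE_PROMPT_PER_MTOK
        + completion_tokens * PRICE_COMPLETION_PER_MTOK
    )
    return total_micro_usd / 1_000_000


# native_tokens_prompt = 75, native_tokens_completion = 720
cost = estimate_cost(75, 720)
print(f"{cost:.6f}")  # 0.003675
```

Under those assumed prices the estimate agrees with the `usage` and `total_cost` values of 0.003675, which suggests billing is based on the native token counts (75 and 720) rather than the normalized ones (61 and 642).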
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Amazon Bedrock"
Evaluation details
Result | Evaluator | Details | Meta Data
51.9255% | Matches word count | n/a | neededClean: false, words: 484
0.0000% | Dialogue to Total Word Ratio | Ratio: 73.35%, Deviation: 23.35% | neededClean: false, wordsTotal: 484, wordsDialogue: 355
25.9628%
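The ratio and deviation reported by the dialogue evaluator follow directly from its word counts. The sketch below reproduces that arithmetic and adds a rough way to count dialogue words. The function names are hypothetical, the 50% target comes from the scenario, and the quote-matching rule (text between straight double quotes, split on whitespace) is an assumption; the benchmark's actual tokenization and scoring formula are not shown on this page.

```python
import re


def count_dialogue_words(text: str) -> tuple[int, int]:
    """Return (dialogue_words, total_words) for a scene.

    Assumption: dialogue is any text between straight double quotes,
    and words are whitespace-separated tokens.
    """
    total_words = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    return dialogue_words, total_words


def dialogue_stats(words_dialogue: int, words_total: int, target: float = 0.5):
    """Return (ratio, deviation) of dialogue words against the target ratio."""
    ratio = words_dialogue / words_total
    deviation = abs(ratio - target)
    return ratio, deviation


# Values taken from the evaluator's metadata above.
ratio, deviation = dialogue_stats(355, 484)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# Ratio: 73.35%, Deviation: 23.35%

# Small illustration of the counting rule on a made-up sentence.
sample = 'Bill nodded. "Alright, I\'m in." He left.'
print(count_dialogue_words(sample))  # (3, 7)
```

With 355 of 484 words inside quotes, the response is about 23 percentage points over the 50% target, which is consistent with the 0.0000% result for this evaluator.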