NC Bench
Model
Claude Haiku 4.5
Test
Dialogue tags
Scenario
Write 500 words with 30% dialogue
Run ID
2026-02-08-01KGYZ27E3FFD9186GSVZQ3A3G
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
# The Birthday Job

Molly spread the blueprints across the kitchen table, her reading glasses sliding down her nose as she studied the layout of the community center. Bill entered carrying two cups of coffee, setting one beside her with the precision of a man who had learned to be useful after forty years of being anything but.

"So what are we looking at?" Bill asked, pulling up a chair.

"Timmy Henderson's eighth birthday party," Molly said, tapping the gymnasium entrance. "Sixty-three presents stacked by the east wall, professionally wrapped, total estimated value around fifteen hundred dollars."

Bill whistled low. "That's a lot of toy soldiers and action figures."

"Not for us, it isn't," Molly replied. She had that look in her eyes, the one that meant she'd already calculated twelve different escape routes and three contingency plans. "We go in through the kitchen entrance at 2:47 PM, right when they're doing the piñata."

"How do you know they do piñata at 2:47?"

"I've been watching this family for three weeks, Bill. They have a schedule tighter than Fort Knox." She pulled out her phone, showing him a spreadsheet color-coded by activity. "2:45, cake time. 2:50, piñata chaos begins. That's our window."

Bill nodded appreciatively at her meticulous planning, then frowned. "What about the parents?"

"All distracted. Thirty of them minimum, all hovering around twenty screaming children wielding blindfolds and wooden sticks. It's basically a war zone disguised as entertainment." Molly stood and walked to the window. "Security cameras stop recording on Thursdays for maintenance. The present table has a blind spot behind the cake decorations. We come in carrying empty gift bags, leave with full ones."

"That's insane," Bill said. "That's absolutely insane."

"Is it?" Molly turned to face him. "Nobody suspects two silver-haired people at a children's party. We could be grandparents. We could be aunts and uncles."

"I haven't been an uncle since 1994."

"Exactly. You've got the look." She returned to the table. "Security guard is new, inexperienced, probably more interested in the clown than watching anything. The parents will assume we're friends of the family because we'll act like it. Confidence is ninety percent of any job."

Bill studied the blueprints again, his weathered fingers tracing possible paths. Twenty years of pulling jobs together had taught him to trust Molly's instincts, even when those instincts led to stealing from eight-year-olds. "What about getaway?"

"Van parked around back. We're gone by 3:15, presents in a storage unit by 3:45. We'll have them fenced by Friday morning." She sat back down. "So what do you say? You in?"

Bill picked up his coffee and took a long sip, considering the most audacious theft of his criminal career. Stealing from a child's birthday party was probably where he'd finally cross some moral threshold, but the money was good and his daughter needed that root canal.

"Yeah," he said finally. "I'm in."
created_at
"2026-02-08T15:49:06.674966+00:00"
Sunday, February 8th, 2026 at 3:49:06 PM GMT+00:00
model
"anthropic/claude-4.5-haiku-20251001"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
590
moderation_latency
205
generation_time
8012
tokens_prompt
61
tokens_completion
653
native_tokens_prompt
75
native_tokens_completion
735
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.00375
router
(null)
provider_responses
0
id
"32379f9a-bc8c-4a4d-bccc-7937e9db9d0d"
status
200
is_byok
false
latency
590
endpoint_id
"d000261c-5aeb-47d1-ac87-ad64dbe1cdb6"
provider_name
"Amazon Bedrock"
model_permaslug
"anthropic/claude-4.5-haiku-20251001"
api_type
"completions"
id
"gen-1770565737-b7iMKrfz4Fy337vzrmX6"
upstream_id
"32379f9a-bc8c-4a4d-bccc-7937e9db9d0d"
total_cost
0.00375
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Amazon Bedrock"
Evaluation details
Result | Evaluator | Details | Meta Data
81.2727% | Matches word count | n/a | neededClean: false; words: 488
0.0000% | Dialogue to Total Word Ratio | Ratio: 53.69%, Deviation: 23.69% | neededClean: false; wordsTotal: 488; wordsDialogue: 262
40.6364%
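The dialogue figures reported above can be reproduced from the two word counts. The sketch below is illustrative only: the function names are hypothetical, the 30% target comes from the scenario, and the rule for counting dialogue words (whitespace-separated words inside straight double quotes) is an assumption, since the page does not say how the evaluator tokenizes text or how it turns a deviation into a score.

```python
import re


def ratio_and_deviation(words_dialogue: int, words_total: int,
                        target_pct: float = 30.0) -> tuple[float, float]:
    """Return (dialogue share in percent, absolute deviation from the target)."""
    ratio = 100.0 * words_dialogue / words_total
    return ratio, abs(ratio - target_pct)


def count_words(text: str) -> tuple[int, int]:
    """Return (dialogue words, total words) for a piece of prose.

    Assumption: dialogue is whatever sits between straight double quotes,
    and a word is any whitespace-separated token. The real evaluator may
    count differently.
    """
    words_total = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    words_dialogue = sum(len(span.split()) for span in quoted_spans)
    return words_dialogue, words_total


# Counts taken from the evaluation meta data on this page.
ratio, deviation = ratio_and_deviation(262, 488)
print(round(ratio, 2), round(deviation, 2))  # 53.69 23.69
```

With the reported counts, 262 of 488 words gives a ratio of 53.69%, which is 23.69 points above the 30% target. The final 40.6364% is consistent with the unweighted mean of the two results (81.2727% and 0.0000%), although the page does not state how the results are aggregated.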