NC Bench
Model
Claude Haiku 4.5
Test
Dialogue tags
Scenario
Write 500 words with 70% dialogue
Run ID
2026-02-08-01KGYWY4Y8E1RR9TB8QZA2RPGK
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
AI
# The Birthday Job Molly spread out the blueprints across the kitchen table while Bill made sandwiches. The Anderson residence sat three blocks from the interstate, perfect for a quick getaway. "So we're really doing this?" Bill asked, handing her a turkey and swiss. "The intel is solid. Thirty presents minimum, all wrapped and stacked in the garage. We hit it Saturday during the party." "While the kids are there?" "Exactly. Maximum chaos equals maximum opportunity. They won't expect a professional operation during a six-year-old's birthday party." Bill whistled low. "This has to be our most ridiculous job yet." "Remember the bank in Tulsa?" Molly smirked. "We stole a inflatable raft by accident." "That was YOUR fault. You said 'roof access,' I thought you meant—never mind." Bill pointed at the blueprint. "How do we get in without being seen?" "The side entrance near the driveway. Mrs. Anderson's hiring a bounce house company Saturday morning. We go in their truck, dressed as handlers." "Won't they notice we're not actually with the company?" "The owner's brother is handling that for us. Fifty bucks and a promise we won't get caught. He seems optimistic." Bill laughed. "And the exit?" "The gifts get loaded into our van during the chaos. By the time Mrs. Anderson does inventory, we're three states away selling Lego sets on the dark web." "You really thought this through," Bill said, studying the plans. "But what if someone recognizes us? What if there are cameras?" "Their system is five years old, passwords are their dog's name. Junior. I checked their Facebook." "Of course you did." Bill bit into his sandwich. "What's our escape route if something goes wrong?" "Through the backyard, over the fence, and into the woods. I scoped it yesterday. Fifteen-minute walk to the rendezvous point." "What could possibly go wrong with a plan this solid?" Bill asked sarcastically. "Everything," Molly admitted. "But that's why we get paid. 
I've stolen from fortune five hundreds, military bases, and a duchess with actual diamonds. A birthday party is nothing." "A duchess?" Bill raised his eyebrows. "You never mentioned that." "You never asked." Molly took another bite. "Look, if you're not comfortable—" "No, no. I'm in. I just want to know: what are we doing with thirty children's toys?" "Reselling. There's a market. New releases, limited editions. Some of those Lego sets go for triple the retail price." Bill shook his head, laughing. "We're stealing from children." "Stealing FROM children would be mugging them. We're stealing AT a children's party. There's a moral distinction." "There absolutely is not." "Close enough for government work," Molly said, gathering the blueprints. "Saturday. Six p.m. Wear dark clothing and comfortable shoes." Bill raised his sandwich in a mock toast. "Here's to the most shameful heist of my career." "Make it classy, and no one will question it," Molly replied.
created_at
"2026-02-08T15:11:57.116319+00:00"
Sunday, February 8th, 2026 at 3:11:57 PM GMT+00:00
model
"anthropic/claude-4.5-haiku-20251001"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
519
moderation_latency
(null)
generation_time
9262
tokens_prompt
61
tokens_completion
647
native_tokens_prompt
75
native_tokens_completion
740
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.003775
router
(null)
provider_responses
0
id
"msg_0114pSEYSzymQv7wxFNLDCTK"
status
200
is_byok
false
latency
519
endpoint_id
"41d2915a-92e6-4993-b537-210b4e10cba8"
provider_name
"Anthropic"
model_permaslug
"anthropic/claude-4.5-haiku-20251001"
api_type
"completions"
id
"gen-1770563507-Kllt3X8G673Z2kg4XiVv"
upstream_id
"msg_0114pSEYSzymQv7wxFNLDCTK"
total_cost
0.003775
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Anthropic"
Evaluation details
Result | Evaluator | Details | Meta Data
9.6081% | Matches word count | n/a | neededClean: false; words: 478
0.4794% | Dialogue to Total Word Ratio | Ratio: 80.33%, Deviation: 10.33% | neededClean: false; wordsTotal: 478; wordsDialogue: 384
5.0438%
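The figures in the Dialogue to Total Word Ratio row can be checked with a little arithmetic. The sketch below is not NC Bench's evaluator: the function names, the rule "dialogue is whatever sits between straight double quotes", and whitespace word splitting are all assumptions made for illustration. Only the counts (384 and 478) and the 70% target come from this page, so the sketch is not guaranteed to reproduce the benchmark's own counts on the full response.

```python
import re


def ratio_and_deviation(words_dialogue: int, words_total: int,
                        target_ratio: float = 0.70) -> tuple[float, float]:
    """Return (dialogue ratio, absolute deviation from the target ratio)."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target_ratio)


def dialogue_stats(text: str) -> tuple[int, int]:
    """Hypothetical counter: returns (wordsDialogue, wordsTotal).

    Assumes dialogue is delimited by straight double quotes and that
    words are whitespace-separated tokens; the real evaluator may differ.
    """
    words_total = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    words_dialogue = sum(len(span.split()) for span in quoted_spans)
    return words_dialogue, words_total


# Counts reported in the Meta Data column above.
ratio, deviation = ratio_and_deviation(384, 478)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 80.33%, Deviation: 10.33%"
```

On a short sample such as `"Hello there," she said.`, `dialogue_stats` returns `(2, 4)`, a 50% ratio. Separately, the final 5.0438% figure is numerically consistent with the arithmetic mean of the two evaluator results, (9.6081 + 0.4794) / 2 = 5.04375, although the page does not state how it is derived.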