NC Bench
Model
Claude Haiku 4.5
Test
Dialogue tags
Scenario
Write 500 words with 70% dialogue
Run ID
2026-02-08-01KGYYR9VDDMJXCRBE31B130HA
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
# The Birthday Caper

Molly tapped her pen against the clipboard, studying the layout of the gymnasium. Bill slouched in the folding chair beside her, nibbling on a breadstick he'd stolen from the refreshment table.

"So here's what I'm thinking," Molly said. "We go in through the east entrance during the piñata chaos."

"The piñata?" Bill looked up, crumbs on his chin. "That thing is barely hanging on a string. It's not chaos, it's just sad."

"Exactly. Everyone will be distracted watching little Timmy swing a bat blindly around the room. That's when we move."

Bill leaned back and crossed his arms. "Move where? The presents are stacked right next to the balloon arch in front of the entire crowd. It's practically a spotlight situation."

"Not if we create a distraction," Molly said, tapping her pen more aggressively. "What if someone 'accidentally' knocks over the punch bowl?"

"You want me to knock over the punch bowl?" Bill's eyebrows shot up. "That's your master plan? I thought we were professionals."

"We're professionals who need to steal twenty-three wrapped boxes from a children's birthday party in suburban Ohio. Let's not get too lofty with our expectations here, Bill."

Bill stood up and paced, gesturing wildly. "Okay, okay. What if we're not obvious about it? What if one of us mingles? Someone goes in, acts natural, becomes part of the festivities?"

"You want to join the party?" Molly stared at him.

"Why not? I could help with games. I could be charming. Kids like me."

"Bill, you told a six-year-old last week that his drawing looked like a crime scene photograph."

"He asked for honesty!" Bill threw his hands up. "Besides, that's exactly why it would work. No one would suspect the weird guy who's weird around kids."

Molly sighed deeply. "That might actually be your dumbest idea yet, and you once suggested we rob a bank using mime tactics."

"The mime thing had potential," Bill muttered, sitting back down.

"It absolutely did not."

Molly stood and walked to the gymnasium doorway, peering inside. She could see children running in circles, a clown attempting balloon animals, and in the corner, an unattended gift table. "Listen, here's what we do. You distract the clown. Get him to make you something ridiculous. Something that requires his full attention and several balloons."

"A sword?"

"Sure, a sword. Meanwhile, I create a medical emergency. Nothing serious. I pretend to faint."

Bill's face lit up. "Brilliant. Parents will swarm you. Everyone will panic. Security will converge on you."

"Exactly. And you sweep in, grab the presents, and get them to the van."

"This might actually work," Bill said, standing and offering his hand to shake.

Molly looked at his extended palm, then back at his face. "We're really doing this."

"We're really doing this," Bill confirmed. They shook hands firmly, both grinning at the absurdity of their situation.
created_at
"2026-02-08T15:43:41.609691+00:00"
Sunday, February 8th, 2026 at 3:43:41 PM GMT+00:00
model
"anthropic/claude-4.5-haiku-20251001"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
472
moderation_latency
212
generation_time
8193
tokens_prompt
61
tokens_completion
644
native_tokens_prompt
75
native_tokens_completion
738
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.003765
router
(null)
provider_responses
0
id
"77ccd863-4c96-4cdb-bb96-d3ec69f85032"
status
200
is_byok
false
latency
472
endpoint_id
"d000261c-5aeb-47d1-ac87-ad64dbe1cdb6"
provider_name
"Amazon Bedrock"
model_permaslug
"anthropic/claude-4.5-haiku-20251001"
api_type
"completions"
id
"gen-1770565412-AuPHISSb0BwVygvwVw6V"
upstream_id
"77ccd863-4c96-4cdb-bb96-d3ec69f85032"
total_cost
0.003765
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Amazon Bedrock"
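As a rough aid for reading the timing fields, the sketch below derives a generation throughput from native_tokens_completion and generation_time. It assumes generation_time is reported in milliseconds, which the page does not state; the function name is hypothetical and is not part of any API shown here.

```python
def tokens_per_second(completion_tokens: int, generation_time_ms: float) -> float:
    """Return completion tokens generated per second.

    Assumption: generation_time_ms is in milliseconds (not confirmed by the page).
    """
    if generation_time_ms <= 0:
        raise ValueError("generation_time_ms must be positive")
    return completion_tokens / (generation_time_ms / 1000.0)


# Values copied from the metadata for this run.
throughput = tokens_per_second(completion_tokens=738, generation_time_ms=8193)
print(f"{throughput:.1f} tokens/s")
```

If the field turns out to use a different unit, only the division by 1000 needs to change.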
Evaluation details
Result | Evaluator | Details | Meta Data
43.3783% | Matches word count | n/a | neededClean: false; words: 483
83.7375% | Dialogue to Total Word Ratio | Ratio: 68.12%, Deviation: 1.88% | neededClean: false; wordsTotal: 483; wordsDialogue: 329
63.5579%
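The reported ratio and deviation can be reproduced from the wordsDialogue and wordsTotal values with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, the 70% target and 500-word target come from the scenario text, and the formulas that turn these quantities into the per-evaluator scores (43.3783% and 83.7375%) are not shown on the page, so they are not reproduced here. The final figure, 63.5579%, happens to equal the plain mean of the two evaluator scores; treating it as an average is an observation, not a documented rule.

```python
def dialogue_ratio_pct(words_dialogue: int, words_total: int) -> float:
    """Percentage of all words that fall inside dialogue."""
    return 100.0 * words_dialogue / words_total


def deviation_pct(value_pct: float, target_pct: float) -> float:
    """Absolute distance from the target, in percentage points."""
    return abs(target_pct - value_pct)


def mean(values: list[float]) -> float:
    """Plain arithmetic mean."""
    return sum(values) / len(values)


# Values copied from the evaluation details for this run.
ratio = dialogue_ratio_pct(words_dialogue=329, words_total=483)
deviation = deviation_pct(ratio, target_pct=70.0)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")  # Ratio: 68.12%, Deviation: 1.88%

# Shortfall against the requested 500 words, as a share of the target.
word_shortfall = 100.0 * abs(500 - 483) / 500
print(f"Word count shortfall: {word_shortfall:.1f}%")  # Word count shortfall: 3.4%

# Assumption: the last figure on the page is the mean of the two evaluator scores.
print(f"{mean([43.3783, 83.7375]):.4f}%")  # 63.5579%
```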