NC Bench
Model
Claude Opus 4.5
Test
Dialogue tags
Scenario
Write 500 words with 50% dialogue
Run ID
2026-02-08-01KGYYJ8Z8CGFY75T5RH6B61HA
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The van smelled like stale coffee and broken dreams, which Molly thought was appropriate given their current career trajectory. Bill unfolded the hand-drawn map across the dashboard, squinting at his own handwriting. The crayon had seemed like a good idea at the time, but now the purple lines were bleeding into the green ones. "Okay, so the target is located at 445 Maple Street," he said, tapping the paper with authority. "The Patterson residence. Saturday at two pm, we make our move." Molly stared at him. "You used crayon." "It was all I had. Focus, Molly. This is the big one." She grabbed the map and turned it right-side up. A small juice box stain covered what appeared to be the escape route. This was not filling her with confidence, but then again, confidence had abandoned her somewhere around their third failed job. "Walk me through it again. And please tell me you've accounted for the security this time." Bill grinned, showing the gap where his front tooth used to be, a casualty of the infamous hamster wheel incident. "Security is minimal. One mother, probably exhausted. One father on grill duty. And approximately fifteen six-year-olds hopped up on sugar." "The children are the security, Bill. You understand that, right? They see everything. They have no filter. One kid spots us and suddenly we've got a tiny army screaming stranger danger." He waved dismissively. "That's why we go in disguise. I've secured two costumes from a very legitimate source." Molly watched him reach into the back seat and produce a violently pink unicorn outfit and what appeared to be a deflated dinosaur costume covered in suspicious stains. "I'm not wearing either of those." "The unicorn is actually quite breathable. I tested it." "Why were you testing the unicorn costume, Bill?" "That's not important right now. What's important is the presents. According to my reconnaissance, little Timmy Patterson is turning seven, and his parents are loaded. 
We're talking high-end toys, gift cards, maybe even cash from the grandparents." She rubbed her temples, feeling a headache forming. This was supposed to be a temporary gig, a stepping stone to bigger and better heists. Yet here she was, plotting to rob a child's birthday party with a man who did reconnaissance on second graders. "The extraction point is the back gate," Bill continued, oblivious to her existential crisis. "We load the presents into these bags, slip out during the chaos of the piñata, and disappear before anyone notices." "And if someone notices?" "Then we improvise. We're professionals." Molly looked at the crayon map, the stained dinosaur costume, and her partner's enthusiastic face. Professional was not the word she would have chosen. Desperate, perhaps. Ridiculous, certainly. But they were in too deep now, and the rent was due next week. She grabbed the unicorn costume and sighed heavily. "Fine. But I want the horn facing forward this time."
created_at
"2026-02-08T15:40:33.984146+00:00"
Sunday, February 8th, 2026 at 3:40:33 PM GMT+00:00
model
"anthropic/claude-4.5-opus-20251124"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1490
moderation_latency
149
generation_time
17943
tokens_prompt
61
tokens_completion
617
native_tokens_prompt
75
native_tokens_completion
702
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.017925
router
(null)
provider_responses
0
id
"7c85183d-3ece-4c12-9dc5-9ead0a3eae84"
status
200
is_byok
false
latency
1490
endpoint_id
"de32581c-2e11-4f54-8d29-e8a6e19d3038"
provider_name
"Amazon Bedrock"
model_permaslug
"anthropic/claude-4.5-opus-20251124"
api_type
"completions"
id
"gen-1770565215-D5CWWvksNuUMJ1VWZnII"
upstream_id
"7c85183d-3ece-4c12-9dc5-9ead0a3eae84"
total_cost
0.017925
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Amazon Bedrock"
Evaluation details
Result | Evaluator | Details | Meta Data
81.2727% | Matches word count | n/a | neededClean: false, words: 488
54.5106% | Dialogue to Total Word Ratio | Ratio: 46.52%, Deviation: 3.48% | neededClean: false, wordsTotal: 488, wordsDialogue: 227
67.8916%
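The "Ratio: 46.52%, Deviation: 3.48%" figures reported by the Dialogue to Total Word Ratio evaluator can be reproduced from the recorded counts (wordsDialogue 227, wordsTotal 488) and the 50% target in the scenario. The sketch below is an illustration only, not NC Bench's evaluator: the function names are hypothetical, and the counting rule (whitespace-separated words, dialogue defined as text inside straight double quotes) is an assumption that may differ from the benchmark's own tokenization. How the deviation is converted into the 54.5106% score is not stated on this page, so it is not modeled here.

```python
import re


def ratio_from_counts(words_dialogue: int, words_total: int, target: float = 0.5):
    """Return (ratio, deviation from target) for already-counted words."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


def dialogue_ratio(text: str, target: float = 0.5) -> dict:
    """Count words in a scene and the share that sits inside double quotes.

    Assumption: a "word" is any whitespace-separated token, and dialogue is
    whatever appears between a pair of straight double quotes.
    """
    words_total = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    words_dialogue = sum(len(span.split()) for span in quoted_spans)
    if words_total == 0:
        ratio, deviation = 0.0, target
    else:
        ratio, deviation = ratio_from_counts(words_dialogue, words_total, target)
    return {
        "wordsTotal": words_total,
        "wordsDialogue": words_dialogue,
        "ratio": ratio,
        "deviation": deviation,
    }


# Reproduce the reported figures from the counts recorded for this run.
ratio, deviation = ratio_from_counts(227, 488)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 46.52%, Deviation: 3.48%"
```

Calling dialogue_ratio on the raw response text gives a way to check a scene before submission, though its counts will only match the evaluator's if the evaluator uses the same word and quote rules.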