NC Bench
Model: Claude Opus 4.5
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Run ID: 2026-02-08-01KGZ071PZXFVD8DEQD2GH2RVQ
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The van smelled like stale cheese puffs and broken dreams. Molly spread the hand-drawn map across the dashboard, squinting at her own handwriting.
"So the target is a six-year-old's birthday party," she said, tapping the crude rectangle she'd labeled BOUNCE HOUSE. "Security looks minimal."
Bill adjusted his binoculars, peering through the windshield at the suburban house festooned with pink streamers. A balloon arch swayed menacingly in the breeze.
"I don't know about minimal," he replied. "I count at least fifteen kids in that backyard. Maybe twenty."
"Children aren't security, Bill. Children are chaos agents. They work in our favor."
"Last time you said that, I got bit."
Molly waved her hand dismissively. She had planned heists across three continents, and she refused to be intimidated by a gathering of first-graders hopped up on sugar.
"The presents are being stored in the garage," she explained. "I saw the mother carrying them in earlier. We're talking serious haul here. These are wealthy parents with wealthy friends."
"What kind of presents are we expecting exactly?"
"The good stuff. Electronics. Gift cards. Maybe some high-end toys we can flip online."
Bill lowered his binoculars and turned to face her with the expression of a man reconsidering his life choices.
"We're stealing from children, Molly. Actual children."
"We're redistributing wealth from tiny capitalists who haven't earned anything," she corrected. "Besides, that kid probably has too many toys already. We're teaching her about loss. It's character building."
"That's the worst justification I've ever heard."
"You said the same thing about the nursing home job, and that went fine."
"We got chased by a woman on a mobility scooter for six blocks!"
Molly shrugged and returned her attention to the map. The garage had a side door that faced away from the party. If they timed it right, they could slip in during the cake ceremony when all eyes would be on the birthday girl.
"Entry point is here," she said. "You'll create a distraction while I handle extraction."
"What kind of distraction works on children?"
"I brought supplies. There's a bag of water balloons in the back."
Bill's face cycled through several emotions before settling on reluctant acceptance. He had known Molly for twelve years, and arguing with her was like arguing with weather.
"Fine. But I want seventy percent this time."
"You'll get sixty and like it."
"Sixty-five or I walk."
Molly considered this. Outside, a child screamed with joy as the bounce house reached maximum chaos capacity. A parent was already opening wine.
"Sixty-two," she offered. "Final answer."
"Deal."
They shook hands with the solemnity of criminals about to do something profoundly stupid. Molly pulled two ski masks from the glove compartment and handed one to Bill.
"We move in ten minutes," she announced. "Right when they bring out the cake."
"I hate everything about this."
"That's the spirit."
created_at: "2026-02-08T16:09:22.113751+00:00" (Sunday, February 8th, 2026 at 4:09:22 PM GMT+00:00)
model: "anthropic/claude-4.5-opus-20251124"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1761
moderation_latency: 151
generation_time: 16788
tokens_prompt: 61
tokens_completion: 622
native_tokens_prompt: 75
native_tokens_completion: 726
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.018525
router: (null)
provider_responses: 0
id: "0a279703-621c-4f21-be6b-f4b7ab2f97b9"
status: 200
is_byok: false
latency: 1761
endpoint_id: "de32581c-2e11-4f54-8d29-e8a6e19d3038"
provider_name: "Amazon Bedrock"
model_permaslug: "anthropic/claude-4.5-opus-20251124"
api_type: "completions"
id: "gen-1770566944-F9f8DxsszAheArrGk5Z9"
upstream_id: "0a279703-621c-4f21-be6b-f4b7ab2f97b9"
total_cost: 0.018525
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Amazon Bedrock"
Evaluation details
Result | Evaluator | Details | Meta Data
35.0022% | Matches word count | n/a | neededClean: false; words: 482
98.0817% | Dialogue to Total Word Ratio | Ratio: 50.62%, Deviation: 0.62% | neededClean: false; wordsTotal: 482; wordsDialogue: 244
66.5419%
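The dialogue ratio reported in the evaluation details follows directly from the two word counts: 244 dialogue words out of 482 total is about 50.62%, a deviation of 0.62 percentage points from the 50% target. The sketch below shows one plausible way to measure such a ratio. It is not the benchmark's actual evaluator: it assumes dialogue is anything between straight double quotes and that words are whitespace-delimited, so its counts may differ from the reported ones. The function names are hypothetical.

```python
import re


def count_words(text: str) -> int:
    """Count whitespace-delimited words."""
    return len(text.split())


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue words, total words, ratio) for a passage.

    ASSUMPTION: dialogue is any span enclosed in straight double quotes.
    """
    spoken_spans = re.findall(r'"([^"]*)"', text)
    words_dialogue = sum(count_words(span) for span in spoken_spans)
    words_total = count_words(text)
    ratio = words_dialogue / words_total if words_total else 0.0
    return words_dialogue, words_total, ratio


# Small illustrative sample, not the full model response.
sample = '"Deal." They shook hands. "I hate this," Bill said.'
print(dialogue_ratio(sample))

# Arithmetic check against the figures reported above.
reported_ratio_pct = 244 / 482 * 100
deviation_pct = abs(reported_ratio_pct - 50)
mean_score = (35.0022 + 98.0817) / 2
print(f"ratio: {reported_ratio_pct:.2f}%, deviation: {deviation_pct:.2f}%")
print(f"mean of the two evaluator scores: {mean_score:.5f}%")
```

The mean of the two evaluator results comes to about 66.54%, which is consistent with the final figure on the page being an average of the two, though the page does not say how that figure is derived.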