NC Bench
Run: 2026-05-31-01KSYXBYVMVJ6V2RH2W1EDEBY2
Run Details
Model:
Claude Opus 4.8 (Reasoning, High)
Test:
Dialogue tags
Scenario:
Write 500 words with 70% dialogue
Duration:
20.3s
Cost:
$0.028890
Provider:
Anthropic
Evaluation summary
Matches word count:
99.92%
Dialogue to Total Word Ratio:
2.08%
Overall:
51.0004%
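The overall figure appears consistent with a plain average of the two evaluator scores. The sketch below checks that arithmetic; the averaging rule is an assumption inferred from the numbers on this page, not a documented NC Bench formula, and the small gap between 51.00 and the reported 51.0004 suggests the displayed scores are rounded.

```python
# Evaluator scores as displayed in the summary (percent, rounded by the page).
scores = {
    "Matches word count": 99.92,
    "Dialogue to Total Word Ratio": 2.08,
}

# Assumption: the overall score is the unweighted mean of the evaluator scores.
overall = sum(scores.values()) / len(scores)

print(f"Overall (mean of displayed scores): {overall:.2f}%")
```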
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the blueprints across the kitchen table, smoothing out the wrinkles with the reverence of a surgeon. "Okay, here's the layout," she said. "Bouncy castle here, snack table here, and the presents—" she stabbed the paper, "—right here, in the den." Bill leaned over her shoulder, chewing a toothpick. "Where'd you get the floor plan of a six-year-old's birthday party?" "I'm a professional, Bill." "You stole it from Zillow, didn't you?" "I stole it from Zillow." He nodded approvingly. "Solid tradecraft. So what's the play?" "We go in disguised as entertainment," Molly said. "I'll be the magician. You'll be the clown." Bill straightened up. "Absolutely not." "Bill." "I have a fear of clowns, Molly. You know this. It's deeply personal." "You *are* the clown. You can't be afraid of yourself." "That's exactly the kind of thing a clown would say before it eats you." Molly pinched the bridge of her nose. "Fine. I'll be the clown. You be the magician." "I don't know any magic." "You don't need to know any magic, you need to distract a room full of children long enough for me to load forty presents into a duffel bag." Bill considered this. "I could pull a quarter out of someone's ear." "Perfect." "What if I run out of ears?" "Bill, there will be twenty children. That's forty ears. That is a *surplus* of ears." He seemed satisfied. "And the getaway?" "Ice cream truck. Parked out front, engine running. Nobody questions an ice cream truck at a birthday party. It's camouflage." "That's actually brilliant." "I know." "One problem," Bill said. "The kids are gonna want ice cream." Molly froze. "What?" "It's an ice cream truck. At a kid's party. They're gonna swarm it like little sugar-crazed piranhas. We'll be selling Drumsticks while the real entertainment's happening inside." "...That's a good point." "I have those occasionally." "Okay, no ice cream truck. We take my Corolla." Bill scoffed. "Forty presents in a Corolla? You can barely fit your *ego* in that car." 
"My ego rides in the trunk, thank you." "I'm just saying, we need cargo space." "We'll make two trips." "Two trips!" Bill threw his hands up. "Molly, this is a heist, not an IKEA run. You don't make *two trips*." "Then what do you suggest, magic man?" Bill paused dramatically, then reached behind Molly's ear and produced a single car key. "I rented a van," he said, beaming. "Pulled it right out of your ear." Molly stared at him. "Where did you actually get that key?" "From my pocket. The ear thing was for flair." "It was good flair." "Thank you. I've been practicing." She rolled up the blueprints. "Alright. Tomorrow, two o'clock. Princess theme. We slip in, dazzle the children, and walk out rich." "And if it goes wrong?" Molly slung the bag over her shoulder. "Then we run very fast, in oversized shoes, away from a parent named Karen." Bill shuddered. "The real monster," he whispered, "was never the clown."
created_at
"2026-05-31T11:40:36.096Z"
model
"anthropic/claude-4.8-opus-20260528"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1263
moderation_latency
(null)
generation_time
20291
tokens_prompt
77
tokens_completion
810
native_tokens_prompt
93
native_tokens_completion
1137
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"end_turn"
service_tier
"default"
usage
0.02889
router
(null)
provider_responses
0
endpoint_id
"b0d3beaa-5ff7-4666-a749-b1bb2b14c916"
id
"msg_01BwfLPaykznNJmD77x6BjFj"
is_byok
false
latency
1263
model_permaslug
"anthropic/claude-4.8-opus-20260528"
provider_name
"Anthropic"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.16.0; linux; x64))"
http_referer
(null)
request_id
"req-1780227636-gdNR2d6O0PQAymkTsrsg"
session_id
(null)
preset_id
(null)
api_type
"completions"
id
"gen-1780227636-aXLm4TiWWrxVoAEXu66A"
upstream_id
"msg_01BwfLPaykznNJmD77x6BjFj"
total_cost
0.02889
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Anthropic"
response_cache_source_id
(null)
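The metadata above is enough for a rough sanity check of throughput and cost. In the sketch below, the token counts, generation time, and total cost are taken from this run; the per-million-token rates are assumptions (they are not shown on this page) chosen only because they reproduce the reported total.

```python
# Values copied from the run metadata.
native_tokens_prompt = 93
native_tokens_completion = 1137
generation_time_ms = 20291
reported_total_cost = 0.02889

# Assumed rates in USD per million tokens; hypothetical, not from the page.
ASSUMED_INPUT_RATE = 5.0
ASSUMED_OUTPUT_RATE = 25.0

# Output throughput over the whole generation window.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

# Cost estimate under the assumed rates.
estimated_cost = (
    native_tokens_prompt * ASSUMED_INPUT_RATE
    + native_tokens_completion * ASSUMED_OUTPUT_RATE
) / 1_000_000

print(f"Throughput: {tokens_per_second:.1f} tokens/s")
print(f"Estimated cost: ${estimated_cost:.5f} (reported: ${reported_total_cost:.5f})")
```

Under these assumed rates, nearly all of the cost comes from completion tokens, so the native completion count (1137) rather than the normalized one (810) is what drives the total.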
Evaluation details
Evaluator:
Matches word count
Result:
99.92%
Details:
n/a
Meta Data:
neededClean = false, words = 497
Evaluator:
Dialogue to Total Word Ratio
Result:
2.08%
Details:
Ratio: 78.80%, Deviation: 8.80%
Meta Data:
neededClean = false, wordsTotal = 500, wordsDialogue = 394
Overall:
51.0004%
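The ratio and deviation reported by the dialogue evaluator follow directly from its word counts and the 70% target in the scenario. The sketch below reproduces those two numbers. The `count_dialogue_words` helper is a hypothetical illustration of one way to count quoted words; it is not the evaluator's actual method, and how the 2.08% score is derived from the deviation is not shown on this page, so it is not modeled here.

```python
import re

TARGET_RATIO = 0.70  # "70% dialogue" from the scenario


def count_dialogue_words(text: str) -> tuple[int, int]:
    """Hypothetical counter: words inside straight double quotes vs. all words."""
    total = len(text.split())
    spoken = sum(len(span.split()) for span in re.findall(r'"([^"]*)"', text))
    return spoken, total


# Counts reported in the evaluator's Meta Data.
words_total = 500
words_dialogue = 394

ratio = words_dialogue / words_total
deviation = abs(ratio - TARGET_RATIO)

print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# Small usage example for the hypothetical counter.
sample_spoken, sample_total = count_dialogue_words('She said, "Go now." He left.')
```

Note that the two evaluators report different totals (497 and 500 words), which suggests they tokenize the response differently.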