NC Bench
Run: 2026-05-29-01KSSK2WC9HSPJTEA2GXJ8J7NP
Run Details
Model: Claude Opus 4.8 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 19.6s
Cost: $0.027240
Provider: Anthropic
Evaluation summary
Score by evaluator:
Matches word count: 86.38%
Dialogue to Total Word Ratio: 0.00%
Overall: 43.1902%
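The overall figure is consistent with an unweighted mean of the two evaluator scores. That is an inference from the numbers on this page, not documented NC Bench behavior, and the function name below is hypothetical. A minimal sketch:

```python
from statistics import mean

def overall_score(evaluator_scores: list[float]) -> float:
    """Unweighted mean of per-evaluator scores in percent.

    ASSUMPTION: the aggregation rule is inferred from the displayed
    numbers; the page does not state how "Overall" is computed.
    """
    return mean(evaluator_scores)

# Scores as displayed in the summary (rounded to two decimals on the page).
displayed_scores = [86.38, 0.00]
print(f"{overall_score(displayed_scores):.2f}%")  # 43.19%
```

The page shows 43.1902%, which suggests the word-count score carries more decimals than the 86.38% displayed.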
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the blueprints across the kitchen table, smoothing out the wrinkles with the kind of reverence usually reserved for ancient treasure maps. In reality, it was a hand-drawn diagram of the Hendersons' backyard, complete with a poorly sketched bouncy castle and a stick figure labeled "CLOWN—DANGER."
Bill leaned over her shoulder, chewing a toothpick with theatrical menace. He'd seen it in a movie once and had been doing it ever since, despite Molly's repeated complaints that it made him look like he was perpetually flossing.
"The presents arrive at noon," she said, tapping the table. "By twelve-thirty, that gift table will be loaded. PlayStations, LEGO sets, possibly an iPad if little Timmy's grandparents are feeling generous."
"And we just walk in and grab them?"
"No, Bill. We blend in. We become part of the environment."
He squinted at the diagram. "I'm forty-three years old. How exactly do I blend in at a six-year-old's birthday party?"
Molly slid a folded paper bag across the table.
He opened it, then immediately closed it. "Absolutely not."
"It's a balloon-animal costume. You'll be the entertainment. Nobody questions a man making giraffes out of balloons."
"I don't know how to make giraffes."
"Then make snakes. A snake is just a balloon you didn't twist. Children have low standards, Bill. They eat glue."
He grumbled but tucked the bag under his arm. "And what's your cover?"
Molly produced a clipboard from thin air, the way she always did, which Bill found deeply unsettling. "I'm the party planner. Officious, important, holding a clipboard. The trifecta of invisibility. People assume anyone with a clipboard belongs."
"That's actually genius."
"I know. The plan is simple. While you distract the children with your tragic balloon snakes, I'll be loading presents into the catering van parked by the back gate."
Bill frowned. "We have a catering van?"
"We have a van. I wrote 'CATERING' on the side with a marker."
"Did you spell it right this time?"
There was a long, dangerous silence.
"It's fine," Molly said. "Nobody reads. That's the beauty of this whole operation."
She rolled up the blueprints with a flourish and stood, pacing the kitchen like a general before battle. Bill watched her, the toothpick drooping from his mouth.
"There's one thing bothering me," he said.
"Only one? You're growing."
"What if the kids cry? When the presents disappear. I don't think I can handle the crying, Molly. I have a soul somewhere. I'm pretty sure I left it in a locker at the bus station, but it exists."
Molly placed a comforting hand on his shoulder, her expression softening into something almost maternal.
"Bill," she said gently, "they're rich kids. They'll get more presents tomorrow. Probably a pony. We're practically teaching them a valuable lesson about loss."
He considered this. The toothpick rose again, hopeful. "You're right. We're basically educators."
"Exactly. Now go practice your snakes. The future of crime depends on it."
created_at: "2026-05-29T10:04:40.723Z" (5/29/2026, 10:04:40 AM)
model: "anthropic/claude-4.8-opus-20260528"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1405
moderation_latency: (null)
generation_time: 19605
tokens_prompt: 77
tokens_completion: 795
native_tokens_prompt: 93
native_tokens_completion: 1071
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "end_turn"
service_tier: "default"
usage: 0.02724
router: (null)
provider_responses:
  0:
    endpoint_id: "b0d3beaa-5ff7-4666-a749-b1bb2b14c916"
    id: "msg_01FUbhj2pb1EbqFLySswS147"
    is_byok: false
    latency: 1405
    model_permaslug: "anthropic/claude-4.8-opus-20260528"
    provider_name: "Anthropic"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.16.0; linux; x64))"
http_referer: (null)
request_id: "req-1780049080-KfoS6zsO1Q79SRMYFEsC"
session_id: (null)
api_type: "completions"
id: "gen-1780049080-eSpqZrD0pabOFyKim3HK"
upstream_id: "msg_01FUbhj2pb1EbqFLySswS147"
total_cost: 0.02724
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Anthropic"
response_cache_source_id: (null)
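The metadata reports native token counts, a generation time, and a total cost. The sketch below relates them. The per-million-token prices are assumptions (this page does not state any rates); they were picked because they reproduce the reported total_cost. Treating generation_time as milliseconds is likewise an assumption, based on its match with the 19.6s duration. Function names are hypothetical.

```python
# Values copied from the run metadata on this page.
NATIVE_TOKENS_PROMPT = 93
NATIVE_TOKENS_COMPLETION = 1071
GENERATION_TIME_MS = 19605  # ASSUMPTION: unit is milliseconds

# ASSUMPTION: prices in USD per million tokens; not stated on this page.
ASSUMED_INPUT_USD_PER_MTOK = 5.0
ASSUMED_OUTPUT_USD_PER_MTOK = 25.0

def estimate_cost(prompt_tokens: int, completion_tokens: int,
                  in_rate: float, out_rate: float) -> float:
    """Cost in USD from token counts and per-million-token rates."""
    return (prompt_tokens * in_rate + completion_tokens * out_rate) / 1_000_000

def tokens_per_second(completion_tokens: int, generation_time_ms: int) -> float:
    """Completion throughput over the reported generation time."""
    return completion_tokens / (generation_time_ms / 1000)

cost = estimate_cost(NATIVE_TOKENS_PROMPT, NATIVE_TOKENS_COMPLETION,
                     ASSUMED_INPUT_USD_PER_MTOK, ASSUMED_OUTPUT_USD_PER_MTOK)
throughput = tokens_per_second(NATIVE_TOKENS_COMPLETION, GENERATION_TIME_MS)

print(f"estimated cost: ${cost:.5f}")            # $0.02724
print(f"throughput: {throughput:.1f} tokens/s")  # 54.6 tokens/s
```

Under these assumed rates the estimate agrees with the reported total_cost of 0.02724; with different rates it would not, so the match is suggestive rather than conclusive.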
Evaluation details
Evaluator: Matches word count
Result: 86.38%
Details: n/a
Meta Data: neededClean: false, words: 489
Evaluator: Dialogue to Total Word Ratio
Result: 0.00%
Details: Ratio: 61.82%, Deviation: 31.82%
Meta Data: neededClean: false, wordsTotal: 495, wordsDialogue: 306
Overall: 43.1902%
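The reported ratio and deviation follow from the metadata: 306 dialogue words out of 495 total is 61.82%, which is 31.82 points above the 30% target. The sketch below shows one way such counts could be produced. The quote-matching regex, the whitespace tokenization, and the function name are assumptions; the benchmark's own evaluator may count differently (the two evaluators report 489 and 495 words for the same text), and the mapping from deviation to a score is not shown on this page, so it is not reproduced here.

```python
import re

def dialogue_stats(text: str, target_ratio: float = 0.30) -> dict:
    """Count quoted words versus all words and compare to a target ratio.

    ASSUMPTIONS: dialogue is any span between straight double quotes,
    and a word is any whitespace-separated token.
    """
    quoted_spans = re.findall(r'"([^"]*)"', text)
    words_dialogue = sum(len(span.split()) for span in quoted_spans)
    words_total = len(text.split())
    ratio = words_dialogue / words_total if words_total else 0.0
    return {
        "wordsDialogue": words_dialogue,
        "wordsTotal": words_total,
        "ratio": ratio,
        "deviation": abs(ratio - target_ratio),
    }

# Short excerpt from the model response above.
sample = 'Bill frowned. "We have a catering van?" "We have a van."'
stats = dialogue_stats(sample)
print(stats["wordsDialogue"], stats["wordsTotal"])  # 9 11

# Arithmetic check against the figures reported on this page.
reported_ratio = 306 / 495
print(f"{reported_ratio:.2%}, deviation {abs(reported_ratio - 0.30):.2%}")
```

A heuristic like this also counts quoted text in narration (such as the sign labeled "CLOWN—DANGER.") as dialogue, which is one reason counts from different evaluators can diverge.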