NC Bench
Run: 2026-05-29-01KSSJSPAVPJKAT2MR38FYAYCW
Run Details
Model: Claude Opus 4.8 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 20.0s
Cost: $0.027415
Provider: Anthropic
Evaluation summary
Score | Evaluator | Details Link
100.00% | Matches word count | (details)
0.31% | Dialogue to Total Word Ratio | (details)
50.1533% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The blueprints were spread across the kitchen table, weighted down at the corners by two cans of beans and a half-eaten sandwich that Bill had abandoned three hours ago. Molly tapped the paper with a chewed pencil. "This is the perimeter," she said. "Bouncy castle on the left, snack table on the right, and the present pile is dead center." Bill leaned in, squinting at the crude drawing. He held a magnifying glass, though he had no idea why. It made him feel professional. "What's this squiggle?" "That's a clown." "A clown." He set the magnifying glass down very slowly. "Molly, you never mentioned a clown." She rolled her eyes and pulled a folder from beneath the blueprints. Inside were surveillance photos, slightly blurry, taken from a parked van across the street. A man in oversized shoes loomed in every frame, balloons clutched in his fist like hostages. "His name is Mr. Giggles," she said. "He performs Tuesdays through Saturdays. Birthday parties, corporate events, the occasional funeral." "Funerals?" "People grieve differently." Bill pushed back from the table and began pacing the small kitchen, three steps one way, three steps back. He always paced when the stakes climbed. It was a habit left over from his days robbing pet stores, a career path that had ended badly involving a parrot and a confession. "And the kids?" he asked. "How many are we talking?" Molly consulted her notes. "Twelve confirmed RSVPs. Ages four to seven. Highly unpredictable. Sticky hands. Some of them scream for no reason." "Twelve." Bill swallowed. "Twelve tiny witnesses." "They won't remember anything. Kids that age can't even tie their own shoes. Last week my nephew tried to eat a crayon and called it spaghetti." This did little to comfort him. He stopped pacing and stared out the window, watching a pigeon attack a discarded chip with savage enthusiasm. "The presents, though," Molly continued, her voice softening into something almost dreamy. "Imagine it, Bill. Brand new. Still wrapped. 
Nobody's even opened them. We could be looking at action figures, building sets, maybe a remote control car or two." "A remote control car," Bill repeated. He turned around. Something had shifted in his expression. "The good kind. The kind with the off-road tires." He returned to the table and picked up the magnifying glass again, this time with purpose. He examined the present pile in the photographs, his greed slowly overtaking his common sense, which had always been a fragile thing. "We go in during pin the tail on the donkey," Molly said. "Everyone's blindfolded. Maximum chaos, minimum eyewitnesses. We grab the haul and we're out before they finish singing happy birthday." "And Mr. Giggles?" "I'll handle the clown." Bill considered this. He considered the off-road tires. He considered that he was forty-three years old and planning a robbery against children whose primary defense was screaming. "Fine," he said finally, extending his hand. "But if that clown does a single magic trick, I'm gone." Molly shook it, grinning. "Deal."
created_at: "2026-05-29T09:59:39.618Z" (5/29/2026, 9:59:39 AM)
model: "anthropic/claude-4.8-opus-20260528"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1354
moderation_latency: (null)
generation_time: 19959
tokens_prompt: 77
tokens_completion: 807
native_tokens_prompt: 93
native_tokens_completion: 1078
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "end_turn"
service_tier: "default"
usage: 0.027415
router: (null)
provider_responses:
  0:
    endpoint_id: "b0d3beaa-5ff7-4666-a749-b1bb2b14c916"
    id: "msg_014D3jWtFAK9n2G8RF43nzX5"
    is_byok: false
    latency: 1354
    model_permaslug: "anthropic/claude-4.8-opus-20260528"
    provider_name: "Anthropic"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.16.0; linux; x64))"
http_referer: (null)
request_id: "req-1780048779-eXxSbwvO62snfdhwO9KM"
session_id: (null)
api_type: "completions"
id: "gen-1780048779-gAhMR9dQTmj6Jw40hLHZ"
upstream_id: "msg_014D3jWtFAK9n2G8RF43nzX5"
total_cost: 0.027415
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Anthropic"
response_cache_source_id: (null)
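The reported total_cost can be sanity-checked against the native token counts in the Meta values above. The sketch below is only an illustration: the per-million-token rates are an assumption (the page does not state them) and were chosen because they happen to reproduce the reported figure. The function name estimate_cost is hypothetical.

```python
# Values copied from the Meta block of this run.
NATIVE_TOKENS_PROMPT = 93
NATIVE_TOKENS_COMPLETION = 1078
REPORTED_TOTAL_COST = 0.027415  # USD

# ASSUMPTION: USD rates per million tokens. These are NOT stated on the
# page; they are one pair of values that reproduces the reported total.
ASSUMED_INPUT_RATE_PER_M = 5.0
ASSUMED_OUTPUT_RATE_PER_M = 25.0


def estimate_cost(prompt_tokens: int, completion_tokens: int,
                  input_rate_per_m: float, output_rate_per_m: float) -> float:
    """Return the estimated cost in USD for one request."""
    weighted = prompt_tokens * input_rate_per_m + completion_tokens * output_rate_per_m
    return weighted / 1_000_000


estimated = estimate_cost(
    NATIVE_TOKENS_PROMPT,
    NATIVE_TOKENS_COMPLETION,
    ASSUMED_INPUT_RATE_PER_M,
    ASSUMED_OUTPUT_RATE_PER_M,
)
print(f"estimated: ${estimated:.6f}, reported: ${REPORTED_TOTAL_COST:.6f}")
```

Under these assumed rates the estimate agrees with the reported total to six decimal places. Note that the check uses the native token counts (93 and 1078), not the normalized tokens_prompt and tokens_completion values (77 and 807).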
Evaluation details
Result | Evaluator | Details | Meta Data
100.00% | Matches word count | n/a | neededClean: false, words: 499
0.31% | Dialogue to Total Word Ratio | Ratio: 40.76%, Deviation: 10.76% | neededClean: false, wordsTotal: 503, wordsDialogue: 205
50.1533% | Overall
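The ratio and deviation reported for the dialogue evaluator follow directly from the wordsDialogue and wordsTotal counts and the 30% target in the scenario. A minimal sketch of that arithmetic is below; the function names are hypothetical, and the assumption that the overall figure is an unweighted mean of the two evaluator scores is an inference from the numbers, not something the page states. How the evaluator maps a 10.76-point deviation to a 0.31% score is not shown on the page and is not reproduced here.

```python
# Values copied from the evaluation details of this run.
TARGET_RATIO_PCT = 30.0   # from the scenario: "30% dialogue"
WORDS_TOTAL = 503
WORDS_DIALOGUE = 205
OVERALL_PCT = 50.1533
WORD_COUNT_SCORE_PCT = 100.0


def dialogue_ratio_pct(words_dialogue: int, words_total: int) -> float:
    """Share of dialogue words in the whole text, as a percentage."""
    return 100.0 * words_dialogue / words_total


def deviation_pct(ratio_pct: float, target_pct: float) -> float:
    """Absolute distance from the target ratio, in percentage points."""
    return abs(ratio_pct - target_pct)


ratio = dialogue_ratio_pct(WORDS_DIALOGUE, WORDS_TOTAL)
deviation = deviation_pct(ratio, TARGET_RATIO_PCT)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
# prints "Ratio: 40.76%, Deviation: 10.76%"

# ASSUMPTION: the overall score is the unweighted mean of the two evaluator
# scores. Solving for the dialogue score that this would imply:
implied_dialogue_score = 2 * OVERALL_PCT - WORD_COUNT_SCORE_PCT
print(f"Implied dialogue score: {implied_dialogue_score:.2f}%")
# prints "Implied dialogue score: 0.31%"
```

The implied score matches the displayed 0.31%, which is consistent with the mean assumption. The two evaluators also report different totals (499 and 503 words), which suggests they tokenize the text slightly differently.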