NC Bench
Run: 2026-05-31-01KSYXJK54T5PHJSCSKB5X40YH
Run Details
Model: Claude Opus 4.8 (Reasoning, High)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 19.9s
Cost: $0.027590
Provider: Anthropic
Evaluation summary
Score | Evaluator
90.48% | Matches word count
6.46% | Dialogue to Total Word Ratio
48.4694% | Overall
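The overall figure is consistent with an unweighted mean of the two evaluator scores: (90.48 + 6.46) / 2 = 48.47, which matches 48.4694% once the displayed scores are rounded to two decimals. The short Python sketch below checks that arithmetic. The averaging rule is an inference from the numbers shown on this page, not a documented NC Bench formula, and the variable names are illustrative.

```python
from statistics import mean

# Evaluator scores as displayed in the summary (rounded to two decimals).
scores = {
    "Matches word count": 90.48,
    "Dialogue to Total Word Ratio": 6.46,
}

# Assumption: the overall score is the plain average of the evaluator scores.
overall = mean(list(scores.values()))

print(f"Overall (assumed unweighted mean): {overall:.2f}%")
```

The small gap between 48.47 and the reported 48.4694 is what one would expect if the page averages unrounded scores and displays the rounded ones.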
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the blueprint across the table, smoothing its crinkled edges with the reverence of a surgeon preparing for a transplant. The document in question was a child's crayon drawing of a house, complete with a smiling sun and a dog roughly the size of the chimney. "Where did you even get this?" Bill asked, squinting at the orange scribbles that apparently represented the perimeter fence. "Reconnaissance," Molly said proudly. "The Hendersons' kid left it taped to the mailbox. Practically gift-wrapped intelligence." Bill picked up the drawing and turned it sideways, then upside down. The dog now appeared to be flying. He set it back down with the careful disappointment of a man who had hoped for more from his criminal partnership. "Molly, this is a picture of a happy family. There's a rainbow." "That's the security grid." "It's a rainbow." She snatched the paper back, wounded. "You have to read between the crayons, Bill. The mission is simple. Eight-year-old's birthday party. Two o'clock Saturday. The garage will be stacked floor to ceiling with presents, all conveniently boxed and ribboned for easy transport." Bill leaned back in his chair, which groaned in protest. Twenty years in the business and it had come to this—plundering the loot of a child named Tyler. He rubbed his temples where a headache was forming, a small reasonable voice trying to escape through his forehead. "And the parents?" he asked. "Distracted. There's a clown." "A clown." "A magician clown," Molly clarified, as though this detail dramatically improved their odds. "Children become hypnotized. It's basic crowd control. We simply walk in dressed as caterers, locate the present pile, and vanish before anyone notices." The plan had a certain elegant stupidity to it, Bill had to admit. No alarms. No lasers. No vault. Just frosting and the gullibility of adults too tired from organizing a party to question two strangers carrying away the haul. "What's the take?" he said, against his better judgment. Molly's eyes gleamed. "Conservatively? Forty wrapped gifts. We're talking action figures. Possibly a bicycle. Last year's party reportedly featured a tablet computer." "You've been casing children's birthdays." "I've been investing in market research." Bill stood and walked to the window, hands clasped behind his back like a general surveying a battlefield made entirely of balloons. Somewhere out there, an innocent eight-year-old was anticipating the greatest day of his short life, blissfully unaware that fate, in the form of two middle-aged criminals, was coming for his loot bag. "This is the lowest we've ever sunk," he said quietly. "We robbed a casino dressed as nuns." "That had dignity." Molly joined him at the window, slipping a folded clown nose into his coat pocket like a benediction. "Think of it as a victimless crime. Children forget everything. By Tuesday he won't even remember he had a birthday." Bill stared at the red foam nose now resting in his palm. Against every principle he had ever held, he felt the old thrill return. "Fine," he sighed. "But I am not honking that." "You'll honk it beautifully."
created_at: "2026-05-31T11:44:13.488Z" (5/31/2026, 11:44:13 AM)
model: "anthropic/claude-4.8-opus-20260528"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 833
moderation_latency: (null)
generation_time: 19858
tokens_prompt: 77
tokens_completion: 828
native_tokens_prompt: 93
native_tokens_completion: 1085
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "end_turn"
service_tier: "default"
usage: 0.02759
router: (null)
provider_responses:
  0:
    endpoint_id: "b0d3beaa-5ff7-4666-a749-b1bb2b14c916"
    id: "msg_01PGCn1vKQscpGDnAxq5Zcmk"
    is_byok: false
    latency: 833
    model_permaslug: "anthropic/claude-4.8-opus-20260528"
    provider_name: "Anthropic"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.16.0; linux; x64))"
http_referer: (null)
request_id: "req-1780227853-6FZ3P9CBO6586nfBfiBd"
session_id: (null)
preset_id: (null)
api_type: "completions"
id: "gen-1780227853-WjcEsCy5G4PioEWfkY1V"
upstream_id: "msg_01PGCn1vKQscpGDnAxq5Zcmk"
total_cost: 0.02759
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Anthropic"
response_cache_source_id: (null)
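A few derived figures can be read off the metadata. The sketch below assumes that generation_time is reported in milliseconds, which agrees with the 19.9s duration shown under Run Details, and computes output throughput and cost per 1,000 native completion tokens. The helper name and the choice of derived metrics are illustrative and are not part of the NC Bench page.

```python
# Values copied from the run metadata.
meta = {
    "generation_time": 19858,          # assumed to be milliseconds
    "native_tokens_completion": 1085,
    "total_cost": 0.02759,             # USD
}


def tokens_per_second(tokens: int, elapsed_ms: int) -> float:
    """Average output rate over the whole generation window."""
    return tokens / (elapsed_ms / 1000)


generation_seconds = meta["generation_time"] / 1000
throughput = tokens_per_second(
    meta["native_tokens_completion"], meta["generation_time"]
)

# Note: total_cost also includes the prompt tokens, so this slightly
# overstates the cost attributable to completion tokens alone.
cost_per_1k_completion_tokens = (
    meta["total_cost"] / meta["native_tokens_completion"] * 1000
)

print(f"generation time: {generation_seconds:.1f} s")
print(f"throughput: {throughput:.1f} tokens/s")
print(f"total cost per 1k completion tokens: ${cost_per_1k_completion_tokens:.4f}")
```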
Evaluation details
Result | Evaluator | Details | Meta Data
90.48% | Matches word count | n/a | neededClean: false, words: 510
6.46% | Dialogue to Total Word Ratio | Ratio: 37.40%, Deviation: 7.40% | neededClean: false, wordsTotal: 516, wordsDialogue: 193
Overall: 48.4694%
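The ratio and deviation reported by the dialogue evaluator follow directly from its two word counts: 193 / 516 is about 37.40%, which is 7.40 percentage points above the 30% target in the scenario. The Python sketch below reproduces that arithmetic and shows one plausible way to count quoted words. The function names, the whitespace tokenization, and the straight-double-quote rule are assumptions for illustration; the page does not document how NC Bench counts words or detects dialogue, and the two evaluators themselves report different totals (510 and 516), which suggests differing tokenization. The page also does not state how the 7.40% deviation maps to the 6.46% score, so that step is left out.

```python
import re


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (an assumed tokenization)."""
    return len(text.split())


def count_dialogue_words(text: str) -> int:
    """Count words inside straight double quotes (an assumed dialogue rule)."""
    quoted_spans = re.findall(r'"([^"]*)"', text)
    return sum(count_words(span) for span in quoted_spans)


def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Share of all words that are dialogue."""
    return words_dialogue / words_total


def deviation_from_target(ratio: float, target: float = 0.30) -> float:
    """Absolute distance between the measured ratio and the target ratio."""
    return abs(ratio - target)


# Counts reported in the evaluator's Meta Data column.
ratio = dialogue_ratio(193, 516)
deviation = deviation_from_target(ratio)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints: Ratio: 37.40%, Deviation: 7.40%
```

Running count_dialogue_words on the model response may not return exactly 193, since the evaluator's real counting rules are unknown.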