NC Bench
Run: 2026-06-01-01KT1HFW4VV8907CDHK3JPQ55T
Run Details
Model: Claude Opus 4.8 (Reasoning, Low)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 22.2s
Cost: $0.029190
Provider: Anthropic
Evaluation summary
Score | Evaluator
99.98% | Matches word count
0.00% | Dialogue to Total Word Ratio
49.9920% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the blueprint across the rickety table, smoothing its wrinkled corners with the reverence of a surgeon preparing for a transplant. Bill leaned in, squinting at the crayon-drawn diagram she'd somehow acquired. "This is a kid's drawing of a house," he said flatly. "It's reconnaissance," Molly corrected, tapping a stick-figure family rendered in purple. "Junior intel. The Hendersons' boy made it for show-and-tell. Note the bouncy castle here, the present table here." Bill rubbed his temples. Three years of armored truck jobs, museum infiltrations, and one memorable yacht heist, and now they'd descended into this. He looked at the loot board Molly had assembled, which featured glossy magazine cutouts of action figures, art supplies, and a suspiciously detailed photograph of a remote-control helicopter. "We're robbing a seven-year-old," he said. "We're robbing a seven-year-old's *party*," Molly said, as though this distinction carried tremendous legal and moral weight. "Twenty kids. Twenty presents. That's a clean inventory, Bill. No fences, no laundering. We just walk out with a garbage bag full of unopened toys." "And the parents?" "Distracted. Birthday parties are chaos incarnate. Cake, crying, someone always vomits on the carpet. Nobody's watching the present table during the magician act." Bill perked up, despite himself. "There's a magician?" "Marvelous Marvin. Two o'clock slot. Forty-five minutes of misdirection, professionally provided, free of charge." Molly grinned, clearly delighted by the elegance of it. "He pulls a rabbit out of a hat, we pull a Nintendo Switch out of the gift pile." Bill stood and paced the cramped apartment, stepping over a pizza box he'd been meaning to throw away since approximately the Obama administration. Something about this felt deeply wrong, and not in the satisfying way their work usually felt wrong. "What's our exit?" "The clown." He stopped pacing. "The clown." "We come dressed as entertainers. 
Nobody questions a clown carrying a big colorful sack. We waddle right past security—" "There's no security at a birthday party, Molly." "—right past *Grandma*," she amended smoothly, "and out the side gate. I've got a clown costume in the closet. You're the balloon-animal guy." Bill stared at her for a long, weighty moment. The refrigerator hummed. Somewhere outside, a car alarm wailed and died. "I have a code," he finally said. "There are lines. We've never crossed them. We don't hit churches, we don't hit small businesses, and we absolutely do not steal a child's birthday presents while disguised as a clown." Molly folded her arms, unimpressed. "The take is two thousand in retail value, minimum. There's a kid getting a *drone*, Bill. A drone." A pause. A long, telling pause. "What color drone?" "Does it matter?" "It matters to me." She smiled, sensing victory the way sharks sense a swimmer with a paper cut. She slid the crayon blueprint toward him, and Bill, against every fragment of his criminal dignity, picked up the red marker and began circling the present table. "I'm not wearing the clown shoes," he muttered. "You'll wear the shoes."
created_at: "2026-06-01T12:10:44.767Z" (6/1/2026, 12:10:44 PM)
model: "anthropic/claude-4.8-opus-20260528"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1289
moderation_latency: 192
generation_time: 22006
tokens_prompt: 77
tokens_completion: 832
native_tokens_prompt: 93
native_tokens_completion: 1149
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "end_turn"
service_tier: "default"
usage: 0.02919
router: (null)
provider_responses:
  0:
    endpoint_id: "b0d3beaa-5ff7-4666-a749-b1bb2b14c916"
    id: "msg_01HkpYFh7Gudcp21sLMkpPME"
    is_byok: false
    latency: 1287
    model_permaslug: "anthropic/claude-4.8-opus-20260528"
    provider_name: "Anthropic"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.16.0; linux; x64))"
http_referer: (null)
request_id: "req-1780315844-0Q8NUSVqjmGMZKBmXmIA"
session_id: (null)
preset_id: (null)
api_type: "completions"
id: "gen-1780315844-HKpuuZUSw9iM2VELgRuh"
upstream_id: "msg_01HkpYFh7Gudcp21sLMkpPME"
total_cost: 0.02919
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Anthropic"
response_cache_source_id: (null)
Evaluation details
Result | Evaluator | Details | Meta Data
99.98% | Matches word count | n/a | neededClean: false, words: 498
0.00% | Dialogue to Total Word Ratio | Ratio: 47.74%, Deviation: 17.74% | neededClean: false, wordsTotal: 509, wordsDialogue: 243
49.9920% | Overall
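The reported ratio and deviation can be reproduced from the counts in the evaluation details above. The following is a minimal sketch, not the benchmark's actual evaluator: the function names are hypothetical, counting dialogue as the words inside straight double quotes is an assumed definition, and treating Overall as the unweighted mean of the two evaluator scores is an assumption that happens to agree with the displayed figures. The page does not state how a deviation is mapped to a score, so that step is left out.

```python
import re

TARGET_RATIO = 0.30  # from the scenario: "Write 500 words with 30% dialogue"


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (an assumed tokenization)."""
    return len(text.split())


def dialogue_word_count(text: str) -> int:
    """Count words inside straight double quotes (an assumed definition of dialogue)."""
    return sum(count_words(span) for span in re.findall(r'"([^"]*)"', text))


def ratio_and_deviation(words_dialogue: int, words_total: int,
                        target: float = TARGET_RATIO) -> tuple[float, float]:
    """Return the dialogue share and its absolute distance from the target share."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


# Counts reported in the evaluation details: wordsDialogue 243, wordsTotal 509.
ratio, deviation = ratio_and_deviation(243, 509)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# Assumption: Overall is the plain average of the two evaluator scores.
overall = (99.98 + 0.00) / 2
print(f"Overall (from the rounded scores): {overall:.2f}%")
```

Averaging the rounded scores gives 49.99%, consistent with the displayed 49.9920% if the word-count score is slightly above 99.98% before rounding. Note that the two evaluators report different totals (498 words versus 509), which suggests they tokenize the text differently; the simple whitespace split used here is not guaranteed to match either.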