NC Bench
Run: 2026-05-29-01KSSK8C6N3KMVG7PJ1ZD2NJJZ
Run Details
Model: Claude Opus 4.8 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 22.3s
Cost: $0.029890
Provider: Anthropic
Evaluation summary
Score | Evaluator
99.98% | Matches word count
7.01% | Dialogue to Total Word Ratio
53.4971% | Overall
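As a rough consistency check, the overall figure sits very close to the unweighted mean of the two evaluator scores listed in this summary. The averaging rule is an assumption: this page does not state how NC Bench combines evaluator scores, and the small remaining gap would be consistent with the per-evaluator scores being rounded for display.

```python
# Displayed evaluator scores from the summary (percent).
evaluator_scores = {
    "Matches word count": 99.98,
    "Dialogue to Total Word Ratio": 7.01,
}
reported_overall = 53.4971  # percent, as shown on the page

# ASSUMPTION: the overall score is a plain unweighted mean of evaluator scores.
mean_score = sum(evaluator_scores.values()) / len(evaluator_scores)
gap = abs(mean_score - reported_overall)

print(f"unweighted mean: {mean_score:.4f}%")
print(f"reported overall: {reported_overall:.4f}%")
print(f"gap: {gap:.4f} percentage points")
```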
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly slapped the blueprint onto the table. "Behold. The Henderson estate. Tommy Henderson turns seven on Saturday." Bill leaned in, squinting. "This is a napkin." "It's a tactical sketch." "You drew a bouncy castle with a smiley face." "That bouncy castle," Molly said, jabbing it, "is our point of entry. Nobody suspects a man crawling out of an inflatable." Bill sat back, arms crossed. "Molly. We're grown adults. We steal jewels. Cars. That one time, a yacht." "We sank that yacht." "We borrowed it temporarily before it sank." "Which is why," Molly said, "we need a low-risk job. Children's birthday party. Soft targets. No security. Just a clown and roughly forty unsupervised present bags." Bill rubbed his temples. "And what exactly are we stealing? Coloring books? A xylophone?" "You'd be amazed what kids get these days. Tablets. Gaming consoles. Little Dylan Mathers got a drone for his birthday. A drone, Bill." "How do you know Dylan Mathers?" "I attended his party." Bill stared. "You went to a child's party. Alone." "For reconnaissance," Molly said defensively. "I brought a gift. I'm not a monster." "What'd you bring?" "A xylophone." "So you bought one child a xylophone to scope out stealing another child's drone." "It was a sound investment in our criminal future." Bill groaned and pointed at the napkin. "Fine. Walk me through it. What's the plan?" "Simple. You're the entertainment. Big Bill the Birthday Bear." "Absolutely not." "The costume's in the car." "You already bought the costume?" "It was non-refundable," Molly said. "While you distract the children with juggling—" "I can't juggle." "—with attempting to juggle, I slip into the gift pile, scoop everything into a sack, and exfiltrate through the bouncy castle. We're gone before they cut the cake." Bill was quiet for a moment. "There's cake?" "Focus." "What kind of cake?" "I don't know, Bill, it's a seven-year-old, probably dinosaurs." "I love a dinosaur cake."
"You can have cake after we successfully rob the children." Bill exhaled. "I just want it on record that we used to be respected. We were on a poster once. A wanted poster, but still." "And now we're aiming higher. Or smaller. We're aiming smaller." Molly grabbed his shoulder. "Look. One quick job. We grab the loot, sell it, and we're set for months. No yachts. No sinking. Just easy, breezy birthday burglary." "And if a parent catches us?" "Then you're a bear who lost his way. Bears can't be prosecuted." "That's not real law." "It's real enough." Molly extended her hand. "So. Are you in?" Bill looked at the napkin. At the smiley bouncy castle. At his own dwindling dignity. "If I do this," he said slowly, "I get the drone." "You don't even—" "I get the drone, Molly." She sighed and shook his hand. "Fine. You get the drone." "And a slice of dinosaur cake." "You're really negotiating hard for a robbery against a child." Bill stood, straightening an imaginary tie. "Big Bill the Birthday Bear has standards."
created_at
"2026-05-29T10:07:40.766Z"
5/29/2026, 10:07:40 AM
model
"anthropic/claude-4.8-opus-20260528"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1106
moderation_latency
204
generation_time
22020
tokens_prompt
77
tokens_completion
824
native_tokens_prompt
93
native_tokens_completion
1177
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"end_turn"
service_tier
"default"
usage
0.02989
router
(null)
provider_responses
0
endpoint_id
"b0d3beaa-5ff7-4666-a749-b1bb2b14c916"
id
"msg_01UUrh89DqP8hycZL1kZvpUU"
is_byok
false
latency
1106
model_permaslug
"anthropic/claude-4.8-opus-20260528"
provider_name
"Anthropic"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.16.0; linux; x64))"
http_referer
(null)
request_id
"req-1780049260-k8RTwDDOWkMAxa4UUPv7"
session_id
(null)
api_type
"completions"
id
"gen-1780049260-yexhZc4rnySon98UPkHc"
upstream_id
"msg_01UUrh89DqP8hycZL1kZvpUU"
total_cost
0.02989
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Anthropic"
response_cache_source_id
(null)
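The token and timing fields in the metadata can be cross-checked with a little arithmetic. The sketch below is illustrative only. The per-million-token rates are assumptions that do not appear on this page; they are used because they happen to reproduce the reported total_cost. The code also assumes generation_time is expressed in milliseconds.

```python
def estimate_cost(prompt_tokens: int, completion_tokens: int,
                  usd_per_m_input: float, usd_per_m_output: float) -> float:
    """Estimate request cost in USD from token counts and per-million-token rates."""
    return (prompt_tokens * usd_per_m_input
            + completion_tokens * usd_per_m_output) / 1_000_000

# Values copied from the metadata.
native_tokens_prompt = 93
native_tokens_completion = 1177
generation_time_ms = 22020      # ASSUMPTION: unit is milliseconds
reported_total_cost = 0.02989   # USD

# ASSUMPTION: hypothetical rates, not stated anywhere on this page.
ASSUMED_USD_PER_M_INPUT = 5.0
ASSUMED_USD_PER_M_OUTPUT = 25.0

cost = estimate_cost(native_tokens_prompt, native_tokens_completion,
                     ASSUMED_USD_PER_M_INPUT, ASSUMED_USD_PER_M_OUTPUT)
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(f"estimated cost: ${cost:.5f} (reported: ${reported_total_cost:.5f})")
print(f"completion throughput: {tokens_per_second:.1f} native tokens/s")
```

If the real rates differ, only the two ASSUMED constants need to change; the throughput figure depends solely on the values reported in the metadata.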
Evaluation details
Result | Evaluator | Details | Meta Data
99.98% | Matches word count | n/a | neededClean: false, words: 498
7.01% | Dialogue to Total Word Ratio | Ratio: 77.29%, Deviation: 7.29% | neededClean: false, wordsTotal: 502, wordsDialogue: 388
53.4971% | Overall
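The reported ratio and deviation follow directly from the word counts: 388 dialogue words out of 502 total is about 77.29%, which is 7.29 percentage points above the 70% target. The sketch below reproduces that arithmetic and shows one plausible way to count dialogue words. The function dialogue_ratio is hypothetical and is not the evaluator's code: it assumes dialogue is delimited by straight double quotes and that words are split on whitespace, and the benchmark's actual tokenization may differ (the two evaluators already report 498 and 502 words for the same text).

```python
import re

def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for a passage.

    ASSUMPTION: dialogue is any span between straight double quotes,
    and a word is any whitespace-separated token.
    """
    total = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue / total if total else 0.0
    return dialogue, total, ratio

# Small excerpt from the model response.
sample = 'Bill leaned in, squinting. "This is a napkin." "It\'s a tactical sketch."'
sample_dialogue, sample_total, sample_ratio = dialogue_ratio(sample)
print(sample_dialogue, sample_total, f"{sample_ratio:.2%}")

# Reproduce the reported figures from the evaluator's own counts.
TARGET_PERCENT = 70.0
reported_ratio_percent = 388 / 502 * 100
deviation_points = abs(reported_ratio_percent - TARGET_PERCENT)
print(f"ratio: {reported_ratio_percent:.2f}%, deviation: {deviation_points:.2f}")
```

How the 7.29-point deviation maps to the 7.01% score is not shown on the page, so the sketch stops at the deviation.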