NC Bench
Run: 2026-06-01-01KT1GBR01EKQ98NAZKKVTT8NR
Run Details
Model: Claude Opus 4.8 (Reasoning, Low)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 18.9s
Cost: $0.027690
Provider: Anthropic
Evaluation summary
Score | Evaluator
86.38% | Matches word count
2.45% | Dialogue to Total Word Ratio
44.4129% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
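The 70% dialogue target in this prompt can be checked mechanically. The page does not show how the evaluator tokenizes text or detects dialogue, so the following is only a minimal sketch under two assumptions: words are whitespace-separated, and dialogue is any text between straight double quotes. The function name `dialogue_ratio` is hypothetical.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for a prose scene.

    Assumptions (not taken from the benchmark): words are split on
    whitespace, and dialogue is whatever sits between straight double quotes.
    """
    total_words = len(text.split())
    quoted_segments = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(segment.split()) for segment in quoted_segments)
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


sample = 'Bill stared. "You can\'t juggle." She shrugged. "Exactly."'
spoken, total, ratio = dialogue_ratio(sample)
print(f"{spoken} of {total} words are dialogue ({ratio:.0%})")
```

A different tokenizer or quote convention (curly quotes, for example) would give different counts, so results from this sketch need not match the evaluator's figures exactly.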
Model response
Messages
Meta
AI
Molly spread the crumpled napkin across the table, jabbing a finger at her hastily drawn map. "Okay, the party starts at two. We hit it at two-fifteen, when the sugar high peaks." Bill squinted at the napkin. "Why do we wait for the sugar high?" "Because, genius, that's when the kids stop watching their presents and start vibrating off the walls. Total chaos. Nobody notices two adults walking out with a sack of LEGO." "I don't like LEGO. It hurts my feet." "We're not playing with it, Bill. We're stealing it. Then we're selling it." He nodded slowly, then frowned. "What about the parents?" "Distracted. There's a clown." "You hired a clown?" "I am the clown." Molly pulled a red nose out of her pocket and squeezed it. It honked sadly. Bill stared. "You're going as a clown." "I'm going as a clown so I can get inside, scope the present pile, and signal you through the window." "What's the signal?" "I'll juggle." "You can't juggle." "Exactly. They'll be so confused, they won't see you crawling through the back door." Bill rubbed his temples. "Molly, I have a bad feeling about this. Last time we robbed a place with kids, you cried." "That was a petting zoo, and the goat looked at me funny." "You sobbed into a llama." "It was an emotional moment, Bill, and you swore you'd never bring it up." He raised both hands. "Fine, fine. So I sneak in the back, grab the presents—" "All of them. Every single one. Don't be sentimental. If a kid is mid-unwrapping, you take it." "From the kid's hands?" "From the kid's hands." Bill looked physically ill. "That's monstrous." "That's the business, sweetheart. Now, the loot goes in the van. I'll meet you there once I've finished my balloon animals." "You can make balloon animals?" "No. But I can make balloon shapes that vaguely suggest trauma, and that buys us time." Bill leaned back, arms crossed. "And if it goes wrong?" "It won't go wrong." "It always goes wrong." "Name one time." "The bank." "Besides the bank." "The museum." 
"Besides the museum." "The other bank." Molly slammed her palm on the table. "Those were learning experiences, Bill! This is foolproof. It's a children's party. The strongest security they've got is a babysitter named Brittany and a piñata." "Don't underestimate piñatas. They're full of surprises." "They're full of candy." "And sometimes bees. I read about it." Molly pinched the bridge of her nose. "Bill. Look at me. We walk in, we juggle, we grab, we go. We'll be rich by three o'clock and eating tacos by four." Bill considered this. A slow smile crept across his face. "Tacos?" "All you can eat." He extended a hand. "Deal." She shook it, then plopped the red nose onto his face. It honked. Bill blinked. "Why am I wearing the nose?" "Because, partner," Molly grinned, "you're the clown now. I'm allergic to children."
created_at
"2026-06-01T11:51:00.871Z"
6/1/2026, 11:51:00 AM
model
"anthropic/claude-4.8-opus-20260528"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1067
moderation_latency
(null)
generation_time
18807
tokens_prompt
77
tokens_completion
787
native_tokens_prompt
93
native_tokens_completion
1089
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"end_turn"
service_tier
"default"
usage
0.02769
router
(null)
provider_responses
0
endpoint_id
"b0d3beaa-5ff7-4666-a749-b1bb2b14c916"
id
"msg_01XfeuV4dFVKVozsZPAvLA9V"
is_byok
false
latency
1067
model_permaslug
"anthropic/claude-4.8-opus-20260528"
provider_name
"Anthropic"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.16.0; linux; x64))"
http_referer
(null)
request_id
"req-1780314660-iiq3XNIma6reMbRWphdg"
session_id
(null)
preset_id
(null)
api_type
"completions"
id
"gen-1780314660-yLo6Ors4BO8RUaKDRCd4"
upstream_id
"msg_01XfeuV4dFVKVozsZPAvLA9V"
total_cost
0.02769
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Anthropic"
response_cache_source_id
(null)
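The cost and timing fields above can be sanity-checked with a little arithmetic. The per-token rates are not stated anywhere in this record; the values below are hypothetical rates chosen for illustration, and they happen to reproduce the reported `total_cost` when applied to the native token counts. Treat this as a sketch, not as the provider's pricing.

```python
# Values copied from the metadata above.
NATIVE_TOKENS_PROMPT = 93
NATIVE_TOKENS_COMPLETION = 1089
GENERATION_TIME_MS = 18807
REPORTED_TOTAL_COST = 0.02769

# Hypothetical USD rates per million tokens (assumption, not in the record).
INPUT_USD_PER_MTOK = 5.00
OUTPUT_USD_PER_MTOK = 25.00


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate request cost in USD from token counts and the assumed rates."""
    micro_usd = (prompt_tokens * INPUT_USD_PER_MTOK
                 + completion_tokens * OUTPUT_USD_PER_MTOK)
    return micro_usd / 1_000_000


estimated = estimate_cost(NATIVE_TOKENS_PROMPT, NATIVE_TOKENS_COMPLETION)
tokens_per_second = NATIVE_TOKENS_COMPLETION / (GENERATION_TIME_MS / 1000)

print(f"estimated cost: ${estimated:.6f} (reported: ${REPORTED_TOTAL_COST:.6f})")
print(f"throughput: {tokens_per_second:.1f} completion tokens/s")
```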
Evaluation details
Result | Evaluator | Details | Meta Data
86.38% | Matches word count | n/a | neededClean: false; words: 489
2.45% | Dialogue to Total Word Ratio | Ratio: 78.62%, Deviation: 8.62% | neededClean: false; wordsTotal: 491; wordsDialogue: 386
44.4129% | Overall
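The reported ratio and deviation follow directly from the word counts in the evaluation details. The sketch below reproduces them from `wordsDialogue` and `wordsTotal`, taking the 70% target from the scenario. The page does not say how the deviation is turned into the 2.45% score, so that step is left out. Treating the overall figure as an unweighted mean of the two evaluator scores is an assumption; it agrees with the reported 44.4129% to within the rounding of the displayed scores.

```python
TARGET_RATIO = 0.70  # from the scenario: 70% dialogue

# Values copied from the evaluation details.
words_total = 491
words_dialogue = 386
evaluator_scores = [86.38, 2.45]  # percentages as displayed (rounded)

ratio = words_dialogue / words_total
deviation = abs(ratio - TARGET_RATIO)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 78.62%, Deviation: 8.62%"

# Assumption: the overall score is the plain average of the evaluator scores.
overall_estimate = sum(evaluator_scores) / len(evaluator_scores)
print(f"Overall (mean of displayed scores): {overall_estimate:.3f}%")
```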