NC Bench
Run: 2026-05-29-01KSSJSZFX504180J7JFQCSH6G
Run Details
Model: Claude Opus 4.8 (Reasoning)
Test: Dialogue tags
Scenario: Write 200 words with 90% dialogue
Duration: 9.4s
Cost: $0.013465
Provider: Anthropic
Evaluation summary
Score      Evaluator
99.92%     Matches word count
93.59%     Dialogue to Total Word Ratio
96.7531%   Overall
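The Overall figure is close to the unweighted mean of the two evaluator scores. That is an inference from the numbers shown here, not something the page states; the small remaining gap would be consistent with the displayed scores being rounded. A minimal check in Python, under that assumption:

```python
# Displayed evaluator scores from the summary (percent, already rounded).
scores = {
    "Matches word count": 99.92,
    "Dialogue to Total Word Ratio": 93.59,
}
reported_overall = 96.7531

# Assumption: Overall is the unweighted mean of the evaluator scores.
mean_score = sum(scores.values()) / len(scores)
gap = abs(mean_score - reported_overall)

print(f"mean of displayed scores: {mean_score:.4f}%")
print(f"gap to reported overall:  {gap:.4f} points")
```

If the benchmark weights evaluators differently, the same check with a weighted sum would show a larger gap.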
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
"So we're really doing this?" Bill asked. "Robbing a kid's birthday party?" "Keep your voice down," Molly hissed. "And yes. Think about it. Hundreds of presents, all wrapped, all unguarded." "Unguarded? There'll be like forty children there." "Children, Bill. We're grown adults. What are they going to do, throw juice boxes at us?" "My cousin got concussed by a juice box once." "Your cousin gets concussed by everything. Focus. We go in dressed as the entertainment." "What entertainment?" "Clowns, Bill. We go in as clowns." "Absolutely not. I have a fear of clowns." "You're going to *be* a clown." "That doesn't help! That makes it worse!" "Fine," Molly groaned. "You can be the magician." "Magicians steal things all the time. That's perfect cover." "See? Now you're thinking." "And while I do card tricks, you load the presents into the van?" "Exactly. We're in, we're out, we're rich." "Rich off Hot Wheels and stuffed animals." "Those stuffed animals retail for thirty dollars each, Bill." "You've researched this." "I've researched everything. Pin the Tail on the Donkey starts at two. That's our window." "Why then?" "Blindfolds, Bill. Everyone's wearing blindfolds." Bill paused. "Molly, that's genuinely brilliant." "I know," she grinned. "Now grab the fake mustache."
created_at: "2026-05-29T09:59:48.997Z" (5/29/2026, 9:59:48 AM)
model: "anthropic/claude-4.8-opus-20260528"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 800
moderation_latency: (null)
generation_time: 9104
tokens_prompt: 77
tokens_completion: 361
native_tokens_prompt: 93
native_tokens_completion: 520
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "end_turn"
service_tier: "default"
usage: 0.013465
router: (null)
provider_responses:
  0:
    endpoint_id: "b0d3beaa-5ff7-4666-a749-b1bb2b14c916"
    is_byok: false
    latency: 274
    model_permaslug: "anthropic/claude-4.8-opus-20260528"
    provider_name: "Anthropic"
    status: 429
  1:
    endpoint_id: "dfc0e5bd-d703-4fe2-a7bb-655eb95d5441"
    id: "msg_01W8tJNn2iQynj8pipWSABHz"
    is_byok: false
    latency: 800
    model_permaslug: "anthropic/claude-4.8-opus-20260528"
    provider_name: "Anthropic"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.16.0; linux; x64))"
http_referer: (null)
request_id: "req-1780048789-BwAypqmRNML0icpjBl9V"
session_id: (null)
api_type: "completions"
id: "gen-1780048789-jfgXIiNjc6s4mT8Jf3Jp"
upstream_id: "msg_01W8tJNn2iQynj8pipWSABHz"
total_cost: 0.013465
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Anthropic"
response_cache_source_id: (null)
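The reported total_cost lines up with the native token counts if one assumes rates of $5 per million prompt tokens and $25 per million completion tokens. Those rates are inferred from the figures in this run and are not stated anywhere on the page, so treat the sketch below as a consistency check rather than a pricing reference:

```python
# Native token counts and cost as reported in the metadata.
native_tokens_prompt = 93
native_tokens_completion = 520
reported_total_cost = 0.013465

# Assumed rates in USD per million tokens (inferred, not stated on the page).
ASSUMED_INPUT_RATE = 5
ASSUMED_OUTPUT_RATE = 25

# Accumulate in integer "millionths of a dollar" to avoid float rounding noise.
cost_units = (native_tokens_prompt * ASSUMED_INPUT_RATE
              + native_tokens_completion * ASSUMED_OUTPUT_RATE)
estimated_cost = cost_units / 1_000_000

print(f"estimated: ${estimated_cost:.6f}, reported: ${reported_total_cost:.6f}")
```

The check uses the native_* counts rather than tokens_prompt and tokens_completion, since only the native counts reproduce the reported figure under these assumed rates.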
Evaluation details
Result: 99.92%
Evaluator: Matches word count
Details: n/a
Meta Data: neededClean: false; words: 203

Result: 93.59%
Evaluator: Dialogue to Total Word Ratio
Details: Ratio: 95.07%, Deviation: 5.07%
Meta Data: neededClean: false; wordsTotal: 203; wordsDialogue: 193

Overall: 96.7531%
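The ratio and deviation in the details follow directly from the reported word counts: 193 dialogue words out of 203 total, measured against the 90% target in the scenario. The sketch below reproduces that arithmetic and shows one plausible way to count dialogue words, by treating text inside straight double quotes as dialogue. The counting rule and the function names are assumptions for illustration; the page does not describe how the evaluator tokenizes the text, and the mapping from deviation to the 93.59% score is not shown and is not modeled here.

```python
import re

TARGET_RATIO = 0.90  # from the scenario: 90% dialogue


def dialogue_stats(text: str) -> tuple[int, int]:
    """Return (dialogue_words, total_words), counting quoted text as dialogue."""
    total_words = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    return dialogue_words, total_words


def ratio_and_deviation(dialogue_words: int, total_words: int) -> tuple[float, float]:
    """Return (ratio in %, absolute deviation from the target in percentage points)."""
    ratio = dialogue_words / total_words
    return ratio * 100, abs(ratio - TARGET_RATIO) * 100


# Counts reported in the Meta Data for this run.
ratio_pct, deviation_pct = ratio_and_deviation(193, 203)
print(f"Ratio: {ratio_pct:.2f}%, Deviation: {deviation_pct:.2f}%")
# -> Ratio: 95.07%, Deviation: 5.07%

# Counting on the opening lines of the response.
sample = '"So we\'re really doing this?" Bill asked. "Robbing a kid\'s birthday party?"'
print(dialogue_stats(sample))  # -> (10, 12)
```

Under this counting rule the only non-dialogue words in the response are the five two-word tags ("Bill asked.", "Molly hissed.", and so on), which matches the reported difference of 10 between wordsTotal and wordsDialogue.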