NC Bench
Run: 2026-06-01-01KT1HFH88K6PCYH5YQ6M58CQ4
Run Details
Model:
Claude Opus 4.8 (Reasoning, Low)
Test:
Dialogue tags
Scenario:
Write 200 words with 10% dialogue
Duration:
10.3s
Cost:
$0.014340
Provider:
Anthropic
Evaluation summary
Score
Evaluator
99.98%
Matches word count
98.04%
Dialogue to Total Word Ratio
99.0129%
Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 10% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly crouched behind the hydrangea bushes, her binoculars trained on the suburban fortress disguised as a birthday party. Streamers sagged from the gutters, a bouncy castle wheezed in the corner, and a clown juggled bowling pins with menacing precision. The present table groaned beneath a mountain of glittering boxes, each one wrapped with the kind of optimism only a seven-year-old could inspire. Bill adjusted his fake mustache, which kept sliding sideways in the heat, and consulted a hand-drawn map smudged with what she hoped was chocolate. "The clown's the problem," he muttered. She nodded grimly. Clowns were always the problem. According to their intelligence, the cake would be served at three o'clock sharp, providing a forty-second window of sugar-induced chaos during which every adult would be distracted by candles, singing, and the inevitable crying of the child who didn't get the corner piece. That was their moment. Molly would create a diversion involving a remote-controlled spider, Bill would commandeer a wheelbarrow disguised as a piñata cart, and together they would liberate every last gift before anyone noticed. "For the record," Bill whispered, repositioning his mustache one final time, "stealing from children feels morally questionable." Molly shrugged. Crime, after all, recognized no birthday.
created_at
"2026-06-01T12:10:33.614Z"
6/1/2026, 12:10:33 PM
model
"anthropic/claude-4.8-opus-20260528"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
3289
moderation_latency
(null)
generation_time
10255
tokens_prompt
77
tokens_completion
410
native_tokens_prompt
93
native_tokens_completion
555
native_tokens_completion_images
(null)
native_tokens_reasoning
66
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"end_turn"
service_tier
"default"
usage
0.01434
router
(null)
provider_responses
0
endpoint_id
"b0d3beaa-5ff7-4666-a749-b1bb2b14c916"
id
"msg_01TLBb98oENkJsZovoehzDBm"
is_byok
false
latency
1127
model_permaslug
"anthropic/claude-4.8-opus-20260528"
provider_name
"Anthropic"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.16.0; linux; x64))"
http_referer
(null)
request_id
"req-1780315833-em8bDSGyu4Jke3k1CtsK"
session_id
(null)
preset_id
(null)
api_type
"completions"
id
"gen-1780315833-nSQ98kzckg5EAlhfPWIw"
upstream_id
"msg_01TLBb98oENkJsZovoehzDBm"
total_cost
0.01434
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Anthropic"
response_cache_source_id
(null)
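The metadata above can be cross-checked with a little arithmetic. The sketch below derives a cost estimate and a throughput figure from the reported token counts. The per-million-token rates are assumptions that do not appear anywhere in this run; they are used only because they happen to reproduce the reported total_cost. Treating generation_time as milliseconds is also an assumption, inferred from the 10.3s duration shown in the run details.

```python
# Values copied from the run metadata above.
NATIVE_TOKENS_PROMPT = 93
NATIVE_TOKENS_COMPLETION = 555
GENERATION_TIME_MS = 10255  # ASSUMPTION: milliseconds, inferred from "Duration: 10.3s"
REPORTED_TOTAL_COST = 0.01434

# ASSUMPTION: hypothetical USD rates per million tokens; not stated in the run.
ASSUMED_INPUT_RATE = 5.00
ASSUMED_OUTPUT_RATE = 25.00


def estimate_cost(prompt_tokens, completion_tokens, input_rate, output_rate):
    """Return an estimated cost in USD given per-million-token rates."""
    return (prompt_tokens * input_rate + completion_tokens * output_rate) / 1_000_000


def tokens_per_second(completion_tokens, generation_time_ms):
    """Return completion tokens generated per second of generation time."""
    return completion_tokens / (generation_time_ms / 1000)


estimated = estimate_cost(
    NATIVE_TOKENS_PROMPT, NATIVE_TOKENS_COMPLETION,
    ASSUMED_INPUT_RATE, ASSUMED_OUTPUT_RATE,
)
throughput = tokens_per_second(NATIVE_TOKENS_COMPLETION, GENERATION_TIME_MS)

print(f"estimated cost: ${estimated:.6f} (reported: ${REPORTED_TOTAL_COST:.6f})")
print(f"throughput: {throughput:.1f} native completion tokens/s")
```

Under these assumed rates the estimate agrees with the reported total, but that agreement does not confirm the provider's actual pricing. The throughput figure uses the native completion token count, which may or may not include the 66 reasoning tokens.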
Evaluation details
Result
Evaluator
Details
Meta Data
99.98%
Matches word count
n/a
neededClean
false
words
202
98.04%
Dialogue to Total Word Ratio
Ratio: 6.25%, Deviation: 3.75%
neededClean
false
wordsTotal
208
wordsDialogue
13
99.0129%
Overall
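The ratio and deviation reported above follow directly from the word counts: 13 dialogue words out of 208 total. The sketch below reproduces that arithmetic and shows one plausible way to count dialogue words. The counting rules (whitespace-separated words, dialogue defined as text between straight double quotes) are assumptions, not the evaluator's documented method, and the two evaluators report different totals (202 and 208), so their tokenization evidently differs. The function names are hypothetical.

```python
import re

TARGET_RATIO = 0.10  # the prompt asks for 10% dialogue


def count_words(text):
    """Count whitespace-separated words (ASSUMPTION: the evaluator may differ)."""
    return len(text.split())


def dialogue_words(text):
    """Count words inside straight double quotes (ASSUMPTION: simple quoting only)."""
    return sum(count_words(quoted) for quoted in re.findall(r'"([^"]*)"', text))


# A short excerpt from the model response above.
sample = '"The clown\'s the problem," he muttered. She nodded grimly.'
print(dialogue_words(sample), "dialogue words of", count_words(sample))

# Figures reported by the evaluator for the full response.
words_dialogue = 13
words_total = 208
ratio = words_dialogue / words_total
deviation = abs(TARGET_RATIO - ratio)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# The overall score looks consistent with an unweighted mean of the two
# evaluator scores; this is an inference from the displayed (rounded) values.
scores = [99.98, 98.04]
overall_estimate = sum(scores) / len(scores)
print(f"mean of displayed scores: {overall_estimate:.2f}%")
```

The mean of the displayed scores is 99.01%, which matches the reported 99.0129% only to two decimal places, presumably because the displayed evaluator scores are themselves rounded. How each evaluator maps a deviation to a percentage score is not shown on this page, so it is not reproduced here.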