NC Bench
Run: 2026-05-31-01KSYXR4CSZZ3VFNG8J6H77SXY
Run Details
Model:
Claude Opus 4.8 (Reasoning, High)
Test:
Dialogue tags
Scenario:
Write unattributed dialogue
Duration:
9.3s
Cost:
$0.011210
Provider:
Anthropic
Evaluation summary
Score:
100.00%
Evaluator:
Count dialogue tags
Model input
User/Human
You are an expert novel writer in the cozy mystery genre. Write a scene of about 200 words, with dialogue between two characters. The dialogue should be unattributed, meaning the speaker is not explicitly mentioned. The scene should be set in a small town bakery. Only return the story, nothing else.
Model response
Messages
AI
The bell above the door jingled as the morning light spilled across the flour-dusted counter.
"You're here early. Couldn't sleep either?"
"How could I? After what happened to poor Mr. Pemberton last night, the whole town's buzzing."
"Buzzing is one word for it. I had three people in here before seven, all pretending they wanted croissants."
"And what did they really want?"
"To gossip, of course. Mabel swears she saw someone sneaking around behind the post office at midnight."
"Mabel can barely see past her own nose without those glasses she refuses to wear."
"True. Still, it makes you wonder. He was a careful man, wasn't he? Locked everything twice."
"Which is exactly why it doesn't add up. Careful men don't fall down their own cellar stairs."
"You don't think it was an accident?"
"I think this cinnamon roll is getting cold while you fish for my opinion."
"I'm serious. You knew him better than anyone."
"I knew his order. Black coffee, two sugars, and a sourdough loaf every Thursday."
"That's more than most people knew."
"Maybe. But if you're asking whether I think someone in this town pushed him—"
"I'm asking."
"Then pour yourself a coffee and lock the door. This could take a while."
Meta
created_at
"2026-05-31T11:47:14.983Z"
5/31/2026, 11:47:14 AM
model
"anthropic/claude-4.8-opus-20260528"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1164
moderation_latency
(null)
generation_time
9222
tokens_prompt
81
tokens_completion
330
native_tokens_prompt
107
native_tokens_completion
427
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"end_turn"
service_tier
"default"
usage
0.01121
router
(null)
provider_responses
0
endpoint_id
"b0d3beaa-5ff7-4666-a749-b1bb2b14c916"
id
"msg_014TP2TWGMn2yCjz5QxZn2WG"
is_byok
false
latency
1164
model_permaslug
"anthropic/claude-4.8-opus-20260528"
provider_name
"Anthropic"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.16.0; linux; x64))"
http_referer
(null)
request_id
"req-1780228034-90UF6JCliCVhbTOGJguY"
session_id
(null)
preset_id
(null)
api_type
"completions"
id
"gen-1780228034-i9v0LtyHSIEVZmnTYaVz"
upstream_id
"msg_014TP2TWGMn2yCjz5QxZn2WG"
total_cost
0.01121
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Anthropic"
response_cache_source_id
(null)
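The Meta fields above can be combined into two derived figures, throughput and a blended cost per token. The sketch below is only an illustration: the helper names are hypothetical, and it assumes that generation_time is reported in milliseconds and covers the whole completion. The page does not say whether that figure includes the 1164 ms latency.

```python
# Values copied from the Meta fields on this page.
generation_time_ms = 9222          # assumed to be milliseconds
native_tokens_prompt = 107
native_tokens_completion = 427
total_cost_usd = 0.01121


def tokens_per_second(tokens: int, elapsed_ms: int) -> float:
    """Average tokens produced per second over the elapsed time."""
    return tokens / (elapsed_ms / 1000)


def cost_per_thousand_tokens(cost_usd: float, tokens: int) -> float:
    """Blended cost in USD per 1,000 tokens (no prompt/completion split)."""
    return cost_usd / tokens * 1000


throughput = tokens_per_second(native_tokens_completion, generation_time_ms)
blended = cost_per_thousand_tokens(
    total_cost_usd, native_tokens_prompt + native_tokens_completion
)

print(f"{throughput:.1f} native completion tokens/s")
print(f"${blended:.4f} per 1,000 native tokens (prompt + completion)")
```

The blended figure mixes prompt and completion tokens, which providers usually price differently, so it is useful for comparing runs of the same test rather than for reconstructing a price list.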
Evaluation details
Result:
100.00%
Evaluator:
Count dialogue tags
Details:
n/a
Meta Data:
neededClean
false
dialogueTags
(empty)
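The page does not show how the "Count dialogue tags" evaluator works. As a rough illustration of the idea only, the sketch below strips quoted speech and flags any remaining narration that contains a speech verb. The function name, the verb list, and the heuristic itself are all assumptions, not the benchmark's actual implementation.

```python
import re

# Hypothetical verb list; the real evaluator's criteria are not documented here.
SPEECH_VERBS = {
    "said", "asked", "replied", "whispered",
    "muttered", "answered", "shouted",
}


def find_dialogue_tags(text: str) -> list[str]:
    """Return narration segments that look like dialogue tags.

    Heuristic (an assumption): anything outside double quotes that
    contains a speech verb is treated as an attribution.
    """
    # Normalise curly quotes so one pattern covers both styles.
    normalised = text.replace("\u201c", '"').replace("\u201d", '"')
    # Drop the quoted speech and keep the narration between quotes.
    narration = re.split(r'"[^"]*"', normalised)
    tags = []
    for segment in narration:
        words = re.findall(r"[a-z']+", segment.lower())
        if any(word in SPEECH_VERBS for word in words):
            tags.append(segment.strip())
    return tags


attributed = '"You\'re here early," said Mabel. "Couldn\'t sleep either?"'
unattributed = '"You\'re here early." "How could I?"'

print(find_dialogue_tags(attributed))
print(find_dialogue_tags(unattributed))
```

Under this heuristic, a response like the one in this run, where narration appears only in the opening sentence and contains no speech verbs, yields an empty list, which is consistent with the empty dialogueTags field reported above.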