NC Bench
Run: 2026-05-29-01KSSJJHRWBC2GAVMX1XBFDVCW
Run Details
Model:
Claude Opus 4.8 (Reasoning)
Test:
Dialogue tags
Scenario:
Write unattributed dialogue
Duration:
8.5s
Cost:
$0.011360
Provider:
Anthropic
Evaluation summary
Score:
100.00%
Evaluator:
Count dialogue tags
Model input
User/Human
You are an expert novel writer in the cozy mystery genre. Write a scene of about 200 words, with dialogue between two characters. The dialogue should be unattributed, meaning the speaker is not explicitly mentioned. The scene should be set in a small town bakery. Only return the story, nothing else.
Model response
Messages
Meta
AI
The bell above the door chimed as the morning light spilled across the flour-dusted counter. "You're here early. Couldn't sleep either?" "How could I? After what happened to poor Mr. Abernathy at the church social, the whole town's gone sideways." "Keep your voice down. The walls in this place are thinner than my pie crust." "I'm just saying, it's strange. He'd been coming here every Tuesday for twenty years. Same order. Two cinnamon rolls and a black coffee." "And yesterday?" "That's the thing. Yesterday he asked for the lemon tart. Said someone recommended it." "Who would recommend the lemon tart? I only made three of them, and they were special order for the Whitcombs." "Exactly. So how did Mr. Abernathy end up eating one?" A long silence settled between them, broken only by the soft tick of the oven timer. "You don't think—" "I don't know what to think. But I do know the sheriff was asking about my recipes this morning. Wanted to know if I use almond extract." "You don't, do you?" "Never have. But somebody's lemon tart did. And Mr. Abernathy was allergic to almonds. Everyone in this town knew that." "Everyone," came the quiet reply, "except whoever made that tart."
created_at
"2026-05-29T09:55:45.575Z"
5/29/2026, 9:55:45 AM
model
"anthropic/claude-4.8-opus-20260528"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
997
moderation_latency
(null)
generation_time
8485
tokens_prompt
81
tokens_completion
322
native_tokens_prompt
107
native_tokens_completion
433
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"end_turn"
service_tier
"default"
usage
0.01136
router
(null)
provider_responses
0
endpoint_id
"b0d3beaa-5ff7-4666-a749-b1bb2b14c916"
id
"msg_01NzC2pBpchECpwYpQzxCquv"
is_byok
false
latency
997
model_permaslug
"anthropic/claude-4.8-opus-20260528"
provider_name
"Anthropic"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.16.0; linux; x64))"
http_referer
(null)
request_id
"req-1780048545-m1RMwB2iWU5dlVKtVgUu"
session_id
(null)
api_type
"completions"
id
"gen-1780048545-QNFWHd4sLK5sfhWxb1Wj"
upstream_id
"msg_01NzC2pBpchECpwYpQzxCquv"
total_cost
0.01136
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Anthropic"
response_cache_source_id
(null)
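The token counts, generation time, and cost in the fields above can be cross-checked with a little arithmetic. The sketch below is only a sanity check: the per-million-token rates are assumptions chosen for illustration and are not stated anywhere on this page, although they do reproduce the reported total_cost.

```python
# Figures copied from the Meta fields above.
native_tokens_prompt = 107
native_tokens_completion = 433
generation_time_ms = 8485
reported_cost_usd = 0.01136

# ASSUMPTION: rates in USD per million native tokens. Not stated on this page.
ASSUMED_INPUT_USD_PER_MTOK = 5
ASSUMED_OUTPUT_USD_PER_MTOK = 25

# tokens * (USD per million tokens) gives micro-dollars; integers avoid float drift.
cost_micro_usd = (
    native_tokens_prompt * ASSUMED_INPUT_USD_PER_MTOK
    + native_tokens_completion * ASSUMED_OUTPUT_USD_PER_MTOK
)
estimated_cost_usd = cost_micro_usd / 1_000_000

# Completion throughput over the reported generation time.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(f"estimated cost: ${estimated_cost_usd:.6f}")
print(f"throughput: {tokens_per_second:.1f} tokens/s")
```

Under these assumed rates the estimate matches the reported cost, and the 8485 ms generation_time agrees with the 8.5s Duration shown in Run Details.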
Evaluation details
Result:
100.00%
Evaluator:
Count dialogue tags
Details:
n/a
Meta Data:
neededClean
false
dialogueTags
(empty)
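The page does not show how the "Count dialogue tags" evaluator works. As a rough illustration of the idea only, a minimal sketch might look for a closing quotation mark followed shortly by a speech verb. The verb list, the regular expression, and the function name `find_dialogue_tags` are all hypothetical and are not taken from NC Bench.

```python
import re

# ASSUMPTION: a small, hypothetical list of speech verbs.
# The real evaluator's rules are not documented on this page.
SPEECH_VERBS = ("said", "asked", "replied", "whispered", "shouted", "muttered", "answered")
_VERBS = "|".join(SPEECH_VERBS)

# A closing quote, whitespace, up to two words, then a speech verb,
# e.g. `" said Martha` or `" the baker asked`.
TAG_AFTER_QUOTE = re.compile(
    rf'["\u201d]\s+(?:\w+\s+){{0,2}}(?:{_VERBS})\b',
    re.IGNORECASE,
)


def find_dialogue_tags(text: str) -> list[str]:
    """Return the matched tag fragments; an empty list means none were found."""
    return [m.group(0) for m in TAG_AFTER_QUOTE.finditer(text)]


attributed = '"Hello," said Martha. "Lovely day."'
unattributed = '"You\'re here early." "Couldn\'t sleep."'
print(find_dialogue_tags(attributed))
print(find_dialogue_tags(unattributed))
```

This heuristic only catches tags that follow a quotation; tags placed before the quote (for example `Martha said, "Hello."`) would need a second pattern. It also does not flag a beat such as "came the quiet reply", which is consistent with the empty dialogueTags result above but says nothing definite about the actual evaluator.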