NC Bench
Run: 2026-04-24-01KQ0PHSK672V8QCZCPZ4X3SS3
Run Details
Model: GPT-5.5
Test: Dialogue tags
Scenario: Write unattributed dialogue
Duration: 9.2s
Cost: $0.009185
Provider: OpenAI
Evaluation summary
Score: 100.00%
Evaluator: Count dialogue tags
Model input
User/Human
You are an expert novel writer in the cozy mystery genre. Write a scene of about 200 words, with dialogue between two characters. The dialogue should be unattributed, meaning the speaker is not explicitly mentioned. The scene should be set in a small town bakery. Only return the story, nothing else.
Model response
Messages
Meta
AI
The bell over Honeycrust Bakery’s door gave its usual timid jingle, nearly lost beneath the hiss of the espresso machine and the rain ticking against the windows. Warm cinnamon clung to the air, though the tray of apple buns on the counter sat untouched. “You’re early.” “Couldn’t sleep.” “On account of the mayor turning up dead behind the flour mill, or on account of the button I found in the lemon tart box?” A pause settled between the glass case and the cooling racks. “You shouldn’t have told anyone about that.” “I didn’t. I’m telling you.” “That’s worse.” “It’s mother-of-pearl, isn’t it? From a man’s cuff. Expensive.” “Lots of men own fancy cuffs.” “Not in Briar Glen. Not unless they want everyone to know.” A gust pushed the door open an inch, and both glanced toward the empty sidewalk, where puddles trembled under the striped awning. “The sheriff thinks it was a drifter.” “The sheriff also thinks my sourdough starter is a pet.” “Fair point.” “Yesterday, the mayor bought six lemon tarts. Said they were for the council meeting.” “There wasn’t a council meeting.” “No.” The oven timer chimed, bright and cheerful. “So who were the tarts for?” “That’s what I hoped you’d tell me.” “Why me?” “Because your missing cuff has five buttons.”
created_at: "2026-04-24T21:32:50.67Z" (4/24/2026, 9:32:50 PM)
model: "openai/gpt-5.5-20260423"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1723
moderation_latency: (null)
generation_time: 9135
tokens_prompt: 81
tokens_completion: 390
native_tokens_prompt: 67
native_tokens_completion: 295
native_tokens_completion_images: 0
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.009185
router: (null)
provider_responses:
  0:
    endpoint_id: "58e5b336-423e-430b-a2ab-8bc353f0c51b"
    id: "resp_07978472ddbff49a0169ebe1836bc8819487ca5ca95aa50746"
    is_byok: false
    latency: 638
    model_permaslug: "openai/gpt-5.5-20260423"
    provider_name: "OpenAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777066370-UHcFrHv2j9PFEqexlDJ9"
session_id: (null)
api_type: "completions"
id: "gen-1777066370-J5dkVh7tvtlhmNJCS0Xh"
upstream_id: "resp_07978472ddbff49a0169ebe1836bc8819487ca5ca95aa50746"
total_cost: 0.009185
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
response_cache_source_id: (null)
Evaluation details
Result: 100.00%
Evaluator: Count dialogue tags
Details: n/a
Meta Data:
  neededClean: false
  dialogueTags: (empty)
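The report does not show how the "Count dialogue tags" evaluator works. As a rough illustration only, the sketch below strips quoted speech from a response and looks for speech verbs in the remaining narration; an empty result would correspond to the empty dialogueTags field above. The verb list, the function name find_dialogue_tags, and the whole heuristic are assumptions made for this example, not NC Bench's actual implementation.

```python
import re

# Hypothetical verb list; the real evaluator's vocabulary is not shown in the report.
SPEECH_VERBS = ("said", "asked", "replied", "answered", "whispered", "muttered", "shouted")

# Quoted speech in either curly or straight double quotes.
QUOTED = re.compile(r'“[^”]*”|"[^"]*"')

# A speech verb appearing as a whole word.
TAG = re.compile(r"\b(?:" + "|".join(SPEECH_VERBS) + r")\b")


def find_dialogue_tags(text: str) -> list[str]:
    """Return speech verbs found in the narration, i.e. outside quoted dialogue."""
    narration = QUOTED.sub(" ", text)  # drop the dialogue, keep the narration
    return TAG.findall(narration)


unattributed = "“You’re early.” “Couldn’t sleep.” A pause settled between the racks."
attributed = "“You’re early,” she said. “Couldn’t sleep,” he replied."

print(find_dialogue_tags(unattributed))  # → []
print(find_dialogue_tags(attributed))    # → ['said', 'replied']
```

A verb-list heuristic like this misses attributions that use other verbs, and it flags speech verbs used in ordinary narration, so it should be read as a starting point rather than a faithful reproduction of the scoring.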