NC Bench
Run: 2026-03-18-01KM094GARHYNFDN9ECYJZF3KT
Run Details
Model:
GPT-5.4 Mini (Reasoning, Low)
Test:
Dialogue tags
Scenario:
Write unattributed dialogue
Duration:
3.0s
Cost:
$0.001643
Provider:
OpenAI
Evaluation summary
Score:
13.53%
Evaluator:
Count dialogue tags
Model input
User/Human
You are an expert novel writer in the cozy mystery genre. Write a scene of about 200 words, with dialogue between two characters. The dialogue should be unattributed, meaning the speaker is not explicitly mentioned. The scene should be set in a small town bakery. Only return the story, nothing else.
Model response
Messages
Meta
AI
The bell over Maribel’s Bakery gave its cheerful jingle as Nora stepped inside, shaking rain from her coat. Warmth wrapped around her at once, sweet with cinnamon, butter, and the faint tang of fresh coffee. “You’re late,” said the woman behind the counter, sliding a tray of lemon bars into the display case. “I know. The town square was blocked off again.” “For what this time?” “Apparently someone found a dead raccoon in the fountain.” A small snort of laughter came from the oven. “Only in Willow Creek.” Nora lowered her voice. “Have you heard about Mr. Pritchard?” “The hardware man?” “He says his ledger was stolen.” Maribel wiped her hands on her apron and raised an eyebrow. “Stolen from where?” “From his locked office.” “That sounds dramatic.” “That’s because he’s dramatic.” Maribel reached for a bag and began filling it with still-warm scones. “Did he lose anything besides the ledger?” “Only his temper.” “Then perhaps the thief did the town a favor.” Nora smiled despite herself. “He thinks someone’s been peeking at his accounts.” “Someone always is, if he leaves them on the counter.” Nora leaned closer as the bell jingled again and a gust of wet air slipped in. “He also said the same person asked for your pecan pie recipe.” Maribel froze, then set the bag down very carefully. “Well,” she said, voice light as sugar, “that is a much more interesting problem.”
created_at
"2026-03-18T10:51:18.246Z"
3/18/2026, 10:51:18 AM
model
"openai/gpt-5.4-mini-20260317"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
2437
moderation_latency
(null)
generation_time
2792
tokens_prompt
81
tokens_completion
529
native_tokens_prompt
67
native_tokens_completion
354
native_tokens_completion_images
0
native_tokens_reasoning
33
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.00164325
router
(null)
provider_responses
0
endpoint_id
"9ee065b2-3d1c-43bc-bdd7-28af3b148282"
id
"resp_094486676ddb0a9b0169ba83a64ec881939d0ce4773fe29fe4"
is_byok
false
latency
181
model_permaslug
"openai/gpt-5.4-mini-20260317"
provider_name
"OpenAI"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer
(null)
api_type
"completions"
id
"gen-1773831078-yUDNyiUZnkzkVk5lL6bJ"
upstream_id
"resp_094486676ddb0a9b0169ba83a64ec881939d0ce4773fe29fe4"
total_cost
0.00164325
cache_discount
(null)
upstream_inference_cost
0
provider_name
"OpenAI"
Evaluation details
Result:
13.53%
Evaluator:
Count dialogue tags
Details:
n/a
Meta Data:
neededClean: false
dialogueTags:
0: "said the woman behind the counter"
1: "she said"