NC Bench
Run: 2026-03-18-01KM086XZA7GHYG2AK984QDPJT
Run Details
Model:
GPT-5.4 Mini
Test:
Dialogue tags
Scenario:
Write unattributed dialogue
Duration:
2.1s
Cost:
$0.001391
Provider:
OpenAI
Evaluation summary
Score:
13.53%
Evaluator:
Count dialogue tags
Model input
User/Human
You are an expert novel writer in the cozy mystery genre. Write a scene of about 200 words, with dialogue between two characters. The dialogue should be unattributed, meaning the speaker is not explicitly mentioned. The scene should be set in a small town bakery. Only return the story, nothing else.
Model response
Messages
Meta
AI
The bell over the bakery door gave a cheerful jingle as Clara stepped inside, bringing a gust of cold morning air with her. Warmth, sugar, and butter wrapped around her like a blanket. Behind the glass case, rows of lemon bars, cinnamon twists, and glossy berry tarts gleamed under the lights. “You’re late.” “Good morning to you too.” “I saved you the last blueberry scone.” “Then I forgive you.” A tin mixing bowl clinked somewhere in the back. Flour dusted the apron of Mrs. Bell, who stood with one hand on her hip and the other on a tray of cooling rolls. “You look like someone who has news,” she said. “I look like someone who got caught in a rainstorm on the way here.” “That too. But you’ve got that face.” “Which face?” “The one you make when you’re trying not to gossip.” Clara smiled despite herself and leaned over the case. “If I were gossiping, I’d say Mr. Finch was seen near the old mill last night.” Mrs. Bell lifted her brows. “At his age?” “Apparently so.” “Well,” she said, sliding the scone into a paper bag, “if he’s involved in anything suspicious, it had better not be near my shop. I’ve got tarts to sell and a pie contest to win.” Clara laughed. “In this town, those might be the same thing.”
created_at
"2026-03-18T10:35:09.174Z"
3/18/2026, 10:35:09 AM
model
"openai/gpt-5.4-mini-20260317"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
389
moderation_latency
(null)
generation_time
1918
tokens_prompt
81
tokens_completion
358
native_tokens_prompt
67
native_tokens_completion
298
native_tokens_completion_images
0
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.00139125
router
(null)
provider_responses
0
endpoint_id
"9ee065b2-3d1c-43bc-bdd7-28af3b148282"
id
"resp_013137624ebd40cf0169ba7fdd3b4c819580ccce52cf2fc08b"
is_byok
false
latency
186
model_permaslug
"openai/gpt-5.4-mini-20260317"
provider_name
"OpenAI"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer
(null)
api_type
"completions"
id
"gen-1773830109-sOmMlXS0LuU7G2RbgX5T"
upstream_id
"resp_013137624ebd40cf0169ba7fdd3b4c819580ccce52cf2fc08b"
total_cost
0.00139125
cache_discount
(null)
upstream_inference_cost
0
provider_name
"OpenAI"
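The metadata above reports a total cost of 0.00139125 alongside native token counts of 67 (prompt) and 298 (completion). The page does not state the per-token prices. As a rough sketch of how such a figure could arise, the following assumes hypothetical rates of $0.75 per million input tokens and $4.50 per million output tokens, chosen only because they reproduce the reported total. The function name and both rates are illustrative assumptions, not values taken from this run.

```python
# Hypothetical per-million-token rates in USD. These are NOT stated on the
# page; they are assumed here because they reproduce the reported total.
ASSUMED_INPUT_RATE_PER_M = 0.75
ASSUMED_OUTPUT_RATE_PER_M = 4.50


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the USD cost of one request from its token counts."""
    micro_dollars = (
        prompt_tokens * ASSUMED_INPUT_RATE_PER_M
        + completion_tokens * ASSUMED_OUTPUT_RATE_PER_M
    )
    return micro_dollars / 1_000_000


# Native token counts from the run metadata: 67 prompt, 298 completion.
estimated = estimate_cost(67, 298)
print(f"{estimated:.8f}")
```

The sketch uses the native token counts rather than the normalized tokens_prompt and tokens_completion values (81 and 358), since under the assumed rates only the native counts match the reported total.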
Evaluation details
Result:
13.53%
Evaluator:
Count dialogue tags
Details:
n/a
Meta Data:
neededClean
false
dialogueTags
0
"she said"
1
"she said"
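The evaluator found two dialogue tags ("she said", twice) and reported a score of 13.53%. The page does not document how tags are detected or how the score is computed. The sketch below is one possible reading: it matches a pronoun or capitalized name followed by a speech verb, and it assumes the score decays as exp(-n) for n tags, which happens to give 13.53% for n = 2. The pattern, the verb list, and the scoring formula are all assumptions for illustration, not the benchmark's actual implementation.

```python
import math
import re

# Hypothetical list of speech verbs; the evaluator's real list is unknown.
SPEECH_VERBS = ("said", "asked", "replied", "whispered", "murmured")

# Assumed tag shape: a pronoun or capitalized name followed by a speech verb.
TAG_PATTERN = re.compile(
    r"\b(?:he|she|they|[A-Z][a-z]+)\s+(?:" + "|".join(SPEECH_VERBS) + r")\b"
)


def find_dialogue_tags(text: str) -> list[str]:
    """Return every substring that looks like a dialogue tag."""
    return TAG_PATTERN.findall(text)


def score(tag_count: int) -> float:
    """Assumed scoring rule: 1.0 with no tags, decaying as exp(-n)."""
    return math.exp(-tag_count)


# Two short excerpts from the model response above.
sample = (
    "“You look like someone who has news,” she said. "
    "“Well,” she said, sliding the scone into a paper bag."
)
tags = find_dialogue_tags(sample)
print(tags, f"{score(len(tags)):.2%}")
```

A regular expression like this is easy to audit but will miss tags with unlisted verbs and can misfire on narration such as "Clara said nothing", so a production evaluator would likely need a stricter rule tied to the closing quotation mark.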