NC Bench
Run: 2026-03-18-01KM09MGKG9805V3C78TR1M8Z9
Run Details
Model: GPT-5.4 Mini (Reasoning, Low)
Test: Dialogue tags
Scenario: Write unattributed dialogue
Duration: 4.0s
Cost: $0.001711
Provider: OpenAI
Evaluation summary
Score: 60.65%
Evaluator: Count dialogue tags
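This page does not describe how the "Count dialogue tags" evaluator works. As a rough illustration of what such a check might look for, the sketch below flags an attribution clause that directly follows a closing quotation mark. The verb list, the regular expression, and the function name `find_dialogue_tags` are all hypothetical and are not taken from the benchmark; the sample sentence is the opening exchange of the model response recorded for this run.

```python
import re

# Hypothetical list of speech verbs; a real evaluator may use a different
# list or a different technique entirely.
SPEECH_VERBS = r"(?:said|asked|replied|whispered|shouted|answered)"

# A closing curly quote, optional whitespace, then a clause that starts with
# a speech verb and runs up to the next comma, period, or quotation mark.
TAG_AFTER_QUOTE = re.compile(r"”\s*(" + SPEECH_VERBS + r"\b[^,.“”]*)")


def find_dialogue_tags(text: str) -> list[str]:
    """Return attribution clauses that directly follow a closing quote."""
    return [m.group(1).strip() for m in TAG_AFTER_QUOTE.finditer(text)]


sample = (
    "“Well,” said one of the customers, peering at the pastry case, "
    "“if I die of indecision, you’ll need to put it on a peach tart.”"
)
print(find_dialogue_tags(sample))  # ['said one of the customers']
```

This sketch only handles curly quotes and a tag placed after the quoted speech; a phrase such as "came the reply from behind the counter" is not matched because "came" is not in the assumed verb list.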
Model input
User/Human
You are an expert novel writer in the cozy mystery genre. Write a scene of about 200 words, with dialogue between two characters. The dialogue should be unattributed, meaning the speaker is not explicitly mentioned. The scene should be set in a small town bakery. Only return the story, nothing else.
Model response
AI
The bell over Millie’s bakery door gave its cheerful little jingle just as the rain started tapping the front windows. Warm air rushed out, carrying cinnamon, butter, and the faintest hint of coffee. “Well,” said one of the customers, peering at the pastry case, “if I die of indecision, you’ll need to put it on a peach tart.” “I’d call that a noble end,” came the reply from behind the counter. “I’m serious. Why do your scones always look like they know a secret?” “Because they do.” The customer laughed, then leaned closer. “Speaking of secrets, Mrs. Dalloway says the old Whitcomb place is being emptied out tonight.” “Old Whitcomb place has been empty for years.” “Not empty-empty. Boxes. Men. Flashlights.” Millie paused with a tray of thumbprint cookies in her hands. “Flashlights?” “That’s what I heard.” A tiny wrinkle appeared between Millie’s brows. “From Mrs. Dalloway?” “From Mrs. Dalloway, who heard it from the postman, who heard it from someone at town hall. So it’s practically official.” Outside, thunder muttered over Main Street. Inside, the bakery hummed softly, but the air had shifted, as if the rain had brought more than weather. Millie set down the tray. “Take the peach tart.” “Thought you’d say that.” “And if you happen to pass the Whitcomb place later—” “I’ll mind my own business?” Millie smiled. “No. You’ll tell me what you see.”
Meta
created_at: "2026-03-18T11:00:02.814Z" (3/18/2026, 11:00:02 AM)
model: "openai/gpt-5.4-mini-20260317"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 3108
moderation_latency: (null)
generation_time: 3750
tokens_prompt: 81
tokens_completion: 521
native_tokens_prompt: 67
native_tokens_completion: 369
native_tokens_completion_images: 0
native_tokens_reasoning: 35
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.00171075
router: (null)
provider_responses:
  0:
    endpoint_id: "9ee065b2-3d1c-43bc-bdd7-28af3b148282"
    id: "resp_0a94532c1c2975420169ba85b2e21c8194afc924357f7f2928"
    is_byok: false
    latency: 206
    model_permaslug: "openai/gpt-5.4-mini-20260317"
    provider_name: "OpenAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1773831602-qZjR5QQVnbDPZo8fMBdm"
upstream_id: "resp_0a94532c1c2975420169ba85b2e21c8194afc924357f7f2928"
total_cost: 0.00171075
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
Evaluation details
Result: 60.65%
Evaluator: Count dialogue tags
Details: n/a
Meta Data:
  neededClean: false
  dialogueTags:
    0: "said one of the customers"