Run: 2026-03-18-01KM09KQHNAEGZFH9YC4R6X1GE
Run Details
Model: GPT-5.4 Mini
Test: Dialogue tags
Scenario: Write unattributed dialogue
Duration: 3.0s
Cost: $0.001522
Provider: OpenAI
Evaluation summary
Score: 60.65%
Evaluator: Count dialogue tags
Model input
User/Human
You are an expert novel writer in the cozy mystery genre. Write a scene of about 200 words, with dialogue between two characters. The dialogue should be unattributed, meaning the speaker is not explicitly mentioned. The scene should be set in a small town bakery. Only return the story, nothing else.
Model response
AI
The bell over the bakery door gave its cheerful jingle as sunlight spilled across the tiled floor and settled warm on the display case. Cinnamon, butter, and fresh bread wrapped around the room like a hug. “You’re late.” “The clock in my kitchen is optimistic.” “It’s ten minutes fast?” “Exactly. Very supportive of my habits.” A soft laugh drifted from behind the counter, where trays of lemon scones cooled beside a row of plum turnovers. Through the front window, Main Street shimmered in the morning heat, the little town already waking in bits and pieces: the florist dragging a bucket of dahlias inside, the postmaster pausing to adjust his cap, Mrs. Pritchard peering in as if she meant to purchase the entire bakery one bun at a time. “You smell like smoke.” “I was near the church bake sale. Someone’s pie may have met a tragic end.” “Whose pie?” “No idea. But the fire truck came.” “Is everyone all right?” “So far. Unless you count the pie.” A hand tapped the glass of the case, thoughtful, then urgent. “Tell me you didn’t see anything strange.” “Define strange.” “That’s not reassuring.” “Then no, I didn’t.” The smile that answered was small and not entirely convinced. From the back room came the thump of a falling pan, followed by a muttered curse. “Well,” said one voice, reaching for a paper bag, “if the bakery’s going to have another mystery before noon, we’ll need coffee.” “And something sweet.” “Always something sweet.”
created_at: "2026-03-18T10:59:37.156Z" (3/18/2026, 10:59:37 AM)
model: "openai/gpt-5.4-mini-20260317"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 861
moderation_latency: (null)
generation_time: 2843
tokens_prompt: 81
tokens_completion: 429
native_tokens_prompt: 67
native_tokens_completion: 327
native_tokens_completion_images: 0
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.00152175
router: (null)
provider_responses:
  0:
    endpoint_id: "9ee065b2-3d1c-43bc-bdd7-28af3b148282"
    id: "resp_02ecfbf22c3211a30169ba85993c048195ab8598ae5e76ebad"
    is_byok: false
    latency: 223
    model_permaslug: "openai/gpt-5.4-mini-20260317"
    provider_name: "OpenAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1773831577-jDq1jT3AUTR2NIYEfYaw"
upstream_id: "resp_02ecfbf22c3211a30169ba85993c048195ab8598ae5e76ebad"
total_cost: 0.00152175
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
Evaluation details
Result: 60.65%
Evaluator: Count dialogue tags
Details: n/a
Meta Data:
  neededClean: false
  dialogueTags:
    0: "said one voice"