NC Bench
Run: 2026-03-18-01KM084RQNP90VQDJXQ3Q8YW45
Run Details
Model: GPT-5.4 Mini
Test: Dialogue tags
Scenario: Write unattributed dialogue
Duration: 2.1s
Cost: $0.001490
Provider: OpenAI
Evaluation summary
Score: 13.53%
Evaluator: Count dialogue tags
Model input
User/Human
You are an expert novel writer in the cozy mystery genre. Write a scene of about 200 words, with dialogue between two characters. The dialogue should be unattributed, meaning the speaker is not explicitly mentioned. The scene should be set in a small town bakery. Only return the story, nothing else.
Model response
AI
The bell over the bakery door gave its cheerful little jingle, and warm air wrapped around me like a blanket. Cinnamon, butter, and fresh bread drifted out from the kitchen, making my stomach growl in spite of the late hour. “Tell me that isn’t the last peach tart.” “It is, unless you can sweet-talk Mrs. Dallow into giving up hers.” “She’d sooner surrender her cat.” “Her cat has better manners.” A low laugh slipped from the back table. Mrs. Dallow, perched with her tea and a suspiciously full pastry box, lifted one eyebrow over the rim of her cup. “I heard that.” “Then you also heard I said your cat has better manners, which was a compliment.” “It was not.” The baker emerged from behind the counter with a tray of sugar-dusted scones. “If you’re both here to argue, at least argue slowly. These need five more minutes before they’re ready.” “No one comes to Maple Street Bakery to argue,” I said. The baker set the tray down and glanced toward the window, where the town square glowed softly under the streetlamps. “Oh? Then what do they come for?” Mrs. Dallow closed her pastry box with a delicate snap. “To gossip, dear. And tonight, they’ve got a fine story already.” The baker frowned. “What story?” “The one about the missing key,” I said, lowering my voice. “And the odd footprints by the back door.” For a moment, even the ovens seemed to go quiet.
Meta
created_at: "2026-03-18T10:33:58.274Z" (3/18/2026, 10:33:58 AM)
model: "openai/gpt-5.4-mini-20260317"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 385
moderation_latency: (null)
generation_time: 1962
tokens_prompt: 81
tokens_completion: 392
native_tokens_prompt: 67
native_tokens_completion: 320
native_tokens_completion_images: 0
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.00149025
router: (null)
provider_responses:
  0:
    endpoint_id: "9ee065b2-3d1c-43bc-bdd7-28af3b148282"
    id: "resp_0e96188baf818a6e0169ba7f965614819384da6a0ca600552d"
    is_byok: false
    latency: 198
    model_permaslug: "openai/gpt-5.4-mini-20260317"
    provider_name: "OpenAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1773830038-ABAGcBlgKKui1Yu3Uk0w"
upstream_id: "resp_0e96188baf818a6e0169ba7f965614819384da6a0ca600552d"
total_cost: 0.00149025
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
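The usage and total_cost values in the metadata can be sanity-checked against the native token counts. The sketch below is only an illustration: the page does not state any pricing, so the per-million-token rates passed in are hypothetical values, chosen because they happen to reproduce the reported figure, and `request_cost` is a helper name invented for this example.

```python
def request_cost(prompt_tokens: int, completion_tokens: int,
                 usd_per_m_prompt: float, usd_per_m_completion: float) -> float:
    """Cost in USD for one request, given per-million-token rates."""
    total = prompt_tokens * usd_per_m_prompt + completion_tokens * usd_per_m_completion
    return total / 1_000_000

# Token counts are native_tokens_prompt and native_tokens_completion from the metadata.
# The rates (0.75 and 4.50 USD per million tokens) are ASSUMPTIONS, not taken from the page.
cost = request_cost(67, 320, usd_per_m_prompt=0.75, usd_per_m_completion=4.50)
print(f"{cost:.8f}")  # 0.00149025
```

If the rates differ from these assumed values, the same function still applies; only the arguments change.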
Evaluation details
Result: 13.53%
Evaluator: Count dialogue tags
Details: n/a
Meta Data:
neededClean: false
dialogueTags:
  0: "I said"
  1: "I said"
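The evaluator lists the dialogue tags it found in the response. How it detects them is not shown on this page, so the following is a minimal sketch of one plausible approach, not the benchmark's implementation: it treats a tag as a closing quotation mark followed by a pronoun or capitalized name and a speech verb. The verb list, the pattern, and the name `find_dialogue_tags` are all assumptions made for illustration.

```python
import re

# Hypothetical verb list; the real evaluator's vocabulary is not shown on this page.
SPEECH_VERBS = ("said", "asked", "replied", "answered", "whispered", "shouted")

# Assumed definition of a tag: closing quote, optional whitespace,
# a pronoun or capitalized name, then a speech verb.
TAG_PATTERN = re.compile(
    r"[”\"]\s*((?:I|he|she|they|we|[A-Z][a-z]+)\s+(?:" + "|".join(SPEECH_VERBS) + r"))\b"
)

def find_dialogue_tags(text: str) -> list[str]:
    """Return every attribution phrase that directly follows a closing quote."""
    return TAG_PATTERN.findall(text)

# Excerpts from the model response above.
sample = (
    "“Then you also heard I said your cat has better manners, which was a compliment.” "
    "“No one comes to Maple Street Bakery to argue,” I said. "
    "“The one about the missing key,” I said, lowering my voice."
)
tags = find_dialogue_tags(sample)
print(len(tags), tags)  # 2 ['I said', 'I said']
```

Under this assumed definition, the "I said" inside the quoted line ("Then you also heard I said...") is not counted, because it is part of the speech and does not follow a closing quote. A heuristic like this will misfire on some inputs, for example a straight-quoted line that itself opens with "He said".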