NC Bench
Run: 2026-03-18-01KM093KMGH2FZ0YR8QK7N36KX
Run Details
Model:
GPT-5.4 Mini (Reasoning, Low)
Test:
Dialogue tags
Scenario:
Write unattributed dialogue
Duration:
3.7s
Cost:
$0.001621
Provider:
OpenAI
Evaluation summary
Score:
1.11%
Evaluator:
Count dialogue tags
Model input
User/Human
You are an expert novel writer in the cozy mystery genre. Write a scene of about 200 words, with dialogue between two characters. The dialogue should be unattributed, meaning the speaker is not explicitly mentioned. The scene should be set in a small town bakery. Only return the story, nothing else.
Model response
Messages
Meta
AI
The bell over Marigold Bakery chimed as Lena carried in a paper sack of flour and a growing sense of dread. “Tell me that’s not the same man from the grocery store.” “It is,” said Nora, peering through the front window. “And he’s been standing by the maple display for five minutes.” “Maybe he likes maple bars.” “Maybe he likes asking questions about old keys and dead aunts.” Lena set the sack down a little too hard. “He asked you about Aunt Elsie?” “He asked if she ever kept a recipe box in the brick oven room.” Lena stared at her. “Why would he ask that?” “That’s what I was hoping you’d tell me.” The bell jingled again, and a blast of cold air followed the man inside. He was dry-shod, neat as a church bulletin, with kind eyes that made him look trustworthy in the way snakes sometimes did. “Afternoon,” he said. “I’m looking for a butter tart and a bit of information.” “We’re fresh out of information,” Nora said, sliding a tray of cinnamon rolls forward. “But the tarts are excellent.” He smiled, but his gaze drifted past the display case to the back door. “Funny thing. Your aunt’s recipe cards were supposed to be locked away.” Lena went still. Nora’s voice turned light as powdered sugar. “Supposed to be by whom?” The man reached into his coat and produced a tarnished brass key. “By whoever took the first one.”
created_at
"2026-03-18T10:50:48.863Z"
model
"openai/gpt-5.4-mini-20260317"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
3067
moderation_latency
(null)
generation_time
3413
tokens_prompt
81
tokens_completion
512
native_tokens_prompt
67
native_tokens_completion
349
native_tokens_completion_images
0
native_tokens_reasoning
28
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.00162075
router
(null)
provider_responses
0
endpoint_id
"9ee065b2-3d1c-43bc-bdd7-28af3b148282"
id
"resp_099b5a150b1330d80169ba8388ed708196bb3d1dd67aeeccfc"
is_byok
false
latency
186
model_permaslug
"openai/gpt-5.4-mini-20260317"
provider_name
"OpenAI"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer
(null)
api_type
"completions"
id
"gen-1773831048-DuPub1bnQhNerm9PpJOC"
upstream_id
"resp_099b5a150b1330d80169ba8388ed708196bb3d1dd67aeeccfc"
total_cost
0.00162075
cache_discount
(null)
upstream_inference_cost
0
provider_name
"OpenAI"
Evaluation details
Result:
1.11%
Evaluator:
Count dialogue tags
Details:
n/a
Meta Data:
neededClean
false
dialogueTags
0
"said Nora"
1
"he said"
2
"Nora said"
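The three entries under dialogueTags are attribution phrases that sit directly after a closing quotation mark. The following is a minimal sketch of how such tags could be detected. It is not the actual "Count dialogue tags" evaluator, whose implementation and scoring formula are not shown on this page. The function name find_dialogue_tags, the short verb list, and the sample sentence are all assumptions made for illustration.

```python
import re

# Hypothetical, deliberately small verb list; a real evaluator would likely use more.
SPEECH_VERBS = r"(?:said|asked|replied)"
# A speaker is either a capitalised name or a common pronoun (assumption).
SPEAKER = r"(?:[A-Z]\w*|he|she|they)"

# A tag is "<speaker> <verb>" or "<verb> <speaker>" immediately after a closing quote.
TAG_RE = re.compile(
    rf'[”"]\s*({SPEAKER}\s+{SPEECH_VERBS}\b|{SPEECH_VERBS}\s+{SPEAKER}\b)'
)

def find_dialogue_tags(text: str) -> list[str]:
    """Return every attribution phrase that follows a closing quotation mark."""
    return [match.group(1) for match in TAG_RE.finditer(text)]

# Sample built from fragments of the model response above.
sample = (
    "“It is,” said Nora, peering through the front window. "
    "“Maybe he likes maple bars.” Lena stared at her. "
    "“Afternoon,” he said. "
    "“We’re fresh out of information,” Nora said, sliding a tray forward."
)

tags = find_dialogue_tags(sample)
print(len(tags), tags)
```

On this sample the sketch reports the same three tags listed in the metadata. Action beats such as "Lena stared at her" are not counted, because the word after the speaker is not in the verb list.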