NC Bench
Run: 2026-04-24-01KQ0PSF85GD6JVJFCKXC3SJ48
Run Details
Model:
GPT-5.5 (Reasoning)
Test:
Dialogue tags
Scenario:
Write unattributed dialogue
Duration:
11.5s
Cost:
$0.011735
Provider:
OpenAI
Evaluation summary
Score:
100.00%
Evaluator:
Count dialogue tags
Model input
User/Human
You are an expert novel writer in the cozy mystery genre. Write a scene of about 200 words, with dialogue between two characters. The dialogue should be unattributed, meaning the speaker is not explicitly mentioned. The scene should be set in a small town bakery. Only return the story, nothing else.
Model response
Messages
AI
The bell over Crumb & Clover gave its usual silver jingle, though the bakery smelled less like cinnamon rolls and more like suspicion. Rain stitched the front windows, blurring Main Street into watercolor.
“You’re early.”
“And you’re dusting flour over an already spotless counter.”
“It calms me.”
“Finding Mayor Bell’s signet ring baked into a blackberry tart would unsettle anyone.”
“It wasn’t baked in. It was tucked beneath the crust. Deliberately.”
“So whoever hid it knew your pastry schedule.”
“Half the town knows my pastry schedule. It’s on the chalkboard.”
“Did anyone come through the kitchen yesterday?”
“Deliveries, choir ladies, Mr. Pruitt complaining about too much nutmeg, and Tansy with the library raffle tickets.”
“Tansy wears gloves, even in July.”
“She has cold hands.”
“She also argued with the mayor at the book sale.”
“Everyone argued with the mayor. He wanted to turn the gazebo into parking.”
A tray rattled in the cooling rack. Outside, a truck hissed through puddles.
“There’s something else.”
“I thought so. Your lemon scones are burning.”
“They’re not. I set the timer.”
“No, you didn’t.”
The oven door flew open. Warm citrus rushed out, sharp and sweet. On the top scone, blackened at the edges, a smear of blue ink curled into a single word.
“Gazebo.”
“Looks like breakfast just became evidence.”
Meta
created_at
"2026-04-24T21:37:02.226Z"
4/24/2026, 9:37:02 PM
model
"openai/gpt-5.5-20260423"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
4855
moderation_latency
(null)
generation_time
11367
tokens_prompt
81
tokens_completion
522
native_tokens_prompt
67
native_tokens_completion
380
native_tokens_completion_images
0
native_tokens_reasoning
73
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.011735
router
(null)
provider_responses
0
endpoint_id
"58e5b336-423e-430b-a2ab-8bc353f0c51b"
id
"resp_0fbb5fd471309ad90169ebe27e55488195997f483b4590bd2c"
is_byok
false
latency
332
model_permaslug
"openai/gpt-5.5-20260423"
provider_name
"OpenAI"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1777066622-RkzixyYPzz4sKi0kmy5E"
session_id
(null)
api_type
"completions"
id
"gen-1777066622-d2POfICdlpXIHL4RtZOo"
upstream_id
"resp_0fbb5fd471309ad90169ebe27e55488195997f483b4590bd2c"
total_cost
0.011735
cache_discount
(null)
upstream_inference_cost
0
provider_name
"OpenAI"
response_cache_source_id
(null)
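The token counts and the reported cost in the metadata above can be cross-checked with a little arithmetic. The sketch below is only a consistency check: the per-million-token rates are assumptions chosen for illustration and are not stated anywhere on this page. Under those assumed rates the reported cost is reproduced without adding the 73 reasoning tokens separately, which suggests (but does not prove) that they are already counted in native_tokens_completion.

```python
# Figures copied from the run metadata.
native_tokens_prompt = 67
native_tokens_completion = 380
generation_time_ms = 11367
reported_cost_usd = 0.011735

# ASSUMED rates in USD per million tokens; hypothetical, not taken from the page.
ASSUMED_INPUT_RATE = 5
ASSUMED_OUTPUT_RATE = 30

# Work in integer micro-dollars to avoid floating-point rounding noise.
cost_micro_usd = (
    native_tokens_prompt * ASSUMED_INPUT_RATE
    + native_tokens_completion * ASSUMED_OUTPUT_RATE
)
reported_micro_usd = round(reported_cost_usd * 1_000_000)

# Output throughput over the generation window, in native tokens per second.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(f"estimated cost: ${cost_micro_usd / 1_000_000:.6f}")
print(f"matches reported cost: {cost_micro_usd == reported_micro_usd}")
print(f"throughput: {tokens_per_second:.1f} tokens/s")
```

If the assumed rates differ from the provider's actual pricing, the match is coincidental and the check should be rerun with the published rates.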
Evaluation details
Result:
100.00%
Evaluator:
Count dialogue tags
Details:
n/a
Meta Data:
neededClean
false
dialogueTags
(empty)
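The page does not show how the "Count dialogue tags" evaluator works, so the following is only a rough sketch of one way such a check could be written. The verb list, the regular expression, and the function name find_dialogue_tags are all hypothetical; the real evaluator may use a different method entirely. The idea is that an attribution usually follows a closing quotation mark, either as "she said" or as "said Tansy".

```python
import re

# Hypothetical speech-verb list; the real evaluator's vocabulary is unknown.
SPEECH_VERBS = ("said", "asked", "replied", "whispered", "muttered", "shouted", "answered")
_VERBS = "|".join(SPEECH_VERBS)

# After a closing quote, look for "<verb> <word>" or up to two words followed by a verb.
TAG_AFTER_QUOTE = re.compile(
    rf'[”"]\s+((?:{_VERBS})\s+\w+|(?:\w+\s+){{0,2}}(?:{_VERBS})\b)'
)


def find_dialogue_tags(text: str) -> list[str]:
    """Return the attribution phrases found after closing quotes (heuristic)."""
    return [m.group(1) for m in TAG_AFTER_QUOTE.finditer(text)]


unattributed = (
    "“You’re early.” “And you’re dusting flour.” "
    "A tray rattled in the cooling rack. “There’s something else.”"
)
attributed = "“You’re early,” she said. “I know,” replied Tansy."

print(find_dialogue_tags(unattributed))
print(find_dialogue_tags(attributed))
```

An empty result for a passage corresponds to the empty dialogueTags field above. A heuristic like this will miss tags placed before the quote and can misfire on narration that happens to contain a speech verb, so it should be read as an illustration rather than a faithful reimplementation.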