NC Bench
Run: 2026-04-24-01KQ0N4S1BXT2XHP2NQPQN7HJ7
Run Details
Model:
GPT-5.5 (Reasoning, Low)
Test:
Dialogue tags
Scenario:
Write unattributed dialogue
Duration:
12.0s
Cost:
$0.010535
Provider:
OpenAI
Evaluation summary
Score:
100.00%
Evaluator:
Count dialogue tags
Model input
User/Human
You are an expert novel writer in the cozy mystery genre. Write a scene of about 200 words, with dialogue between two characters. The dialogue should be unattributed, meaning the speaker is not explicitly mentioned. The scene should be set in a small town bakery. Only return the story, nothing else.
Model response
Messages
Meta
AI
The bell over the bakery door gave its usual cheerful jingle, though the rain outside had turned the square the color of cold tea. Warm cinnamon and lemon sugar wrapped around the little shop, softening even the sight of the flour-dusted footprint on the black-and-white tiles.
“You’re early.”
“You left a note under my scone.”
“I left a receipt.”
“With the words ‘come before opening’ written on the back.”
“That does sound like me.”
Behind the counter, trays of currant buns cooled beside a three-tiered wedding cake, its frosting roses as perfect as porcelain. On the lowest tier, tucked beneath a sugared petal, glinted something silver.
“Is that Mrs. Pritchard’s missing brooch?”
“I was hoping you’d say it was a decorative almond.”
“In the shape of a swan, with emerald eyes?”
“Unusual almond.”
A gust rattled the windowpanes. Across the street, the clock on the town hall struck seven.
“She accused half the garden club of stealing it yesterday.”
“And then ordered twelve lemon tarts, paid in exact change, and left without her umbrella.”
“Who picked up the cake after closing?”
“No one was meant to.”
“Then someone has a key.”
“Or a sweet tooth and poor judgment.”
“Don’t touch it.”
“I wasn’t going to.”
“You’re holding a pastry fork.”
“For courage.”
created_at
"2026-04-24T21:08:15.543Z"
4/24/2026, 9:08:15 PM
model
"openai/gpt-5.5-20260423"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
6531
moderation_latency
(null)
generation_time
11830
tokens_prompt
81
tokens_completion
512
native_tokens_prompt
67
native_tokens_completion
340
native_tokens_completion_images
0
native_tokens_reasoning
49
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.010535
router
(null)
provider_responses
0
endpoint_id
"58e5b336-423e-430b-a2ab-8bc353f0c51b"
id
"resp_09eb75ff6f65d8fe0169ebdbbfa3d8819098255499612e02bc"
is_byok
false
latency
364
model_permaslug
"openai/gpt-5.5-20260423"
provider_name
"OpenAI"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1777064895-4OFeWd8wf3brWxLLuly0"
session_id
(null)
api_type
"completions"
id
"gen-1777064895-vfmPfbhjGnE9cNQfgqdx"
upstream_id
"resp_09eb75ff6f65d8fe0169ebdbbfa3d8819098255499612e02bc"
total_cost
0.010535
cache_discount
(null)
upstream_inference_cost
0
provider_name
"OpenAI"
response_cache_source_id
(null)
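The raw fields above can be combined into a couple of derived figures. The following is a minimal sketch, not something the page itself reports: it assumes `generation_time` is expressed in milliseconds and that `native_tokens_completion` counts every token the provider generated. The helper name `tokens_per_second` is hypothetical.

```python
# Values copied from the metadata listing above.
generation_time_ms = 11830        # assumption: the unit is milliseconds
native_tokens_prompt = 67
native_tokens_completion = 340


def tokens_per_second(tokens: int, elapsed_ms: float) -> float:
    """Average generation rate over the whole request."""
    return tokens / (elapsed_ms / 1000.0)


throughput = tokens_per_second(native_tokens_completion, generation_time_ms)
total_native_tokens = native_tokens_prompt + native_tokens_completion

print(f"throughput: {throughput:.2f} native completion tokens/s")
print(f"total native tokens: {total_native_tokens}")
```

Under those assumptions the run averaged roughly 28.7 native completion tokens per second across 407 native tokens in total.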
Evaluation details
Result:
100.00%
Evaluator:
Count dialogue tags
Details:
n/a
Meta Data:
neededClean
false
dialogueTags
(empty)
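The page does not show how the "Count dialogue tags" evaluator works, so the following is only an illustrative sketch of one way such a check could be written. It assumes a dialogue tag is a speech verb that follows a quoted line within at most two words (for example `she said` or `said Mary`). The verb list, the regular expressions, and the function name `count_dialogue_tags` are all hypothetical, and tags placed before a quote are not detected.

```python
import re

# Hypothetical verb list; the real evaluator's vocabulary is not shown on this page.
SPEECH_VERBS = ("said", "asked", "replied", "whispered", "muttered", "answered", "shouted")

# A quoted span, in either curly or straight double quotes.
QUOTE = re.compile(r'“[^”]*”|"[^"]*"')

# Optional subject of up to two words, then a speech verb, directly after a quote.
TAG = re.compile(r"\s*(?:\w+\s+){0,2}(?:" + "|".join(SPEECH_VERBS) + r")\b", re.IGNORECASE)


def count_dialogue_tags(text: str) -> list[str]:
    """Return the tag-like fragments found immediately after quoted lines."""
    found = []
    for quote in QUOTE.finditer(text):
        # Only look at a short window after the closing quote.
        tail = text[quote.end():quote.end() + 40]
        match = TAG.match(tail)
        if match:
            found.append(match.group().strip())
    return found


untagged = "“You’re early.” “You left a note under my scone.” “I left a receipt.”"
tagged = "“You’re early,” she said. \"I left a receipt,\" replied Mary."

print(count_dialogue_tags(untagged))
print(count_dialogue_tags(tagged))
```

A heuristic like this returns an empty list for the opening exchange of the response above, which matches the empty `dialogueTags` value reported for this run. It can misfire on narration that happens to follow a quote with a listed verb.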