NC Bench
Run: 2026-04-24-01KQ03F2G2F6Z4K4TQKZ8XK4XV
Run Details
Model: GPT-OSS 120B
Test: Dialogue tags
Scenario: Write unattributed dialogue
Duration: 8.5s
Cost: $0.000088
Provider: DeepInfra
Evaluation summary
Score: 13.53%
Evaluator: Count dialogue tags
Model input
User/Human
You are an expert novel writer in the cozy mystery genre. Write a scene of about 200 words, with dialogue between two characters. The dialogue should be unattributed, meaning the speaker is not explicitly mentioned. The scene should be set in a small town bakery. Only return the story, nothing else.
Model response
Messages
AI
The scent of cinnamon rolls and fresh coffee wrapped the tiny shop like a warm blanket. A tin bell chimed as the door swung shut, and the low hum of the refrigerator blended with the soft clatter of porcelain. "Did you hear about the mayor's cat disappearing again?" the voice whispered, barely above the whir of the mixer. "The only thing that’s missing is the secret recipe for Mrs. Harrow’s famous lemon tarts," the other replied, pulling a napkin from the stack and folding it neatly. A quiet sigh drifted from the back room, where a tray of scones waited. "I found a note tucked in the flour sack—'Meet me at the old pier at midnight.'" The sound of a rolling pin tapping the countertop punctuated the words. "Someone’s trying to stir up trouble, and I’m not about to let a missing cat ruin our bake sale." A gentle laugh bubbled from the corner, where a window displayed a row of pastel cupcakes. "If you need an alibi, I can vouch for you—we’ll be here, frosting the case, not the cake." The bell jingled again as the shop door opened, a gust of cool evening wind scattering a handful of rose petals onto the flour‑dusted floor.
Meta
created_at: "2026-04-24T15:59:18.542Z" (4/24/2026, 3:59:18 PM)
model: "openai/gpt-oss-120b"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 187
moderation_latency: (null)
generation_time: 8040
tokens_prompt: 81
tokens_completion: 499
native_tokens_prompt: 115
native_tokens_completion: 438
native_tokens_completion_images: (null)
native_tokens_reasoning: 198
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.000087705
router: (null)
provider_responses:
  0:
    endpoint_id: "85835306-dba3-4334-87bc-ab9ffa5c9c62"
    id: "chatcmpl-RNG9j782u5mcdfUBS83o2Sy3"
    is_byok: false
    latency: 79
    model_permaslug: "openai/gpt-oss-120b"
    provider_name: "DeepInfra"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777046358-Lfw0XdsOiRnoyQIXQJYZ"
session_id: (null)
api_type: "completions"
id: "gen-1777046358-Dhk1VN9BETQPx3nkDZqD"
upstream_id: "chatcmpl-RNG9j782u5mcdfUBS83o2Sy3"
total_cost: 0.000087705
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "DeepInfra"
response_cache_source_id: (null)
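As a reading aid, the timing, token, and cost figures above can be combined into a throughput and a per-token cost. This is a minimal sketch, not part of the benchmark: it assumes generation_time is in milliseconds and total_cost is in US dollars (the page does not state units), and the function name derived_metrics is hypothetical.

```python
# Values copied from the run metadata above.
meta = {
    "generation_time": 8040,          # assumed to be milliseconds
    "native_tokens_prompt": 115,
    "native_tokens_completion": 438,
    "total_cost": 0.000087705,        # assumed to be US dollars
}


def derived_metrics(meta: dict) -> dict:
    """Compute completion throughput and cost per 1,000 native tokens."""
    gen_seconds = meta["generation_time"] / 1000
    total_tokens = meta["native_tokens_prompt"] + meta["native_tokens_completion"]
    return {
        "completion_tokens_per_second": meta["native_tokens_completion"] / gen_seconds,
        "cost_per_1k_native_tokens": meta["total_cost"] / total_tokens * 1000,
    }


metrics = derived_metrics(meta)
print(f"{metrics['completion_tokens_per_second']:.1f} tokens/s")
print(f"${metrics['cost_per_1k_native_tokens']:.7f} per 1k native tokens")
```

Under those unit assumptions, the run works out to roughly 54 completion tokens per second.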
Evaluation details
Result: 13.53%
Evaluator: Count dialogue tags
Details: n/a
Meta Data:
  neededClean: false
  dialogueTags:
    0: "the voice whispered"
    1: "the other replied"
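To illustrate how a "Count dialogue tags" check could flag the two phrases listed above, here is a rough sketch. It is not the evaluator's actual implementation, which the page does not show: the verb list, the regular expression, and the function name find_dialogue_tags are all assumptions, and the sample text is a shortened version of the model response. The final line only notes that the reported 13.53% matches exp(-2) for two tags; whether the benchmark scores this way is not stated in the source.

```python
import math
import re

# Hypothetical list of speech verbs; a real evaluator would likely use a larger one.
SPEECH_VERBS = ("said", "asked", "replied", "whispered", "answered", "murmured")

# A closing quote is followed by whitespace (an opening quote is followed by a letter).
# Capture the words after it, up to and including the first speech verb,
# without crossing sentence punctuation or another quote.
TAG_PATTERN = re.compile(
    r'"\s+([^".,!?]*?\b(?:' + "|".join(SPEECH_VERBS) + r")\b)"
)


def find_dialogue_tags(text: str) -> list[str]:
    """Return attribution phrases such as 'the voice whispered'."""
    return TAG_PATTERN.findall(text)


sample = (
    '"Did you hear about the mayor\'s cat disappearing again?" the voice whispered, '
    'barely above the whir of the mixer. "The only thing missing is the recipe," '
    'the other replied, pulling a napkin from the stack. "I found a note." '
    "The sound of a rolling pin punctuated the words."
)

tags = find_dialogue_tags(sample)
print(tags)

# Observation only: exp(-2) * 100 rounds to the reported score of 13.53.
print(round(math.exp(-len(tags)) * 100, 2))
```

A verb-list heuristic like this misses tags that use unlisted verbs and can miscount when narration after a quote happens to contain a listed verb, so it should be read as an illustration of the idea only.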