NC Bench
Run: 2026-04-24-01KQ079TWVCJC6CZ8XCRBDQZTY
Run Details
Model:
GPT-OSS 120B
Test:
Dialogue tags
Scenario:
Write unattributed dialogue
Duration:
24.0s
Cost:
$0.000209
Provider:
Phala
Evaluation summary
Score:
60.65%
Evaluator:
Count dialogue tags
Model input
User/Human
You are an expert novel writer in the cozy mystery genre. Write a scene of about 200 words, with dialogue between two characters. The dialogue should be unattributed, meaning the speaker is not explicitly mentioned. The scene should be set in a small town bakery. Only return the story, nothing else.
Model response
Messages
Meta
AI
Steam rose from the fresh croissants as the bell above the door jingled. A warm scent of cinnamon swirled around the cracked tiles. “Did you hear about the missing heirloom?” a voice whispered, barely louder than the hum of the mixer. The other chuckled, wiping flour from the apron. “Only because Mrs. Whitaker bragged it was safe in her pantry. She never locks anything.” A kettle whistled, drawing a brief pause. “And yet the police found the jewelry in the backroom of the thrift shop, right under the floorboard.” “Exactly. Someone must have slipped it out while we were busy with the morning rush.” The cursor of a pen traced a quick note on a napkin. “I’m thinking the delivery boy—he’s always lingering near the back door.” A soft clatter of plates punctuated the thought. “He’s got an alibi, though. He was there when the rain started, and the shop’s CCTV shows him loading boxes.” The baker smiled, a crumb dusting her cheek. “Alibis are easy to forge. What about the missing batch of lemon tarts? They vanished the night before the gala.” A sudden creak echoed from the storage closet. “Maybe the answer has been hiding in plain sight, right under the counter all along.”
created_at
"2026-04-24T17:06:21.221Z"
model
"openai/gpt-oss-120b"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1143
moderation_latency
(null)
generation_time
23968
tokens_prompt
81
tokens_completion
462
native_tokens_prompt
130
native_tokens_completion
399
native_tokens_completion_images
(null)
native_tokens_reasoning
138
native_tokens_cached
64
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.00020851
router
(null)
provider_responses
0
endpoint_id
"909fdbc8-1b02-46ea-a9fc-cc3cfe65d992"
id
"chatcmpl-9d54ee5c8e8b2190"
is_byok
false
latency
1081
model_permaslug
"openai/gpt-oss-120b"
provider_name
"Phala"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1777050381-PTzf0cWYgKBXMEalsWvw"
session_id
(null)
api_type
"completions"
id
"gen-1777050381-aTR1JhrZcJEutEUH5Fdh"
upstream_id
"chatcmpl-9d54ee5c8e8b2190"
total_cost
0.00020851
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Phala"
response_cache_source_id
(null)
Evaluation details
Result:
60.65%
Evaluator:
Count dialogue tags
Details:
n/a
Meta Data:
neededClean
false
dialogueTags
0: "a voice whispered"
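As a rough illustration of what a "Count dialogue tags" check might look for, the sketch below flags a short attribution phrase that directly follows a closing quotation mark, such as the "a voice whispered" reported above. It is not NC Bench's evaluator: the verb list, the regular expression, and the function name find_dialogue_tags are assumptions made for this example, it only handles curly closing quotes, and it does not attempt to reproduce the 60.65% score.

```python
import re

# Hypothetical list of speech verbs; the real evaluator's rules are not shown in this run.
SPEECH_VERBS = ("said", "asked", "whispered", "replied", "murmured", "shouted")

# A closing curly quote, optional whitespace, then up to three words
# followed by one of the speech verbs (for example: ” a voice whispered).
TAG_PATTERN = re.compile(
    r"”\s*((?:[A-Za-z.]+\s+){0,3}(?:" + "|".join(SPEECH_VERBS) + r"))\b"
)


def find_dialogue_tags(text):
    """Return the attribution phrases found right after a closing quote."""
    return [match.group(1) for match in TAG_PATTERN.finditer(text)]


sample = (
    "“Did you hear about the missing heirloom?” a voice whispered, "
    "barely louder than the hum of the mixer. The other chuckled, "
    "wiping flour from the apron. “Only because Mrs. Whitaker bragged "
    "it was safe in her pantry.”"
)
tags = find_dialogue_tags(sample)
print(len(tags), tags)
```

A pattern like this misses tags that precede the quote or use verbs outside the list, so it should be read as a starting point for inspecting a response rather than as a scoring rule.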