NC Bench
Run: 2026-04-24-01KQ06E4QT7GTAGXQSKYEEXXCY
Run Details
Model: GPT-OSS 120B
Test: Dialogue tags
Scenario: Write unattributed dialogue
Duration: 13.4s
Cost: $0.000096
Provider: DeepInfra
Evaluation summary
Score: 100.00%
Evaluator: Count dialogue tags
Model input
User/Human
You are an expert novel writer in the cozy mystery genre. Write a scene of about 200 words, with dialogue between two characters. The dialogue should be unattributed, meaning the speaker is not explicitly mentioned. The scene should be set in a small town bakery. Only return the story, nothing else.
Model response
AI
The bakery hummed with the soft clatter of mixing bowls and the warm scent of crusty baguettes. Sunlight filtered through the flour‑dusted windows, casting a honeyed glow over the glossy display of pastries. A tiny bell jingled as the door opened, and a brisk wind carried in a hint of rain. “Morning, Claire. Got any of those almond croissants left?” the voice was low, the words almost tasting the sugar in the air. “Only half a dozen, and one’s already missing.” Claire wiped her hands on a flour‑spattered apron, her eyes flicking to the back room where a stack of paper receipts lay askew. “Missing? As in… stolen?” a chuckle floated from the doorway, tinged with curiosity. “Someone knocked over the box last night, and the ledger’s gone. I swear I heard a footstep on the tile.” She set a fresh croissant on the counter, its buttery layers puffing in the heat. “Could be the wind. Or maybe the old cat that sneaks in through the back door.” The other figure shuffled a notebook, eyes scanning the familiar rows of names. “Mrs. Duvall’s cat? She’d be furious if he’s still stealing her rosemary scones.” Claire smiled, but a ripple of unease brushed her thoughts. “Then let’s keep an eye on the cat. And perhaps on anyone else who loves a good croissant a little too much.” The bell rang again, announcing a new customer and a new possibility.
created_at: "2026-04-24T16:51:13.797Z" (4/24/2026, 4:51:13 PM)
model: "openai/gpt-oss-120b"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 211
moderation_latency: (null)
generation_time: 13239
tokens_prompt: 81
tokens_completion: 550
native_tokens_prompt: 115
native_tokens_completion: 484
native_tokens_completion_images: (null)
native_tokens_reasoning: 183
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.000096445
router: (null)
provider_responses:
  0:
    endpoint_id: "85835306-dba3-4334-87bc-ab9ffa5c9c62"
    id: "chatcmpl-RRtgWwAP7sx2wOFDGvH2e5oI"
    is_byok: false
    latency: 81
    model_permaslug: "openai/gpt-oss-120b"
    provider_name: "DeepInfra"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777049473-ZA92vF0YAsKjmeqOhu6O"
session_id: (null)
api_type: "completions"
id: "gen-1777049473-WRwJGmSwNbnK1XfB3vJJ"
upstream_id: "chatcmpl-RRtgWwAP7sx2wOFDGvH2e5oI"
total_cost: 0.000096445
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "DeepInfra"
response_cache_source_id: (null)
Evaluation details
Result: 100.00%
Evaluator: Count dialogue tags
Details: n/a
Meta Data:
  neededClean: false
  dialogueTags: (empty)
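The evaluator's implementation is not shown on this page, so the following is only a rough sketch of how a "count dialogue tags" check might work. It assumes a dialogue tag is a speech verb paired with a name or pronoun directly after a closing quotation mark. The verb list, the regular expression, and the function name `find_dialogue_tags` are all hypothetical and are not taken from NC Bench.

```python
import re

# Hypothetical verb list: the real evaluator's vocabulary is not shown in this run.
SPEECH_VERBS = ("said", "asked", "replied", "whispered", "shouted", "muttered", "answered")
_VERBS = "|".join(SPEECH_VERBS)

# A closing quote (curly or straight), whitespace, then either
# "<Name|pronoun> <verb>" or "<verb> <Name|pronoun>".
TAG_AFTER_QUOTE = re.compile(
    rf'[”"]\s+(?:(?:[A-Z]\w*|he|she|they|I)\s+(?:{_VERBS})'
    rf'|(?:{_VERBS})\s+(?:[A-Z]\w*|he|she|they))\b'
)


def find_dialogue_tags(text: str) -> list[str]:
    """Return the attribution phrases found right after closing quotes."""
    tags = []
    for match in TAG_AFTER_QUOTE.finditer(text):
        # Drop the leading quote character and surrounding whitespace.
        tags.append(match.group(0).lstrip('”"').strip())
    return tags


# Action beats such as "Claire wiped her hands" are not counted as tags,
# because "wiped" is not a speech verb in this sketch.
excerpt = (
    "“Only half a dozen, and one’s already missing.” Claire wiped her hands "
    "on a flour‑spattered apron. “Missing? As in… stolen?” a chuckle floated "
    "from the doorway."
)
print(find_dialogue_tags(excerpt))
print(find_dialogue_tags("“Any croissants left?” asked Claire."))
```

Under this heuristic, an empty result list corresponds to the `dialogueTags: (empty)` outcome reported above; a stricter or looser definition of "tag" would change what is counted.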