NC Bench
Run: 2026-03-18-01KM09K117C4B8XWKRGDS95EWB
Run Details
Model: GPT-5.4 Nano (Reasoning, Low)
Test: Dialogue tags
Scenario: Write 200 words with 10% dialogue
Duration: 3.4s
Cost: $0.000542
Provider: OpenAI
Evaluation summary
Evaluator: Score
Matches word count: 0.49%
Dialogue to Total Word Ratio: 99.90%
Overall: 50.1947%
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 10% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
AI
Bill crouched behind the hydrangeas, whispering to his shoelaces like they were suspects. Molly adjusted her fake mustache, which looked less like a disguise and more like a birthday candle gone rogue. The backyard next door glowed with paper lanterns and suspiciously cheerful music. Somewhere inside, children were counting down to cake. Bill held up a clipboard. It was, inexplicably, a real school permission slip he’d stolen from a mailbox. “See?” he said, pointing to the neat signature. “We’re approved.” A security camera blinked from the porch, its red eye doing nothing but judging their posture. Molly tapped her earpiece—actually a toy walkie-talkie—and tried to look invisible, which mostly meant she leaned into the wind like a dramatic flag. The front door opened. A balloon escaped and floated upward, carrying a banner that read HAPPY PARTY! in letters large enough to qualify as evidence. Bill glanced at the presents stacked by the tree, each one wrapped in glossy chaos. “Quiet plan,” he mouthed, as if the gifts could overhear. Molly nodded. “We grab all of them. Not the toy train. It’s cursed.” Bill swallowed. “That’s exactly what the last thief said.” They slipped toward the threshold, careful to step only on the squeaky floorboards they’d already bribed with gummy bears. The music swelled, then abruptly cut out—right as the host’s voice announced, “And now, the surprise!”
created_at: "2026-03-18T10:59:14.113Z" (3/18/2026, 10:59:14 AM)
model: "openai/gpt-5.4-nano-20260317"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 2817
moderation_latency: (null)
generation_time: 3119
tokens_prompt: 77
tokens_completion: 490
native_tokens_prompt: 66
native_tokens_completion: 423
native_tokens_completion_images: 0
native_tokens_reasoning: 109
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.00054195
router: (null)
provider_responses:
  0:
    endpoint_id: "0c835f2e-c18d-4e8c-b245-e1e3bd08b97f"
    id: "resp_02a5fe03033365700169ba85822ef88197b69b0cb9f4098495"
    is_byok: false
    latency: 202
    model_permaslug: "openai/gpt-5.4-nano-20260317"
    provider_name: "OpenAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1773831554-fd0knxQ6rTsXpvci4FLT"
upstream_id: "resp_02a5fe03033365700169ba85822ef88197b69b0cb9f4098495"
total_cost: 0.00054195
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
Evaluation details
Matches word count
Result: 0.49%
Details: n/a
Meta Data: neededClean: false, words: 227
Dialogue to Total Word Ratio
Result: 99.90%
Details: Ratio: 11.79%, Deviation: 1.79%
Meta Data: neededClean: false, wordsTotal: 229, wordsDialogue: 27
Overall
Result: 50.1947%
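The ratio and deviation reported above can be reproduced from the logged meta data. The following is a minimal sketch, not the benchmark's actual evaluator: the function names are hypothetical, and it assumes that words are split on whitespace, that dialogue is any text between curly quotes (“ ”), and that the deviation is the absolute distance from the 10% target in the scenario. The conversion of deviation into the 99.90% score is not shown on this page and is not modelled here.

```python
import re

def count_words(text: str) -> int:
    """Count whitespace-separated words (assumed tokenization)."""
    return len(text.split())

def dialogue_words(text: str) -> int:
    """Count words inside curly-quoted spans (assumed dialogue rule)."""
    spans = re.findall(r"“([^”]*)”", text)
    return sum(count_words(span) for span in spans)

def ratio_report(words_dialogue: int, words_total: int, target: float = 0.10):
    """Return (dialogue ratio, absolute deviation from the target ratio)."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)

# Values taken from the Meta Data of the ratio evaluator above.
ratio, deviation = ratio_report(words_dialogue=27, words_total=229)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 11.79%, Deviation: 1.79%"

# Assumption: the overall score is the plain mean of the two evaluator scores.
overall_estimate = (0.49 + 99.90) / 2  # 50.195 from the rounded inputs
```

Counting by hand, the six curly-quoted spans in the model response contain 27 words under this rule, which matches the logged wordsDialogue. The mean of the two rounded scores is 50.195, close to the displayed 50.1947%; the small gap would be expected if the page averages unrounded values, though that is an inference rather than something the page states.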