NC Bench
Run: 2026-03-18-01KM086D02ZTQ05PZCJ1YS806K
Run Details
Model: GPT-5.4 Nano (Reasoning, Low)
Test: Dialogue tags
Scenario: Write 200 words with 50% dialogue
Duration: 12.9s
Cost: $0.002211
Provider: OpenAI
Evaluation summary
Score      Evaluator
100.00%    Matches word count
100.00%    Dialogue to Total Word Ratio
99.9984%   Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Bill knelt by the toy store window, listening to distant laughter. “Remember,” Molly whispered, “we’re professionals, not sugar-rushers.” Bill grinned, holding a rubber duck disguised as a lock pick. “Duck first, drama later. Kids can’t file fingerprints with birthday candles, right?” Molly slid a cart of balloons aside. “They can’t, but their parents can call security.” “Relax,” Bill said, “I brought the distraction: confetti and plausible deniability.” “Confetti doesn’t talk,” Molly replied, “but you do when you’re nervous.” Bill opened the vent cover. “Nervous? I’m artisanal.” The house smelled like cake and carpet cleaner. “Okay,” Molly breathed, “our target is the present pile, not the pinata.” “I’m aiming for the big ones,” Bill said, “the ones with bows that scream ‘steal me.’” Molly checked the guest list on her phone. “Last year you took a toy microscope.” “It was educational crime,” Bill insisted. The doorbell jingled somewhere inside, then stopped. “Time to move,” Molly said. They slid in like two bad ideas with good shoes. A clown horn blared, and they counted breaths as the hallway lights flickered toward the gift table quietly tonight. “Smile,” Molly mouthed, “then grab everything with ribbon. If you hear squeals, freeze and look innocent.”
created_at
"2026-03-18T10:34:51.803Z"
3/18/2026, 10:34:51 AM
model
"openai/gpt-5.4-nano-20260317"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
12433
moderation_latency
(null)
generation_time
12694
tokens_prompt
77
tokens_completion
613
native_tokens_prompt
66
native_tokens_completion
1758
native_tokens_completion_images
0
native_tokens_reasoning
1457
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.0022107
router
(null)
provider_responses
0
endpoint_id
"0c835f2e-c18d-4e8c-b245-e1e3bd08b97f"
id
"resp_05382a625d92a0df0169ba7fcbde2c8196898a3d9108746e4d"
is_byok
false
latency
233
model_permaslug
"openai/gpt-5.4-nano-20260317"
provider_name
"OpenAI"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer
(null)
api_type
"completions"
id
"gen-1773830091-RYFfL9FNs0WBEVNN1JSX"
upstream_id
"resp_05382a625d92a0df0169ba7fcbde2c8196898a3d9108746e4d"
total_cost
0.0022107
cache_discount
(null)
upstream_inference_cost
0
provider_name
"OpenAI"
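The token counts above can be broken down to show how much of the completion was spent on reasoning rather than on the visible scene. The sketch below is illustrative only: it assumes that native_tokens_reasoning is counted inside native_tokens_completion, which the page does not state, and the variable names are hypothetical.

```python
# Values copied from the Meta fields of this run.
native_tokens_completion = 1758
native_tokens_reasoning = 1457

# Assumption (not stated on the page): reasoning tokens are a subset
# of the native completion tokens, so the remainder is visible output.
visible_tokens = native_tokens_completion - native_tokens_reasoning
reasoning_share = native_tokens_reasoning / native_tokens_completion

print(f"Visible output tokens: {visible_tokens}")
print(f"Reasoning share of completion: {reasoning_share:.1%}")
```

Under that assumption, roughly four fifths of the billed completion tokens went to reasoning, even at the "Low" reasoning setting.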
Evaluation details
Result     Evaluator                      Details                           Meta Data
100.00%    Matches word count             n/a                               neededClean: false, words: 200
100.00%    Dialogue to Total Word Ratio   Ratio: 49.25%, Deviation: 0.75%   neededClean: false, wordsTotal: 201, wordsDialogue: 99
99.9984%   Overall
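The reported ratio and deviation can be reproduced from the wordsDialogue and wordsTotal values. The following is a minimal sketch, assuming the deviation is the absolute difference between the measured ratio and the 50% target from the scenario. The function name and the target parameter are hypothetical, and the page does not show how the evaluator counts words or converts the deviation into a score.

```python
def dialogue_ratio(words_dialogue: int, words_total: int, target: float = 0.5):
    """Return (ratio, deviation) for a dialogue word count.

    Assumption: deviation is the absolute distance from the target ratio.
    """
    ratio = words_dialogue / words_total
    deviation = abs(ratio - target)
    return ratio, deviation


# Values copied from the Meta Data of the ratio evaluator.
ratio, deviation = dialogue_ratio(words_dialogue=99, words_total=201)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
```

With 99 dialogue words out of 201, this gives a ratio of 49.25% and a deviation of 0.75%, matching the Details column.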