NC Bench
Run: 2026-03-18-01KM0ADXKHJDJ9S7YHWVJFGVYT
Run Details
Model:
GPT-5.4 Nano (Reasoning)
Test:
Dialogue tags
Scenario:
Write 200 words with 10% dialogue
Duration:
24.9s
Cost:
$0.003801
Provider:
OpenAI
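The duration and cost above can be related to the token counts listed in this run's Meta block. The following is a minimal sketch, not part of NC Bench itself. It assumes that `native_tokens_completion` (3030) already includes `native_tokens_reasoning` (2753) and that `generation_time` (24640) is in milliseconds; neither assumption is confirmed by the page. The function name `run_metrics` and the output keys are hypothetical.

```python
# Values copied from this run's Meta block.
NATIVE_TOKENS_COMPLETION = 3030
NATIVE_TOKENS_REASONING = 2753
GENERATION_TIME_MS = 24640      # assumption: unit is milliseconds
TOTAL_COST_USD = 0.0038007


def run_metrics(completion, reasoning, generation_ms, cost_usd):
    """Derive a few summary figures from the raw counters (hypothetical helper)."""
    # Assumption: reasoning tokens are a subset of completion tokens.
    visible = completion - reasoning
    return {
        "visible_tokens": visible,
        "reasoning_share": reasoning / completion,
        "tokens_per_second": completion / (generation_ms / 1000),
        "cost_per_visible_token": cost_usd / visible,
    }


metrics = run_metrics(
    NATIVE_TOKENS_COMPLETION,
    NATIVE_TOKENS_REASONING,
    GENERATION_TIME_MS,
    TOTAL_COST_USD,
)
for name, value in metrics.items():
    print(f"{name}: {value:.6g}")
```

Under those assumptions, roughly nine out of ten completion tokens in this run were spent on reasoning rather than on the 200-word scene.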
Evaluation summary
Matches word count: 100.00%
Dialogue to Total Word Ratio: 99.84%
Overall: 99.9201%
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 10% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Bill slid his gloved hand along the balloon string, feeling it tremble like a nervous beagle. Molly crouched by the doorway, her crowbar taped to a toy telescope for plausible innocence. Music thudded; laughter leaked through the wall; frosting perfumed the air with optimism and sticky evidence. She pointed at the gift table, where shiny boxes sat under a paper crown. “Bill, quiet.” Bill nodded, then whispered, “The presents only; no kids, no parents.” Molly grinned and produced a roll of contraband duct tape shaped like a lollipop. “I’ve got gloves.” “You take, I watch.” They moved when the magician turned his back, juggling scarves that smelled suspiciously like rehearsal. Bill wedged a foot under the first bow; it squealed once, then settled. Molly stuffed smaller parcels into her sack, trying not to breathe on the labels. A toddler barreled past, squealing, and the mother thanked the air for “being patient.” Their timing tightened. “Wraps, then vanish,” Molly said. “Tonight,” Bill answered, already backing toward the window where their exit plan waited, disguised as a harmless curtain. Outside, Bill’s van idled two streets away, and the radio played “Happy Birthday” in minor key, which somehow felt perfect for bad luck.
created_at
"2026-03-18T11:13:55.326Z"
3/18/2026, 11:13:55 AM
model
"openai/gpt-5.4-nano-20260317"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
24366
moderation_latency
(null)
generation_time
24640
tokens_prompt
77
tokens_completion
1162
native_tokens_prompt
66
native_tokens_completion
3030
native_tokens_completion_images
0
native_tokens_reasoning
2753
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.0038007
router
(null)
provider_responses
0
endpoint_id
"0c835f2e-c18d-4e8c-b245-e1e3bd08b97f"
id
"resp_01c96dbe2b39e2260169ba88f36b8081908c23a63373bceb7c"
is_byok
false
latency
237
model_permaslug
"openai/gpt-5.4-nano-20260317"
provider_name
"OpenAI"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer
(null)
api_type
"completions"
id
"gen-1773832435-jBXpDi1QAkT9So70jDtj"
upstream_id
"resp_01c96dbe2b39e2260169ba88f36b8081908c23a63373bceb7c"
total_cost
0.0038007
cache_discount
(null)
upstream_inference_cost
0
provider_name
"OpenAI"
Evaluation details
Matches word count
Result: 100.00%
Details: n/a
Meta Data: neededClean false; words 200
Dialogue to Total Word Ratio
Result: 99.84%
Details: Ratio: 12.00%, Deviation: 2.00%
Meta Data: neededClean false; wordsTotal 200; wordsDialogue 24
Overall
Result: 99.9201%
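The ratio and deviation reported above follow from the two counts: 24 dialogue words out of 200 total is 12.00%, which is 2.00 percentage points from the 10% target. The sketch below shows one way such counts could be obtained. It is an assumption about the approach, not the evaluator's actual code: it treats whitespace-separated tokens as words and counts every span enclosed in straight or curly double quotes as dialogue. The function name `dialogue_ratio` is hypothetical, though the output keys `wordsTotal` and `wordsDialogue` mirror the field names shown in the Meta Data.

```python
import re

# Matches text between straight or curly double quotes (assumed dialogue marker).
QUOTED_SPAN = re.compile(r'[“"]([^”"]*)[”"]')


def dialogue_ratio(text, target=0.10):
    """Count whitespace-separated words overall and inside quoted spans."""
    words_total = len(text.split())
    words_dialogue = sum(len(span.split()) for span in QUOTED_SPAN.findall(text))
    ratio = words_dialogue / words_total if words_total else 0.0
    return {
        "wordsTotal": words_total,
        "wordsDialogue": words_dialogue,
        "ratio": ratio,
        "deviation": abs(ratio - target),
    }


# Small illustrative sample, not the full model response.
sample = "Molly grinned. “I’ve got gloves.” Bill nodded and checked the window twice."
result = dialogue_ratio(sample)
print(result)

# Arithmetic for the counts reported on this page.
reported_ratio = 24 / 200
reported_deviation = abs(reported_ratio - 0.10)
print(f"{reported_ratio:.2%}, deviation {reported_deviation:.2%}")
```

A quote-based count like this also picks up quoted phrases that are not speech between the two thieves. In the response above, counting every quoted span, including “being patient.” and “Happy Birthday”, gives 24 words, which is consistent with the reported `wordsDialogue` value.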