NC Bench
Run: 2026-03-18-01KM0AKF138D8HASA97KD964HY
Run Details
Model: GPT-5.4 Nano (Reasoning)
Test: Dialogue tags
Scenario: Write 200 words with 10% dialogue
Duration: 29.7s
Cost: $0.003666
Provider: OpenAI
Evaluation summary
Matches word count: 100.00%
Dialogue to Total Word Ratio: 99.99%
Overall: 99.9960%
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 10% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Moonlight pooled on the daycare fence as Molly checked the banner for loose tape. Inside, a squeaky speaker played “Party Time,” and the hallway smelled of frosting and balloons. Bill crouched by the hedge, thumb on a cheap lockpick, ears tuned for parental footsteps. Their genius plan was to be polite thieves: wait for the gift-opening circle, then harvest every present before anyone noticed the wrapping paper getting thinner. Molly said, “Quiet, Bill.” Bill answered, “Got it.” Molly asked, “Keys?” Bill replied, “Try this.” Molly breathed, “Success!” Bill said, “Grab presents.” Molly warned, “No alarms.” Bill laughed, “Run!” Molly added, “Then wrap the mess.” Bill finished, “Kid asleep, saints.” They slipped through the gate, crossed the carpeted maze of gift bags, and filled their canvas sacks with brightly boxed loot. Somewhere, a clown practiced a high note, but it sounded like background music for crime. Molly counted, Bill grinned, and together they moved with the careful, absurd confidence of people who definitely shouldn’t know how to open childproof locks. They paused by the stage, listening to cheers swell, then slid a tower of presents into the shadows. Bill joked softly with his eyebrows as Molly bagged the last ribboned box.
created_at
"2026-03-18T11:16:57.007Z"
3/18/2026, 11:16:57 AM
model
"openai/gpt-5.4-nano-20260317"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
29016
moderation_latency
(null)
generation_time
29335
tokens_prompt
77
tokens_completion
2030
native_tokens_prompt
66
native_tokens_completion
2922
native_tokens_completion_images
0
native_tokens_reasoning
2632
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.0036657
router
(null)
provider_responses
0
endpoint_id
"0c835f2e-c18d-4e8c-b245-e1e3bd08b97f"
id
"resp_0199315e81d85ee80169ba89a910108193b68165baff503cfe"
is_byok
false
latency
172
model_permaslug
"openai/gpt-5.4-nano-20260317"
provider_name
"OpenAI"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer
(null)
api_type
"completions"
id
"gen-1773832617-Nqg38UPEyG01k10DjsCj"
upstream_id
"resp_0199315e81d85ee80169ba89a910108193b68165baff503cfe"
total_cost
0.0036657
cache_discount
(null)
upstream_inference_cost
0
provider_name
"OpenAI"
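The token fields above allow a quick breakdown of how much of the completion was hidden reasoning rather than visible prose. The sketch below is only arithmetic on the reported numbers; it assumes that native_tokens_reasoning is counted inside native_tokens_completion, which the metadata itself does not state.

```python
# Values copied from the run metadata above.
native_tokens_completion = 2922
native_tokens_reasoning = 2632

# Assumption: reasoning tokens are a subset of the completion tokens.
visible_tokens = native_tokens_completion - native_tokens_reasoning
reasoning_share = native_tokens_reasoning / native_tokens_completion

print(f"visible output tokens: {visible_tokens}")   # 290
print(f"reasoning share: {reasoning_share:.1%}")    # 90.1%
```

Under that assumption, roughly nine in ten completion tokens went to reasoning for a 200-word scene.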
Evaluation details
Evaluator: Matches word count
Result: 100.00%
Details: n/a
Meta Data: neededClean = false, words = 200

Evaluator: Dialogue to Total Word Ratio
Result: 99.99%
Details: Ratio: 10.95%, Deviation: 0.95%
Meta Data: neededClean = false, wordsTotal = 201, wordsDialogue = 22

Overall: 99.9960%
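The ratio and deviation reported by the dialogue evaluator follow directly from its word counts: 22 / 201 is about 10.95%, which is 0.95 percentage points above the 10% target. The sketch below shows one plausible way to obtain such counts, by treating everything between double quotes as dialogue and splitting on whitespace. The function name dialogue_ratio and its tokenization are assumptions for illustration; the page does not show the evaluator's actual implementation or how the deviation is mapped to the 99.99% score.

```python
import re

def dialogue_ratio(text: str, target: float = 0.10):
    """Return (dialogue_words, total_words, ratio, deviation) for a scene.

    Hypothetical helper: dialogue is any span between double quotes
    (curly or straight); words are whitespace-separated tokens.
    """
    quoted = re.findall(r'[“"]([^”"]*)[”"]', text)
    dialogue_words = sum(len(span.split()) for span in quoted)
    total_words = len(text.split())
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio, abs(ratio - target)

sample = 'Molly said, “Quiet, Bill.” Bill answered, “Got it.” Then they crept on.'
print(dialogue_ratio(sample))

# The counts reported in the meta data reproduce the displayed figures.
ratio = 22 / 201
print(f"Ratio: {ratio:.2%}, Deviation: {abs(ratio - 0.10):.2%}")
```

Counting quoted spans this way in the model response gives 20 words across the ten spoken lines plus 2 for the quoted song title “Party Time,” which matches the reported wordsDialogue of 22. This suggests, but does not prove, that the evaluator counts all quoted text rather than only speech.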