NC Bench
Run: 2026-03-18-01KM0994BG3VGZX6S43X0WQY7E
Run Details
Model: GPT-5.4 Mini (Reasoning, Low)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 5.1s
Cost: $0.004140
Provider: OpenAI
Evaluation summary
Score | Evaluator
0.00% | Matches word count
0.00% | Dialogue to Total Word Ratio
0.0000% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly flattened herself against the rain-slick alley wall and unfolded the party flyer for the fifth time, as if the picture of a grinning cartoon bear might reveal a trapdoor. Across the alley, Bill stood under a flickering streetlamp, wearing the kind of expression usually reserved for men who have just discovered a moth in their soup. “Tell me again,” he said, “why we’re robbing a children’s party.” “Not robbing,” Molly said. “Liberating. There’s a difference.” Bill stared. “Does the difference involve felony?” “It involves holiday spirit.” “It’s a birthday party.” “Exactly. The child has spirit. We’re just redistributing the presents.” Bill adjusted his gloves, which were a size too big and had tiny rubber ducks on them. “I still think a bank would be easier.” “A bank has locks, cameras, lasers, and adults with pensions,” Molly said. “This place has balloons, sugar, and people who leave handbags unattended because they’re distracted by a clown on stilts.” Bill looked back at the flyer. “It says here there’s a bouncy castle.” “Yes.” “How is that helpful?” “It isn’t. I just thought you’d enjoy knowing the security system includes an inflatable bounce surface.” He sighed with the heavy patience of a man who had once been talked into smuggling a peacock through customs. “What exactly is the plan?” Molly smiled, and when Molly smiled, it usually meant one of two things: either she had a plan, or she was about to improvise in a way that would make insurance companies weep. “We arrive as entertainers.” Bill blinked. “Entertainers?” “You’re the magician.” “I’m not a magician.” “You own a top hat.” “It’s a hat.” “And I’m the balloon animal specialist.” “You can’t make balloon animals.” “I can make one. It will be upsetting, but technically recognizable.” Bill folded his arms. “Why not just go as catering staff?” “Because caterers are watched. Entertainers are forgiven. If a clown walks into a room, people assume he belongs there. 
If a man in a black vest walks in, somebody asks whether he’s there to fix the boiler or commit a burglary.” Bill looked down at his vest. “I do look boiler-adjacent.” “Exactly.” A gust of wind sent a flyer skittering past their shoes. Molly snatched it up and tapped the listed prizes with one finger. “Look at this. Thirty-seven wrapped gifts. One giant teddy bear. A remote-controlled dinosaur.” “We’re stealing from a child.” “We’re stealing from the parents’ sense of perspective. The child just happens to be collateral.” Bill winced. “You make it sound very professional.” “It is professional. We’ll blend in, wait for the cake, then during the speeches, you distract the adults with a card trick.” “I don’t know card tricks.” “You know how to make cards disappear.” “That was one time, and it was at a wedding.” “Perfect. Practice.” “I’m not doing a heist based on my ability to accidentally lose a deck.” Molly leaned closer, lowering her voice as a stroller rolled past at the mouth of the alley. “Bill, listen. This is a clean job. No harm, no fuss. We take the presents, walk out with the wrapping paper, and by the time anyone notices, we’re three suburbs away and selling toy robots to antique dealers.” “And if the kids cry?” Molly paused. “Then we give them the balloons. We’re thieves, not monsters.” Bill considered this, then gave a reluctant nod. “Fine. But if I’m the magician, you have to be the one who says the magic words.” “Gladly.” “What are they?” She slipped the flyer into her coat and grinned toward the glow of the party house at the end of the lane, where colored lights bobbed in the windows like cheerful little warning signs. “Ladies and gentlemen,” Molly said, buttoning her coat, “prepare to witness the most generous vanishing act in neighborhood history.”
created_at: "2026-03-18T10:53:49.822Z" (3/18/2026, 10:53:49 AM)
model: "openai/gpt-5.4-mini-20260317"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 2454
moderation_latency: (null)
generation_time: 4786
tokens_prompt: 77
tokens_completion: 1221
native_tokens_prompt: 66
native_tokens_completion: 909
native_tokens_completion_images: 0
native_tokens_reasoning: 61
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.00414
router: (null)
provider_responses:
  0:
    endpoint_id: "9ee065b2-3d1c-43bc-bdd7-28af3b148282"
    id: "resp_01eda57d75fd042e0169ba843ddf808193b7601f13397dd0e1"
    is_byok: false
    latency: 174
    model_permaslug: "openai/gpt-5.4-mini-20260317"
    provider_name: "OpenAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1773831229-BKQK9YYrborEwnbK9Q0W"
upstream_id: "resp_01eda57d75fd042e0169ba843ddf808193b7601f13397dd0e1"
total_cost: 0.00414
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false; words: 634
0.00% | Dialogue to Total Word Ratio | Ratio: 61.60%, Deviation: 31.60% | neededClean: false; wordsTotal: 638; wordsDialogue: 393
0.0000% | Overall
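
The ratio and deviation reported for the dialogue evaluator can be reproduced from the wordsDialogue and wordsTotal counts. The Python sketch below is illustrative only: the function names are hypothetical, the 30% and 500-word targets are taken from the scenario text, and the curly-quote heuristic in count_dialogue_words is an assumption about how dialogue might be counted, not the benchmark's actual evaluator. How the 0.00% scores are derived from these deviations is not shown on this page, so the sketch does not attempt it.

```python
import re

TARGET_RATIO = 0.30   # from the scenario: "30% dialogue"
TARGET_WORDS = 500    # from the scenario: "Write 500 words"


def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Fraction of all words that fall inside dialogue."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return words_dialogue / words_total


def ratio_deviation(ratio: float, target: float = TARGET_RATIO) -> float:
    """Absolute distance between the observed ratio and the target ratio."""
    return abs(ratio - target)


def count_dialogue_words(text: str) -> int:
    """Assumed heuristic: count whitespace-separated words inside curly double quotes."""
    spans = re.findall(r"“([^”]*)”", text)
    return sum(len(span.split()) for span in spans)


# Counts reported in the evaluation meta data for this run.
ratio = dialogue_ratio(words_dialogue=393, words_total=638)
deviation = ratio_deviation(ratio)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# Word-count overshoot relative to the requested length (634 words reported).
overshoot = (634 - TARGET_WORDS) / TARGET_WORDS
print(f"Word count overshoot: {overshoot:.1%}")
```

With the reported counts, the first line printed matches the Details column (61.60% ratio, 31.60% deviation), and the response runs 26.8% over the requested 500 words.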