NC Bench
Run: 2026-03-18-01KM08XE6VAM7F9AN6QAKPQ0V2
Run Details
Model: GPT-5.4 Mini (Reasoning, Low)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 5.3s
Cost: $0.004194
Provider: OpenAI
Evaluation summary
Score    Evaluator
0.00%    Matches word count
3.31%    Dialogue to Total Word Ratio
1.6540%  Overall
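The overall figure looks consistent with a plain average of the two evaluator scores, although the page does not state how it is aggregated. The sketch below checks that reading under that assumption; the function name `mean_score` is hypothetical, and the small gap to the reported 1.6540% would be explained if the displayed 3.31% is itself rounded.

```python
from statistics import mean

def mean_score(scores_percent):
    """Unweighted mean of evaluator scores, in percent (assumed aggregation)."""
    return mean(scores_percent)

# Scores as displayed in the evaluation summary.
displayed_scores = [0.00, 3.31]
reported_overall = 1.6540

overall = mean_score(displayed_scores)
# The displayed scores are rounded to two decimals, so allow a small tolerance.
matches_within_rounding = abs(overall - reported_overall) < 0.005
print(f"mean of displayed scores: {overall:.4f}%")
print(f"consistent with reported overall: {matches_within_rounding}")
```

If the benchmark weights its evaluators differently, this check would not hold in general; it only shows that an unweighted mean fits this one run.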
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly crouched behind the bakery dumpster with a clipboard balanced on one knee and the determination of a woman who had once stolen a goldfish from a church raffle and been deeply disappointed by the loot. Bill arrived three minutes late and carrying a folding ladder that looked as if it had already surrendered to life. “Tell me,” Molly said, not looking up, “why are you bringing a ladder to a children’s party?” Bill lowered it with a grunt. “Because I’m a visionary.” “You’re a man in a cardigan.” “Cardigans are the blazers of the desperate.” Molly marked a box on the clipboard. “The party starts at two. The presents will be in the conservatory, which is apparently what rich people call a room that sounds expensive and smells like raisins.” Bill peered around the corner of the alley toward the brightly decorated house. “There are balloons everywhere.” “That’s because it’s a children’s party.” “That many balloons is a sign of either joy or poor planning.” “Both, probably,” Molly said. “Now, listen carefully. We go in through the side gate, blend in, locate the gifts, and leave before anyone notices that two suspicious adults are carrying out thirty-seven wrapped parcels.” Bill frowned. “Why thirty-seven?” “Because I counted them.” “You counted the presents from the street?” “I have excellent eyesight and severe issues with trust.” Bill nodded with solemn respect. “Fair.” Molly folded the clipboard and tucked it into her coat. “We are not here for cake.” Bill sniffed the air. “Pity. I smell sponge.” “We are not here for party hats.” “I think I could look good in a cone.” “You’d look like a traffic incident.” “Cruel.” “It’s a heist. Cruelty is part of the branding.” Bill adjusted his gloves. “What’s the plan once we have the presents?” Molly gave him a look. “We steal them.” “No, I mean the exit.” “There’s always a back lane.” “And if there isn’t?” “There’s always a window.” “And if there isn’t?” Molly smiled thinly. 
“Then I start making decisions I won’t be proud of.” Bill blinked. “That’s your leadership style, is it?” “It’s my whole personality.” A burst of children’s laughter floated over the fence, followed by an adult voice crying, “No running indoors!” with the exhausted authority of someone who had already lost this battle before noon. Bill winced. “Do children ever run outdoors?” “Only when they’re plotting.” He leaned in. “What’s our cover story?” “We’re entertainers.” Bill looked down at his ladder. “Do entertainers usually carry ladders?” “Circus entertainers do. Sidewalk entertainers. Very niche.” “I’m not comfortable with niche.” “You don’t have to be comfortable. You have to look sincere.” Bill puffed out his chest. “I can do sincere.” “Your face says ‘man who owes money to a magician.’” “That’s because I do.” Molly closed her eyes briefly. “Please tell me not to ask.” “You absolutely should ask.” “I’m choosing peace.” “Good choice.” She checked her watch. “At two o’clock, the parents will be distracted by the magician and the small disaster near the trampoline.” “There’s a magician?” “Yes. Real one. Terrible reviews. Good timing.” Bill rubbed his hands together. “So while everyone’s clapping at rabbit nonsense, we slip in, gather the loot, and vanish.” Molly pointed at him. “Vanishing is key. If anyone asks, we were never here.” Bill grinned. “Excellent. I’ve always wanted to be a rumor.” From inside the house came a chorus of delighted shrieks and the unmistakable sound of a cake meeting gravity. Molly listened, then stood and straightened her coat. Bill lifted the ladder. “Ready?” “No,” Molly said. “But I’m dressed for optimism.” He nodded toward the gate. “After you, then.” Molly set off with the calm, purposeful stride of someone who had once been mistaken for a school inspector. Bill followed, carrying the ladder like a flag of surrender, and together they headed toward the party with the confidence of professionals and the grammar of disasters.
created_at: "2026-03-18T10:47:26.697Z" (3/18/2026, 10:47:26 AM)
model: "openai/gpt-5.4-mini-20260317"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1888
moderation_latency: (null)
generation_time: 5114
tokens_prompt: 77
tokens_completion: 1307
native_tokens_prompt: 66
native_tokens_completion: 921
native_tokens_completion_images: 0
native_tokens_reasoning: 37
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.004194
router: (null)
provider_responses:
  0:
    endpoint_id: "9ee065b2-3d1c-43bc-bdd7-28af3b148282"
    id: "resp_0211331a4147997d0169ba82bec4d4819687e7a7dce69a1214"
    is_byok: false
    latency: 199
    model_permaslug: "openai/gpt-5.4-mini-20260317"
    provider_name: "OpenAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1773830846-cQMEVBH38NFWRH49ONUh"
upstream_id: "resp_0211331a4147997d0169ba82bec4d4819687e7a7dce69a1214"
total_cost: 0.004194
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
Evaluation details
Result   Evaluator                     Details                          Meta Data
0.00%    Matches word count            n/a                              neededClean: false; words: 652
3.31%    Dialogue to Total Word Ratio  Ratio: 58.26%, Deviation: 8.26%  neededClean: false; wordsTotal: 654; wordsDialogue: 381
1.6540%  Overall
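The ratio and deviation reported for the dialogue evaluator can be reproduced from the word counts in its metadata. The following is a minimal sketch, assuming the ratio is dialogue words divided by total words and the deviation is the absolute distance from the 50% target in the scenario. The function names are hypothetical, and the mapping from deviation to the 3.31% score is not shown on the page, so it is not modelled here.

```python
def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Share of words that fall inside dialogue, as a fraction of 1."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return words_dialogue / words_total

def overshoot(words: int, target_words: int) -> float:
    """Relative amount by which the word count exceeds the target."""
    return (words - target_words) / target_words

# Values taken from the evaluator metadata and the scenario text.
TARGET_RATIO = 0.50   # "50% dialogue"
TARGET_WORDS = 500    # "500 words"

ratio = dialogue_ratio(words_dialogue=381, words_total=654)
deviation = abs(ratio - TARGET_RATIO)
length_overshoot = overshoot(words=652, target_words=TARGET_WORDS)

print(f"ratio: {ratio:.2%}")                          # ratio: 58.26%
print(f"deviation: {deviation:.2%}")                  # deviation: 8.26%
print(f"word count overshoot: {length_overshoot:.2%}")  # word count overshoot: 30.40%
```

The two evaluators report slightly different totals (652 and 654 words), so each calculation uses the count from its own metadata rather than a shared value.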