NC Bench
Run: 2026-03-18-01KM08B2BX6ZM5V7322NJ7ZPRP
Run Details
Model: GPT-5.4 Mini (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 7.2s
Cost: $0.005242
Provider: OpenAI
Evaluation summary
Score | Evaluator
0.00% | Matches word count
0.00% | Dialogue to Total Word Ratio
0.0000% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly crouched behind the hedge across from the community hall and narrowed her eyes at the banner strung over the door. HAPPY BIRTHDAY, TILLY, it said in blue letters the size of sausages. Bill stood beside her in a coat too shiny to be trusted and a hat that looked like it had survived several bad decisions. “I’m just saying,” he whispered, “this feels morally slippery.” Molly didn’t look away from the hall. “We’re thieves, Bill. Our moral footing has been a puddle since Tuesday.” “That’s harsh.” “It’s accurate.” Inside, the children’s party glowed with paper lanterns and the kind of cheerful chaos only eight-year-olds and sugar could create. Through the window they could see a mountain of presents gathered on a table beneath a cardboard castle. Bill swallowed. “There are a lot of gifts.” “There are a lot of small humans too,” Molly said. “So we don’t linger. In, out, elegance, no footprints.” Bill stared at her. “We are not elegant people.” “We can fake it for twenty minutes.” He gave the hall a doubtful look. “And what exactly is the plan? Because last time your ‘plan’ was just ‘wear confidence and walk quickly.’” “And it worked.” “We were carrying a turkey we hadn’t paid for.” “That’s because the turkey was in the way.” Bill folded his arms. “I still think stealing from a party of children is a low point.” Molly turned to him, offended on principle. “Bill, please. It’s not stealing from children. It’s redistributing festive assets.” He blinked. “That sounds worse, somehow.” “It sounds professional.” A burst of laughter spilled out from the hall as a clown in a sequined vest twisted a balloon into something that resembled both a sword and a goose. Bill winced. “You didn’t mention there’d be a clown.” “I didn’t know there’d be a clown.” “You said you had information.” “I had a source.” “Who?” Molly hesitated. “The man who runs the sandwich van.” Bill’s face fell. “You took intelligence from a sandwich man?” “He was very convincing. 
He said, and I quote, ‘There’ll be a clown, but he’s more sad than dangerous.’” Bill watched the clown hand a balloon dog to a child who immediately popped it by hugging it too hard. “That man was right.” Molly peered at the table of gifts again, then smiled with the serene confidence of someone who had never once been prevented from making a terrible choice. “Look, the window of opportunity is perfect. The party’s in full swing, the adults are distracted, and every child in there is either occupied with cake or screaming at a balloon. We stroll in, collect the parcels, and vanish before anyone notices.” Bill squinted at her. “You say that like we’re borrowing umbrellas.” “Exactly. Only the umbrellas are covered in tape and bows.” “You are deeply committed to making crime sound cheerful.” “It helps with morale.” A small boy in a pirate hat raced past the window, shouting, “The jellyfish is on the ceiling!” Behind him, a woman in a glittery cardigan hurried after him with a tray of sausage rolls. Bill pointed. “That child just said there’s a jellyfish on the ceiling.” Molly frowned. “That’s not normal.” “Nothing about this is normal.” “Fine,” she said. “That’s why it’s a perfect distraction.” He turned to her slowly. “Molly, if there is an actual jellyfish in there, we are leaving.” “There isn’t an actual jellyfish.” “Good.” “It’s probably a hanging decoration.” Bill stared at the brightly lit hall, then at the presents, then at Molly. “You have a very specific relationship with the word ‘probably.’” Molly patted the inside pocket of her coat where a pair of gloves sat neatly folded. “Bill, I’m asking for one clean entrance, one graceful exit, and no emotional attachments to any wrapping paper.” He sighed the sigh of a man who had been outvoted by destiny. “All right. But if anyone asks, I was blackmailed.” Molly grinned. “By who?” “The clown,” Bill said. 
“He has a look.” Together they stepped from the hedge and approached the hall with the air of two people attending a very odd wedding. Inside, the music swelled, the children shouted, and the pile of presents waited under the paper castle like treasure under bad supervision. Molly nudged the door open and whispered, “Remember: confidence.” Bill adjusted his ridiculous hat and muttered, “I hate that I’m good at this.”
created_at: "2026-03-18T10:37:24.746Z" (3/18/2026, 10:37:24 AM)
model: "openai/gpt-5.4-mini-20260317"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 2212
moderation_latency: (null)
generation_time: 7027
tokens_prompt: 77
tokens_completion: 1254
native_tokens_prompt: 66
native_tokens_completion: 1154
native_tokens_completion_images: 0
native_tokens_reasoning: 176
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.0052425
router: (null)
provider_responses:
  0:
    endpoint_id: "9ee065b2-3d1c-43bc-bdd7-28af3b148282"
    id: "resp_0e23d874ec1493d40169ba8064cd308195b9ef792b2817729c"
    is_byok: false
    latency: 182
    model_permaslug: "openai/gpt-5.4-mini-20260317"
    provider_name: "OpenAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1773830244-Kr4TEuU06Km9woTH0M7s"
upstream_id: "resp_0e23d874ec1493d40169ba8064cd308195b9ef792b2817729c"
total_cost: 0.0052425
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false; words: 731
0.00% | Dialogue to Total Word Ratio | Ratio: 50.20%, Deviation: 20.20% | neededClean: false; wordsTotal: 733; wordsDialogue: 368
0.0000% | Overall | |
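The reported "Ratio: 50.20%, Deviation: 20.20%" can be reproduced from wordsDialogue (368) and wordsTotal (733) against the 30% target in the scenario. The sketch below is only an illustration: the function names are hypothetical, and the counting rule (whitespace-separated words, dialogue taken as text inside curly double quotes) is an assumption, not the evaluator's documented method. Treating the deviation as the absolute difference from the target is likewise an assumption that happens to match the reported figures.

```python
import re

TARGET_RATIO = 0.30  # from the scenario: "Write 500 words with 30% dialogue"


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (assumed counting rule)."""
    return len(text.split())


def dialogue_words(text: str) -> int:
    """Count words inside curly double quotes (assumed dialogue delimiter)."""
    spans = re.findall(r"“([^”]*)”", text)
    return sum(count_words(span) for span in spans)


def ratio_and_deviation(words_dialogue: int, words_total: int,
                        target: float = TARGET_RATIO) -> tuple[float, float]:
    """Return the dialogue ratio and its absolute distance from the target."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


# Figures taken from the evaluation details of this run.
ratio, deviation = ratio_and_deviation(368, 733)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
```

With the run's figures this prints the same ratio and deviation shown in the Details column. The two evaluators report slightly different totals (731 and 733 words), so their counting rules presumably differ in some detail that this sketch does not attempt to reproduce.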