NC Bench
Run: 2026-03-18-01KM09HJWE7NTNNNENJTK9D1M1
Run Details
Model:
GPT-5.4 Mini (Reasoning)
Test:
Dialogue tags
Scenario:
Write 500 words with 30% dialogue
Duration:
5.7s
Cost:
$0.004603
Provider:
OpenAI
Evaluation summary
Score | Evaluator
0.00% | Matches word count
0.00% | Dialogue to Total Word Ratio
0.0000% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The balloon arch at the end of Bramble Lane looked like a carnival had swallowed a rainbow and then politely vomited it onto Mrs. Pritchard’s front garden. From the cracked upstairs window of the house opposite, Molly watched the stream of parents, children, and one suspiciously muscular clown all drift toward the party like moths with sugar addictions. Bill crouched beside her, holding a pair of opera glasses he had definitely not needed and definitely insisted on bringing. “Tell me again why we’re robbing a party full of eight-year-olds,” he whispered. “Not robbing,” Molly said. “Recovering unsecured assets.” Bill squinted at her. “That’s robbing with a library card.” She ignored that and tilted her chin toward the table beneath the gazebo. Wrapped boxes glittered in towers of red, blue, silver, and one particularly aggressive paper covered in dinosaurs wearing sunglasses. “There. The presents.” Bill let out a slow breath. “There are a lot.” “A party this size always has a haul. Birthday money, fancy toys, a questionable amount of scented slime. We take the lot, we’re out before the cake gets cut, and by Monday everyone assumes the older cousin did it.” “What if there isn’t an older cousin?” “Then we improvise one.” Below, a child in a pirate hat shrieked with delight as a magician produced a rabbit from a tea kettle. Molly smiled despite herself. “Perfect distraction. The clown’s busy juggling oranges, the parents are busy pretending they like the homemade quiche, and the birthday boy is opening a remote-control fire engine. No one will miss a few presents.” Bill lowered the glasses. “You’ve thought about this a lot.” “I like to be prepared.” “You also like to be dramatic.” “Dramatic people get remembered. Invisible people get caught.” He nodded, as if this was fair, though his expression remained uneasy. “And the plan is?” “Simple. We go in as entertainment.” Bill blinked. 
“We are not entertainment.” “We are if we carry a clipboard and look disappointed.” “I don’t look disappointing.” “You look like a man whose blood type is tea.” He stood a little straighter. “That’s not even a real insult.” “It’s the insult of the future,” Molly said. “You’re my catering assistant. I’m the emergency balloon technician. We ask for access to the storage room, claim we’re setting up a surprise helium effect, and while everyone’s distracted by me pretending to know how balloons work, you wheel the presents out in the laundry cart.” Bill looked down at the battered cart leaning against the wall. “You stole this from a hotel.” “I borrowed it from a hotel’s back alley.” “You took it from a hotel’s back alley.” “Bill, focus. After that, we cross to the van, change plates, and vanish before the children start comparing Pokémon cards and noticing the gifts are gone.” He scratched his chin. “And if we’re spotted?” Molly grinned. “Then we panic beautifully.” That, at least, seemed to comfort him. He gave the party one last glance, then reached into his coat and produced two lanyards. “I made these.” She took one. Printed in bold, official-looking letters, it said HELPER. The second one said BILL. She frowned. “Why does yours just say your name?” “I ran out of time.” “You had three days.” “Then I ran out of competence.” Molly laughed, a bright little sound lost beneath the thump of party music. She slipped the lanyard over her head and tucked her hair behind one ear. Down below, the clown blew a trumpet and half the children screamed with joy. “Ready?” she asked. Bill inhaled, squared his shoulders, and nodded toward the glittering garden. “For honesty’s sake, no.” “Excellent,” Molly said. 
“Then you’re in the right line of work.” Together they climbed down the stairs and stepped into the afternoon, two professionals in borrowed confidence, headed toward a gazebo full of presents and a room full of people who had absolutely no idea their birthday was about to become extremely memorable.
created_at
"2026-03-18T10:58:26.844Z"
3/18/2026, 10:58:26 AM
model
"openai/gpt-5.4-mini-20260317"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1313
moderation_latency
(null)
generation_time
5485
tokens_prompt
77
tokens_completion
1121
native_tokens_prompt
66
native_tokens_completion
1012
native_tokens_completion_images
0
native_tokens_reasoning
150
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.0046035
router
(null)
provider_responses
0
endpoint_id
"9ee065b2-3d1c-43bc-bdd7-28af3b148282"
id
"resp_0afd3439c9cb55e30169ba8552eb488190bac53de04c84be51"
is_byok
false
latency
223
model_permaslug
"openai/gpt-5.4-mini-20260317"
provider_name
"OpenAI"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer
(null)
api_type
"completions"
id
"gen-1773831506-QfuWpJ8t8mxklWW9b3TP"
upstream_id
"resp_0afd3439c9cb55e30169ba8552eb488190bac53de04c84be51"
total_cost
0.0046035
cache_discount
(null)
upstream_inference_cost
0
provider_name
"OpenAI"
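The timing fields above can be cross-checked against the reported duration. The following is a minimal sketch that assumes `generation_time` and the provider-level `latency` are in milliseconds; the page does not state units. Under that assumption the two values sum to roughly the 5.7s shown under Run Details, which supports the reading but does not confirm it. The variable names are illustrative only.

```python
# Values copied from the run metadata above.
generation_time_ms = 5485        # generation_time (assumed milliseconds)
provider_latency_ms = 223        # latency under provider_responses (assumed milliseconds)
native_tokens_completion = 1012  # native_tokens_completion

# Total wall-clock time, if the two fields are additive (an assumption).
total_seconds = (generation_time_ms + provider_latency_ms) / 1000

# Approximate completion throughput over the generation window only.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(f"total: {total_seconds:.1f}s, throughput: {tokens_per_second:.1f} tokens/s")
```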
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false; words: 658
0.00% | Dialogue to Total Word Ratio | Ratio: 50.15%, Deviation: 20.15% | neededClean: false; wordsTotal: 662; wordsDialogue: 332
0.0000% | Overall
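The reported ratio and deviation follow directly from the word counts in the Meta Data. The sketch below reproduces that arithmetic and adds a rough dialogue counter. The counter assumes dialogue is whatever sits between curly quotes and that words are whitespace-separated; the benchmark's actual tokenization and scoring thresholds are not shown on this page, so `dialogue_ratio` and `deviation_from_target` are hypothetical helpers, not the evaluator's code. Note that the two evaluators report slightly different totals (658 and 662 words), so some counting detail clearly differs between them.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio_percent).

    Assumes dialogue is delimited by curly quotes and words are
    whitespace-separated; the real evaluator may count differently.
    """
    total = len(text.split())
    spoken = sum(len(quote.split()) for quote in re.findall(r"“([^”]*)”", text))
    ratio = 100.0 * spoken / total if total else 0.0
    return spoken, total, ratio


def deviation_from_target(ratio_percent: float, target_percent: float) -> float:
    """Absolute distance, in percentage points, from the requested ratio."""
    return abs(ratio_percent - target_percent)


# Reproduce the figures reported above from the recorded counts.
reported_ratio = 100.0 * 332 / 662                                 # wordsDialogue / wordsTotal
reported_deviation = deviation_from_target(reported_ratio, 30.0)   # scenario asks for 30%
word_count_overshoot = 100.0 * (658 - 500) / 500                   # words vs. the 500 requested

print(f"Ratio: {reported_ratio:.2f}%, Deviation: {reported_deviation:.2f}%")
print(f"Word count overshoot: {word_count_overshoot:.1f}%")
```

Both evaluators score 0.00% here: the response runs about a third longer than requested and about half of it is dialogue rather than the requested 30%.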