NC Bench
Run: 2026-03-18-01KM08SXPKC2A5KGVN9K97PN3T
Run Details
Model: GPT-5.4 Mini (Reasoning, Low)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 5.4s
Cost: $0.004221
Provider: OpenAI
Evaluation summary
Score      Evaluator
0.00%      Matches word count
99.95%     Dialogue to Total Word Ratio
49.9764%   Overall
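The page does not state how the Overall figure is aggregated. One reading that fits the numbers is an unweighted mean of the two evaluator scores; the sketch below assumes that rule, and the function name `overall_score` is hypothetical.

```python
def overall_score(scores):
    """Unweighted mean of evaluator scores in percent (assumed aggregation rule)."""
    return sum(scores) / len(scores)

# Evaluator scores as displayed on the page (rounded to two decimals).
displayed = [0.00, 99.95]
print(f"{overall_score(displayed):.4f}")  # prints 49.9750
```

The result is close to the reported 49.9764% but not identical. Under the same assumption, the reported value would imply an unrounded ratio score of about 99.9528%, which displays as 99.95%.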
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
“Tell me again why we’re robbing a children’s party,” Bill whispered, peering through the alley like the cake might arrest him. “Because, Bill,” Molly said, tightening the black glove on her left hand, “the target is low-security, high-density, and emotionally reckless.” “You said that like a banker.” “I’m a professional.” “That’s not what professionals do.” “Depends on the profession.” Bill shifted the duffel bag on his shoulder. “This bag has no moral qualifications.” “Neither do you.” “That’s harsh.” “It’s accurate.” A burst of tinny music and shrieking laughter floated from the house at the end of the street. A banner over the porch read HAPPY BIRTHDAY, LIAM, in letters so enormous they looked legally binding. Bill squinted. “Are we sure about this? They’re children.” “Of course they’re children. That’s the point.” “The point of theft?” “The point of opportunity. Parents are distracted, cake is everywhere, and every present in there is practically begging to be liberated.” Bill frowned. “I don’t like the word liberated when it’s attached to a toy tractor.” “Then call it acquiring.” “Still sounds bad.” “Bill, do you want to retire rich or morally hydrated?” “I’d like one of those to feel less criminal.” Molly leaned closer to the hedge. “Listen carefully. We go in through the back gate. You flash the clipboards.” “The clipboards?” “Yes. The fake delivery clipboards.” “You made me carry four clipboards.” “Redundancy builds confidence.” “It also builds back pain.” “Inside, you smile.” “I hate smiling.” “You’re excellent at it when you’re lying.” “That’s insulting.” “It’s training.” Bill peeked again at the house. “There’s a clown.” “Excellent.” “Why is that excellent?” “Because clowns are either a distraction or a warning. Either way, useful.” “It’s eating a sausage roll.” “Then it’s a distraction.” “I don’t think the clown is supposed to be here.” “No one is supposed to be here, Bill. That’s what makes parties fun.” Bill stared at her. 
“You say things like a villain in a children’s cartoon.” “And yet, you’re still here.” He sighed. “All right. Remind me of the plan one more time, but simpler.” “Fine. We enter. You pretend to be from the Balloon Compliance Office.” “There’s no Balloon Compliance Office.” “Exactly. No one challenges authority they don’t understand.” “And then?” “And then we mingle. You ask where the presents are stored.” “Won’t that seem suspicious?” “With your face? Only if you’re telling the truth.” “Lovely.” “I’ll create a diversion.” “What kind?” “I’ll release the confetti cannon.” Bill blinked. “You brought a confetti cannon?” “I also brought smoke bombs, a spare bib, and a battery-operated penguin.” “Why the penguin?” Molly grinned. “For chaos.” “That’s not a reason.” “It is in my line of work.” The front door opened, and a parent in a paper crown stepped out carrying a tray of mini quiches. “Can I help you two?” Bill froze. “Uh.” Molly straightened instantly into the posture of a woman who had never committed a misdemeanor in her life. “Evening. We’re from the Balloon Compliance Office.” The parent narrowed their eyes. “The what?” Bill whispered, “I hate this already.” Molly smiled wider. “We’ve had a report of unsafe helium levels.” The parent looked from Molly to Bill, then to the duffel bag, then back to Molly. “That’s not a thing.” Molly nodded solemnly. “Exactly. Which is why we’re here.” Bill muttered, “She’s good.” “Thank you,” Molly said. The parent crossed their arms. “You’re not here for the presents, are you?” Molly paused just long enough to make Bill flinch. “Of course not,” she said. “We’re here for the cake.” Bill groaned. “That is somehow worse.” “Bill,” Molly said, adjusting the clipboards, “if we’re going to be caught, let’s at least be caught with frosting.”
created_at: "2026-03-18T10:45:31.489Z" (3/18/2026, 10:45:31 AM)
model: "openai/gpt-5.4-mini-20260317"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 3120
moderation_latency: (null)
generation_time: 4732
tokens_prompt: 77
tokens_completion: 1333
native_tokens_prompt: 66
native_tokens_completion: 927
native_tokens_completion_images: 0
native_tokens_reasoning: 35
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.004221
router: (null)
provider_responses:
  0:
    endpoint_id: "9ee065b2-3d1c-43bc-bdd7-28af3b148282"
    id: "resp_0742e6a5b7dd7eac0169ba824b8bd08195b126a9c5dcf10fdd"
    is_byok: false
    latency: 189
    model_permaslug: "openai/gpt-5.4-mini-20260317"
    provider_name: "OpenAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1773830731-rVi6QMM0XORXr2AAH7xt"
upstream_id: "resp_0742e6a5b7dd7eac0169ba824b8bd08195b126a9c5dcf10fdd"
total_cost: 0.004221
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
Evaluation details
Result: 0.00%
Evaluator: Matches word count
Details: n/a
Meta Data: neededClean: false, words: 615

Result: 99.95%
Evaluator: Dialogue to Total Word Ratio
Details: Ratio: 69.90%, Deviation: 0.10%
Meta Data: neededClean: false, wordsTotal: 618, wordsDialogue: 432

Overall: 49.9764%
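The ratio and deviation shown for the dialogue evaluator can be reproduced from the reported counts (432 dialogue words out of 618 total) and the 70% target in the scenario. The sketch below also includes a hypothetical counter, `count_dialogue_words`, which treats any whitespace-separated word inside curly quotes as dialogue. That counting rule is an assumption: the page does not show how the evaluator tokenizes text, and the two evaluators report different totals (615 and 618), so their rules may differ.

```python
import re

TARGET_RATIO = 0.70  # from the scenario: 70% of words should be dialogue


def dialogue_ratio(words_dialogue, words_total):
    """Share of words that are dialogue, as a fraction between 0 and 1."""
    return words_dialogue / words_total


def count_dialogue_words(text):
    """Hypothetical counter: whitespace-separated words inside curly quotes."""
    quoted_spans = re.findall(r"“([^”]*)”", text)
    return sum(len(span.split()) for span in quoted_spans)


# Counts taken from the evaluator's Meta Data column.
ratio = dialogue_ratio(432, 618)
deviation = abs(TARGET_RATIO - ratio)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints: Ratio: 69.90%, Deviation: 0.10%
```

For example, `count_dialogue_words('“Hello there,” Bill said. “Go.”')` returns 3 under this rule, because only the quoted words are counted.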