NC Bench
Run: 2026-03-18-01KM07D5GNZQSW877WPXWCE4HZ
Run Details
Model: GPT-5.4 Mini (Reasoning, Low)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 6.0s
Cost: $0.004594
Provider: OpenAI
Evaluation summary
Score | Evaluator | Details Link
0.00% | Matches word count | (details)
0.00% | Dialogue to Total Word Ratio | (details)
0.0000% | Overall |
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly crouched behind the bakery dumpster, peering through the steam that drifted from the alley like the breath of an exhausted dragon. Across the street, the McNulty townhouse glowed with paper lanterns and the neon promise of chaos. A banner stretched over the front steps: HAPPY BIRTHDAY, ELLIE! in letters large enough to guide ships home. Bill adjusted his black knit cap, which had a tiny rhinestone skull sewn on the front. “You know, every time we do one of these jobs, I think, ‘This is the one where we become respectable citizens.’” Molly snorted. “And every time, you say that right before suggesting a crime against party supplies.” “This is not against party supplies,” Bill said. “This is an elegant redistribution of gift wealth.” “Call it what you like,” Molly said, checking the street. “We’re stealing presents from a room full of sugared-up children.” “Technically,” Bill said, “they’re not theirs yet. They’re sitting in a pile by the piano. Ownership is a philosophical debate.” Molly gave him a flat look. “You ever wonder why the universe keeps pairing me with you?” “Chemistry,” Bill said. “Also your excellent taste in masks.” She held up the plastic fox mask tucked under her arm. “It was the least ridiculous one at the novelty shop.” “And yet,” Bill said, producing a rubber chicken from his coat, “I have committed to the theme.” Molly groaned. “If you pull that out inside the house, I’m leaving you there.” “You wouldn’t.” “I would absolutely leave you to be adopted by kindergarteners.” A burst of laughter floated from the townhouse, followed by the high note of a child shrieking with delight. Molly leaned forward and studied the windows. “They’ve got the side door propped open for deliveries. Caterer’s coming and going. That’s our entry.” Bill glanced at the stack of wrapped boxes visible through the curtain. “How many presents?” “Enough to make even you pause.” “Impossible.” “You’re forgetting the turquoise bike with streamers.” Bill went still. 
“There’s a bike?” “There’s a bike.” Bill clasped his hands. “Molly, for the first time in my criminal career, I feel joy.” “Focus. We get in with the pastry trays, blend into the catering staff, and during the clown’s balloon segment, we move the gifts to the getaway van.” Bill blinked. “There’s a clown segment?” “Apparently.” “That’s too many moving parts.” “That’s why we’re wearing uniforms.” “Wait,” Bill said, frowning, “what uniforms?” Molly lifted a garment bag from the dumpster lid and unzipped it. Two immaculate white aprons and paper caps hung inside, smelling faintly of cinnamon and villainy. “Baker’s assistants. No one questions bakery people. We carry cake, they look away. It’s a social contract.” Bill stared. “You stole these from an actual bakery.” “I borrowed them permanently.” He sighed with admiration. “That’s why you’re the strategist.” “That’s why you’re the muscle.” Bill flexed his wrist. “I’m more of a persuasive presence.” “You tripped over a welcome mat last week.” “It was a hostile mat.” Molly checked her watch. “We have twelve minutes. Once the magician starts, all adults will be distracted by disposable rabbits and applause. We go in, collect the gifts, out through the kitchen, load the van, disappear.” “And the children?” Bill asked. Molly shrugged. “They’ll be busy eating frosting and opening one present at a time like the laws of nature still matter.” Bill grinned under his mask. “Marvelous. A flawless plan.” Molly looked at him. “You’re being sarcastic.” “Only because I’m nervous.” “That’s fair.” For a moment they listened to the party noises: music, laughter, the tinny thump of a song about animals doing the conga. Somewhere inside, a kid screamed with joy or terror; in a children’s party, the distinction was ornamental. Molly straightened her apron and gave the townhouse one final appraisal. “All right. On my mark, we become invisible.” Bill hefted the rubber chicken like a baton. 
“And if anyone asks?” “We’re delivering cakes.” “What if they want proof?” Molly smiled. “Then we improvise.” She stepped from the alley, Bill at her shoulder, both of them walking toward the bright, ridiculous house with the solemn determination of professionals entering a very strange battlefield.
created_at: "2026-03-18T10:21:04.928Z" (3/18/2026, 10:21:04 AM)
model: "openai/gpt-5.4-mini-20260317"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 3117
moderation_latency: (null)
generation_time: 5742
tokens_prompt: 77
tokens_completion: 1359
native_tokens_prompt: 66
native_tokens_completion: 1010
native_tokens_completion_images: 0
native_tokens_reasoning: 54
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.0045945
router: (null)
provider_responses:
  0:
    endpoint_id: "9ee065b2-3d1c-43bc-bdd7-28af3b148282"
    id: "resp_0f5b78c85b7724190169ba7c90fe8c81978f91aa46443e97c9"
    is_byok: false
    latency: 178
    model_permaslug: "openai/gpt-5.4-mini-20260317"
    provider_name: "OpenAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1773829264-KN2p5C4HZdZAlmt12XaY"
upstream_id: "resp_0f5b78c85b7724190169ba7c90fe8c81978f91aa46443e97c9"
total_cost: 0.0045945
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
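The timing and token fields in the metadata can be combined into a rough throughput figure. The sketch below assumes that generation_time is reported in milliseconds (the page does not state the unit, although 5742 is consistent with the 6.0s duration shown in the run details); the function name tokens_per_second is hypothetical and not part of NC Bench.

```python
def tokens_per_second(completion_tokens: int, generation_time_ms: int) -> float:
    """Rough throughput, assuming generation_time is in milliseconds."""
    return completion_tokens / (generation_time_ms / 1000)


# native_tokens_completion and generation_time as reported in the metadata.
rate = tokens_per_second(1010, 5742)
print(f"{rate:.1f} tokens/s")
```

If the unit assumption is wrong, the figure scales accordingly; the ratio of the two fields is the only thing taken from the page.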
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 691
0.00% | Dialogue to Total Word Ratio | Ratio: 55.78%, Deviation: 25.78% | neededClean: false, wordsTotal: 692, wordsDialogue: 386
0.0000% | Overall | |
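The reported ratio and deviation follow directly from the wordsDialogue and wordsTotal counts and the 30% target in the scenario. The sketch below reproduces that arithmetic and adds one possible way to count dialogue words. It is not the NC Bench evaluator: the function names are hypothetical, and treating everything between curly double quotes as dialogue and splitting words on whitespace are both assumptions.

```python
import re

# Assumption: dialogue is any text enclosed in curly double quotes.
DIALOGUE_RE = re.compile(r"“([^”]*)”")


def count_words(text: str) -> int:
    """Count whitespace-separated words (an assumed tokenisation)."""
    return len(text.split())


def count_dialogue_words(text: str) -> int:
    """Sum the word counts of all quoted spans in the text."""
    return sum(count_words(span) for span in DIALOGUE_RE.findall(text))


def dialogue_ratio(words_dialogue: int, words_total: int, target: float = 0.30):
    """Return (ratio, absolute deviation from the target ratio)."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


# Counts as reported in the evaluation details.
ratio, deviation = dialogue_ratio(386, 692)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 55.78%, Deviation: 25.78%"

sample = "Molly snorted. “You wouldn’t.” Bill sighed. “I would.”"
print(count_dialogue_words(sample), count_words(sample))
```

How the evaluator turns a deviation into a score is not shown on the page, so the sketch stops at the deviation.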