NC Bench
Run: 2026-03-06-01KK1GR3N344BXJRA6694BHVKD
Run Details
Model: GPT-5.4 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 27.1s
Cost: $0.013470
Provider: OpenAI
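The reported cost can be cross-checked against the native token counts listed in the run metadata (66 prompt tokens, 887 completion tokens). The sketch below is a minimal check, not the benchmark's own billing logic. The per-million-token rates are assumptions that do not appear on this page; they were chosen only because they reproduce the reported total. The function name `estimate_cost_usd` is hypothetical.

```python
def estimate_cost_usd(prompt_tokens, completion_tokens,
                      input_rate_per_million, output_rate_per_million):
    """Estimate request cost in USD from token counts and per-million-token rates."""
    micro_usd = (prompt_tokens * input_rate_per_million
                 + completion_tokens * output_rate_per_million)
    return micro_usd / 1_000_000


# Token counts come from the run metadata (native_tokens_prompt,
# native_tokens_completion). The rates are ASSUMED, not stated in the source.
ASSUMED_INPUT_RATE = 2.50    # USD per million prompt tokens (assumption)
ASSUMED_OUTPUT_RATE = 15.00  # USD per million completion tokens (assumption)

cost = estimate_cost_usd(66, 887, ASSUMED_INPUT_RATE, ASSUMED_OUTPUT_RATE)
print(f"{cost:.6f}")
```

Under these assumed rates the estimate agrees with the reported cost of $0.013470. Almost all of it comes from the completion tokens.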
Evaluation summary
Score | Evaluator | Details Link
0.00% | Matches word count | (details)
44.17% | Dialogue to Total Word Ratio | (details)
22.0853% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly crouched behind the hedge with the solemn posture of a general studying a battlefield, though the battlefield in question involved a bouncy castle, six shrieking seven-year-olds, and a magician whose rabbit looked as if it had seen every crime in the county and approved of none of them. From the back garden of number fourteen came a storm of party music, the smell of sausage rolls, and the bright, terrible innocence of gift bags arranged on a long trestle table beneath a banner that read HAPPY BIRTHDAY OLIVER in letters large enough to taunt a person. Bill adjusted the tiny pair of binoculars he had stolen from a museum gift shop and frowned. “I’m just saying, Mol, there’s something spiritually dangerous about robbing a table guarded by a clown.” “It’s not a clown,” Molly whispered. “It’s a magician in a waistcoat.” “A clown in management,” Bill said. Molly lowered his binoculars with two fingers. She wore black from head to ankle, which would have been dramatic if not for the paper party hat she had picked up from the pavement and forgotten to remove. “Focus. We go in as caterers, cross the patio, load the presents into the laundry basket, and leave before they sing.” “You said there’d be no singing.” “I said we’d aim to avoid singing.” In the garden, a woman in a flowered apron carried out a bowl the size of a bathtub, and the children descended on it like piranhas with juice cartons. Bill watched them with deep suspicion. “They move fast.” “They’re six.” “Exactly. No sense of self-preservation. Sticky hands. Unpredictable loyalties.” Molly produced a folded sketch from her pocket. It was a careful map of the party, complete with arrows, timings, and one note that simply read DOG? underlined twice. “The presents are stacked by the patio doors. Primary obstacle: adults making small talk. Secondary obstacle: the birthday boy. Tertiary obstacle: glitter.” Bill peered at the sketch. 
“You’ve marked the cake as ‘possible distraction.’” “Everything’s a possible distraction if handled correctly.” A cheer rose from the garden. The magician had pulled a coin from a child’s ear. Bill winced. “He’s good. I don’t like professionals.” Molly checked her watch. “At three-ten the face painter packs up. At three-twelve the parents migrate toward the prosecco. At three-thirteen, we strike.” “And if someone asks who we are?” “We’re from Happy Tummies Events.” Bill considered that. “Too believable. I prefer a name with less legal exposure.” Molly sighed. “Bill.” He tugged at the white serving jacket stretched over his shoulders. “Fine. But if this goes wrong, I’m pretending to be an uncle.” “You cannot pull off uncle. You radiate burglar.” “Unfair. I can do friendly.” She stared. Bill tried a grin. It came out like a man apologizing with his teeth. “See?” From inside the hedge came a low growl. Both thieves froze. A small pug in a sequined collar waddled out of the leaves, stared up at them, and sneezed. Molly exhaled. “There’s the dog.” Bill looked offended. “That’s not a dog. That’s a compromised potato.” The pug barked once, with bureaucratic determination. “Bribe it,” Molly said. Bill patted his pockets, produced half a Scotch egg, and offered it with the diplomacy of a hostage negotiator. The pug inhaled it and immediately fell into step beside him like a newly hired accomplice. “Oh brilliant,” Molly muttered. “Now we’ve got a witness.” The garden gate clicked open. Music swelled. A child ran past in a pirate costume, pursued by two fairies and what appeared to be a dinosaur in spectacles. Molly straightened. “Right. Heads up. Shoulders back. Remember: confidence.” Bill lifted the empty laundry basket. 
“And if confidence fails?” “Run.” They stepped through the gate smiling the brittle smiles of people entering a social function for entirely the wrong reasons, while the pug trotted after them, already licking Scotch egg from its whiskers like a tiny, corrupt detective.
created_at: "2026-03-06T12:07:53.258Z" (3/6/2026, 12:07:53 PM)
model: "openai/gpt-5.4-20260305"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 390
moderation_latency: (null)
generation_time: 27055
tokens_prompt: 77
tokens_completion: 1106
native_tokens_prompt: 66
native_tokens_completion: 887
native_tokens_completion_images: 0
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.01347
router: (null)
provider_responses:
  0:
    endpoint_id: "9ff5625c-403f-4d7f-b895-58ac7295062c"
    id: "resp_0769f4d30b44adb70169aac3994e5081959fe711d0b341d066"
    is_byok: false
    latency: 143
    model_permaslug: "openai/gpt-5.4-20260305"
    provider_name: "OpenAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1772798873-Py68JAuObD0BNVPk7x7n"
upstream_id: "resp_0769f4d30b44adb70169aac3994e5081959fe711d0b341d066"
total_cost: 0.01347
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 652
44.17% | Dialogue to Total Word Ratio | Ratio: 34.04%, Deviation: 4.04% | neededClean: false, wordsTotal: 658, wordsDialogue: 224
22.0853% | Overall
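The figures in the evaluation details can be reproduced with a little arithmetic. The sketch below is a minimal check in Python. It assumes the targets are the 500 words and 30% dialogue named in the scenario, and that the overall score is the plain mean of the two evaluator scores, which is consistent with the reported numbers. How the 44.17% score is derived from the 4.04% deviation is not stated on this page, so that step is not reproduced. All variable names are hypothetical.

```python
# Targets taken from the scenario text ("Write 500 words with 30% dialogue").
TARGET_WORDS = 500
TARGET_DIALOGUE_RATIO = 0.30

# Values reported in the evaluators' metadata.
words_counted = 652    # "Matches word count" evaluator
words_total = 658      # "Dialogue to Total Word Ratio" evaluator
words_dialogue = 224

# Dialogue share of the text and its absolute distance from the target.
ratio = words_dialogue / words_total
deviation = abs(ratio - TARGET_DIALOGUE_RATIO)

# Relative overshoot of the word-count target.
word_overshoot = (words_counted - TARGET_WORDS) / TARGET_WORDS

# ASSUMPTION: the overall score is the unweighted mean of the evaluator scores.
# The displayed scores are rounded, so the mean matches only approximately.
evaluator_scores = [0.00, 44.17]
overall = sum(evaluator_scores) / len(evaluator_scores)

print(f"ratio={ratio:.2%} deviation={deviation:.2%}")
print(f"word overshoot={word_overshoot:.1%}")
print(f"overall={overall:.3f}%")
```

The two evaluators report slightly different totals (652 versus 658 words), so each figure is computed from that evaluator's own count.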