NC Bench
Run: 2026-03-18-01KM0AM1PRMTD2MRR3X8HVX0AP
Run Details
Model: GPT-5.4 Mini
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 5.3s
Cost: $0.004158
Provider: OpenAI
Evaluation summary
Score | Evaluator
0.00% | Matches word count
9.09% | Dialogue to Total Word Ratio
4.5447% | Overall
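The Overall figure appears to be the unweighted mean of the two evaluator scores. That is an inference from the numbers on this page, not something the page states. A minimal sketch, where `overall_score` is a hypothetical helper:

```python
from statistics import mean

def overall_score(scores: list[float]) -> float:
    """Unweighted mean of evaluator scores in percent (assumed aggregation)."""
    return mean(scores)

# Scores as displayed on the page, already rounded to two decimals.
displayed = [0.00, 9.09]
print(f"{overall_score(displayed):.4f}%")
```

With the displayed scores this gives 4.5450%, slightly above the reported 4.5447%. The gap is consistent with the ratio score being about 9.0894% before it was rounded for display.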
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
“Bill,” Molly said, flattening the party flyer against the dashboard, “tell me this is a normal amount of balloons.” Bill squinted through the windshield at the suburban house ahead, where helium balloons bobbed in the evening air like deranged sherbet. “If by normal you mean an amount that looks capable of lifting the house into orbit, then yes.” Molly adjusted her black gloves. “Good. I was worried we’d been invited to a circus by mistake.” “We were not invited,” Bill said. “That’s the beauty of it.” “Bill, the beauty of it is supposed to be the presents.” He nodded toward the front window, where a paper banner announced HAPPY BIRTHDAY, TIMMY! in glittering letters. “Forty-three gifts, according to the reconnaissance.” “Forty-three?” Molly whispered. “How does a child acquire forty-three gifts?” “By being three, popular, and supported by a broad coalition of aunts.” Molly peered at the house. “This is a children’s party. There will be cake, tiny chairs, and at least one clown with the eyes of a man who’s seen too much.” “Not a clown,” Bill said. “A magician.” “A magician is just a clown with a union card.” Bill opened the glove compartment and took out two aprons, one pink and one blue, each printed with cartoon cupcakes. “We go in as hired help.” Molly stared. “You are proposing burglary by apron?” “Disguise, Molly. Subtlety. We blend in with the caterers.” “There is no caterer. This is a suburban backyard, not a royal wedding.” “Exactly,” he said. “No one questions a person carrying a tray.” She took the pink apron and held it up by two fingers. “This says ‘Birthday Boss.’” “It’s festive.” “It’s humiliating.” “It’s not like anyone will remember us.” Molly looked at him. “That’s the spirit. Say it louder, maybe the fingerprints will feel less nervous.” Bill glanced at his watch. “We have twelve minutes until the singing starts.” “Perfect. 
We slip in during the chaos, load the gifts into the delivery van, and vanish before the children finish smearing icing on the curtains.” “I love it when you talk operationally,” Bill said. “Makes crime sound like a professional service.” Molly slipped the aprons over her coat. “Because it is a professional service. We are not animals.” He leaned back, considering. “What if the presents are tagged?” “Then we untie the tags.” “What if they’re too big?” “We take the big ones first.” “What if there’s one of those enormous ride-on plastic trucks?” Molly smiled thinly. “Then I’m driving it.” Bill blinked. “You cannot drive a toy truck.” “You’ve never seen me after coffee.” A burst of singing rose from the backyard, enthusiastic and off-key. Molly and Bill ducked lower as a cluster of children stamped past the side gate carrying tiny cups. A small boy with frosting on his nose pointed at their van. “Delivery people!” he shouted. Bill froze. “Are we delivery people?” Molly straightened her apron. “Apparently, today we are.” The boy ran on, distracted by a balloon animal. Molly exhaled. “See? Invisible.” Bill grinned. “Children are terrible witnesses.” “They’re honest witnesses, though.” “True,” he said, opening the van door. “All right. On my signal, we enter, we charm, we confiscate.” Molly picked up an empty hamper. “And if anyone asks what we’re doing?” Bill handed her a roll of wrapping paper. “We say we’re here to collect the gifts for safe keeping.” She stared at him. He shrugged. “What? It sounds official.” Molly took one look at the glittering house, the swarming children, and the mountain of presents waiting like an unguarded treasure hoard. Then she smiled with the calm delight of a woman about to commit a very specific, very profitable sin. “Bill,” she said, “for once in your life, that might actually work.” He tipped an imaginary hat. 
“Then let’s rob a birthday.” Together they crossed the lawn, two counterfeit helpers in cupcake aprons, marching toward the music, the frosting, and the loot.
created_at: "2026-03-18T11:17:16.133Z" (3/18/2026, 11:17:16 AM)
model: "openai/gpt-5.4-mini-20260317"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 405
moderation_latency: (null)
generation_time: 5122
tokens_prompt: 77
tokens_completion: 1153
native_tokens_prompt: 66
native_tokens_completion: 913
native_tokens_completion_images: 0
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.004158
router: (null)
provider_responses:
  0:
    endpoint_id: "9ee065b2-3d1c-43bc-bdd7-28af3b148282"
    id: "resp_0fc3c2d7d80a7c6b0169ba89bc2e8c8190ac85bc42c720d80b"
    is_byok: false
    latency: 180
    model_permaslug: "openai/gpt-5.4-mini-20260317"
    provider_name: "OpenAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1773832636-cTSfwvVvxDGXhdXmAsBP"
upstream_id: "resp_0fc3c2d7d80a7c6b0169ba89bc2e8c8190ac85bc42c720d80b"
total_cost: 0.004158
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
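The timing and token fields above allow a quick throughput check. The sketch below assumes that `generation_time` and `latency` are in milliseconds and that the second `latency` value (180, listed under `provider_responses`) is the provider-side latency. Neither unit is stated on the page, and `tokens_per_second` is a hypothetical helper.

```python
def tokens_per_second(completion_tokens: int, generation_time_ms: int) -> float:
    """Completion tokens produced per second of generation time."""
    if generation_time_ms <= 0:
        raise ValueError("generation_time_ms must be positive")
    return completion_tokens / (generation_time_ms / 1000)

# Values copied from the metadata above.
native_tokens_completion = 913
generation_time_ms = 5122    # assumed to be milliseconds
provider_latency_ms = 180    # second `latency` value, assumed milliseconds

rate = tokens_per_second(native_tokens_completion, generation_time_ms)
total_s = (provider_latency_ms + generation_time_ms) / 1000

print(f"{rate:.1f} tokens/s")
print(f"{total_s:.3f} s")
```

Under these assumptions the run produced roughly 178 native completion tokens per second. The sum of the two timings is 5.302 s, which is consistent with the 5.3s Duration shown in Run Details, though the page does not say how Duration is computed.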
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 654
9.09% | Dialogue to Total Word Ratio | Ratio: 56.93%, Deviation: 6.93% | neededClean: false, wordsTotal: 657, wordsDialogue: 374
4.5447% | Overall
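The Ratio and Deviation figures can be reproduced from the reported counts: 374 dialogue words out of 657 total, against the 50% target in the scenario. The sketch below also shows one plausible way to count dialogue words, treating text inside curly double quotes as dialogue and splitting on whitespace. The evaluator's actual tokenization and its mapping from deviation to the 9.09% score are not shown on this page, so `count_dialogue_words` and `ratio_and_deviation` are hypothetical.

```python
import re

def count_dialogue_words(text: str) -> tuple[int, int]:
    """Return (dialogue_words, total_words), treating “...” spans as dialogue."""
    total = len(text.split())
    spans = re.findall(r"“([^”]*)”", text)
    dialogue = sum(len(span.split()) for span in spans)
    return dialogue, total

def ratio_and_deviation(dialogue: int, total: int, target: float = 0.5) -> tuple[float, float]:
    """Return (ratio, absolute deviation from target), both in percent."""
    ratio = dialogue / total
    return ratio * 100, abs(ratio - target) * 100

# Counts reported in the Meta Data column.
ratio, deviation = ratio_and_deviation(374, 657)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
```

With the reported counts this prints `Ratio: 56.93%, Deviation: 6.93%`, matching the Details column. A whitespace split is only an approximation; the two evaluators report slightly different totals (654 and 657), which suggests they do not tokenize identically.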