NC Bench
Run: 2026-03-18-01KM0AB6QTGG1G3NMCCVWVR7FK
Run Details
Model: GPT-5.4 Mini
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 5.7s
Cost: $0.004603
Provider: OpenAI
Evaluation summary
Score     Evaluator
0.00%     Matches word count
0.00%     Dialogue to Total Word Ratio
0.0000%   Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly crouched behind the bakery dumpster, studying the flyer she’d stolen from a lamppost. It showed a grinning cartoon dragon wearing a party hat and announced, in cheerful rainbow letters, “Jasper’s Seventh Birthday Spectacular!” Beneath that, in smaller print, the address, the time, and the line that made her eyes narrow with professional appreciation: Gifts to be stacked in the sunroom.
Bill leaned over her shoulder and squinted. “So,” he said, “we are, in fact, robbing a child.”
“We are recovering unattended merchandise,” Molly said.
Bill blinked. “From a clown-themed living room?”
“It’s a pirate theme,” Molly said. “Look at the flyer. Eyepatch on the dragon.”
“That doesn’t help.”
Molly folded the paper and tucked it into her jacket. “The point is, everyone will be distracted. Children, parents, one possibly sweating entertainer. We walk in, blend in, and leave with the presents.”
Bill scratched his chin. “I don’t know. Stealing from adults is one thing. They’ve had years to develop cautious reflexes. Children have sticky hands and no concept of boundaries. It feels somehow unfair.”
Molly stared at him. “Bill, we are career thieves.”
“I know,” he said. “I’m just saying there’s a line.”
“There is no line. There is only the sunroom.”
He nodded slowly, as though she’d spoken in ancient wisdom. “Right. The sunroom.”
Molly pulled two bright paper hats from her bag and held one out to him. “You’re going in as an uncle.”
“An uncle?”
“Uncles bring lopsided confidence and a smell of aftershave. Put it on.”
Bill examined the hat. It was decorated with little gold stars. “I look like a regretful wizard.”
“Perfect. Children trust wizards.”
“No they don’t.”
“They absolutely do. They also trust men carrying balloons and anyone who can whistle through their teeth.”
Bill tried the hat on, and it perched above his ears with an insultingly festive tilt. “I hate this.”
“You should,” Molly said. “If this works, we are professionals. If it doesn’t, we are the worst people at a birthday party since the uncle who cried into the punch bowl.”
Bill peered down the alley toward the house. Music and shrieks drifted over the fence, along with the sugary scent of cake. “What about the presents? How many are we talking?”
“Enough to make it worthwhile.”
“Worthwhile for whom?”
“For us.”
He looked unconvinced. “Are we taking the bikes too? Because if the children are getting bikes, I draw the line at bikes.”
“We’re taking whatever fits in the wagon,” Molly said. “Focus.”
Bill brightened. “We have a wagon?”
Molly pointed to the hedge. A battered red wagon sat there, waiting with criminal patience. “I borrowed it from the park.”
“You stole a wagon for the job.”
“I prefer ‘acquired.’”
Bill admired it. “That is better. Still theft, though.”
“Bill.”
“Fine. Fine. I’m focusing.”
Molly checked her watch. “We go in through the side gate at three. I’ll carry the party plates.”
“I love that you know where the plates are.”
“I read the neighborhood newsletter.”
“You’re terrifyingly organized.”
“I’m insultingly prepared.”
The back door of the house opened, and a burst of laughter spilled out with a cloud of helium balloons. A woman in a sunhat called, “Who wants juice boxes?” and a chorus of children screamed as though she’d offered treasure.
Bill swallowed. “You ever notice how birthday parties are just tiny riots with frosting?”
“Exactly,” Molly said. “And rioters leave their valuables unattended.”
He squared his shoulders. “All right. We go in, we smile, we act like family, we load the wagon, and we vanish.”
Molly grinned. “That’s the spirit.”
A little boy in a pirate vest raced across the yard, waving a plastic sword. “Hey! Are you the balloon people?”
Bill froze. Molly, without missing a beat, reached into her bag and produced two twisted balloons she had somehow inflated earlier. “Of course,” she said brightly. “We’re very late.”
The boy gasped. “Awesome.”
Bill stared at her. “You brought balloons.”
“I bring options.”
The child grabbed one balloon and darted away. Molly took Bill by the elbow and steered him toward the gate. “Try not to look guilty.”
Bill whispered, “I am wearing a party hat and carrying an illegal wagon.”
“That’s not guilt. That’s commitment.”
Together they slipped toward the house, where the presents waited in a glittering heap and the birthday song had just begun, loud enough to cover almost anything.
created_at: "2026-03-18T11:12:26.373Z" (3/18/2026, 11:12:26 AM)
model: "openai/gpt-5.4-mini-20260317"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 408
moderation_latency: (null)
generation_time: 5517
tokens_prompt: 77
tokens_completion: 1289
native_tokens_prompt: 66
native_tokens_completion: 1012
native_tokens_completion_images: 0
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.0046035
router: (null)
provider_responses:
  0:
    endpoint_id: "9ee065b2-3d1c-43bc-bdd7-28af3b148282"
    id: "resp_09b30de7ce1f1fa50169ba889a6d1c8196a0f6822681651fb5"
    is_byok: false
    latency: 195
    model_permaslug: "openai/gpt-5.4-mini-20260317"
    provider_name: "OpenAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1773832346-H6EwfCBs1ohkABEIlVkB"
upstream_id: "resp_09b30de7ce1f1fa50169ba889a6d1c8196a0f6822681651fb5"
total_cost: 0.0046035
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
Evaluation details
Result    Evaluator                      Details                            Meta Data
0.00%     Matches word count             n/a                                neededClean: false, words: 729
0.00%     Dialogue to Total Word Ratio   Ratio: 53.56%, Deviation: 23.56%   neededClean: false, wordsTotal: 730, wordsDialogue: 391
0.0000%   Overall
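The ratio and deviation reported above can be reproduced from the two word counts. The following is a minimal Python sketch, not the benchmark's actual evaluator: the function names, the whitespace-based word split, and the rule that dialogue is any text between double quotes (straight or curly) are all assumptions made for illustration. The page does not say how the 0.00% scores are derived from the deviation, so no scoring step is included. The two evaluators also report slightly different totals (729 and 730), which suggests they tokenize differently, so a simple counter like this one may not match either figure exactly.

```python
import re

# Assumption: dialogue is the text between a pair of double quotes,
# straight (") or curly (“ ”). Single quotes nested inside stay in the span.
DIALOGUE_RE = re.compile(r'["“]([^"“”]*)["”]')


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (hypothetical word definition)."""
    return len(text.split())


def ratio_and_deviation(words_dialogue: int, words_total: int,
                        target_ratio: float = 0.30) -> tuple[float, float]:
    """Return (dialogue ratio, absolute deviation from the target ratio)."""
    ratio = words_dialogue / words_total if words_total else 0.0
    return ratio, abs(ratio - target_ratio)


def dialogue_stats(text: str, target_ratio: float = 0.30) -> dict:
    """Measure the dialogue share of a scene under the assumptions above."""
    total = count_words(text)
    dialogue = sum(count_words(span) for span in DIALOGUE_RE.findall(text))
    ratio, deviation = ratio_and_deviation(dialogue, total, target_ratio)
    return {
        "wordsTotal": total,
        "wordsDialogue": dialogue,
        "ratio": ratio,
        "deviation": deviation,
    }


if __name__ == "__main__":
    # Counts taken from the evaluation details of this run.
    ratio, deviation = ratio_and_deviation(391, 730)
    print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
```

With the reported counts, 391 dialogue words out of 730 gives a ratio of 53.56%, which is 23.56 percentage points above the 30% target in the prompt. The response is also well over the requested 500 words by either count.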