NC Bench
Run: 2026-03-18-01KM09493A4W79SM8M7WA93QN4
Run Details
Model: GPT-5.4 Mini (Reasoning, Low)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 5.5s
Cost: $0.004450
Provider: OpenAI
Evaluation summary
Score     Evaluator
0.00%     Matches word count
3.05%     Dialogue to Total Word Ratio
1.5246%   Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
“Bill, if you say ‘easy job’ one more time, I’m leaving you to explain this to the cops and the balloon animals.” “Easy job is a phrase of optimism, Molly.” “It’s a phrase of unemployment.” “Look, it’s a children’s party. Tiny guests. Cake. Music. Very low security.” “Bill, there is a clown by the gate.” “A clown is not security.” “A clown is absolutely security if he can juggle and recognize faces.” “That clown looks emotionally exhausted.” “That’s because he’s seen you in a henley and a cap before.” Bill peered around the hedge, adjusting the fake moustache he’d bought from a shop called Party Antics and Regrets. “We go in, blend in, and take the presents.” Molly stared at him. “You said ‘blend in’ while holding a clipboard that says Property Inspection.” “It’s official-looking.” “It says ‘property’ in glitter pen.” “It’s festive.” “It’s incriminating.” “Fine. Then you handle the social side. I’ll handle the logistics.” “What logistics?” “The presents are stacked on a folding table inside the marquee. We walk in, exchange a few pleasantries, load the gifts into the van, and leave.” “With the children watching?” “With a bit of confidence, no one notices anything.” Molly snorted. “Bill, last time you said that, we left a museum through the gift shop and got chased by a man in a knight costume.” “Exactly. Nobody ever suspects a knight.” “You were the one in the knight costume.” “And I got away.” “You lost one shoe and a very specific sense of dignity.” Bill leaned closer to the hedge opening. “We have a window. The parents are all distracted by the magician.” “The magician is sawing a volunteer in half.” “The volunteer is smiling.” “That’s because he’s been paid in cupcakes.” Bill tapped the side of the van parked across the street. “We’ve got the van, we’ve got gloves, we’ve got labels for the boxes.” “Why labels?” “Because we are not criminals, Molly. 
We are organized professionals.” “That’s worse somehow.” A burst of children’s laughter rose from the garden, followed by a squeal and someone shouting, “It’s the dinosaur pinata again!” Molly blinked. “Did they say again?” “Let’s not dwell on that.” “You always say that right before something illegal and sticky.” “We are not stealing sticky things.” “We are stealing gifts from a child’s birthday.” “Technically many children’s birthdays.” “Bill.” “Sorry. Shared birthday experience.” Molly rubbed her forehead. “Tell me the plan one more time, slowly, like I’ve had a concussion from a low-flying clown.” Bill straightened. “We enter through the side gate. You distract the parent at the drink table.” “With what?” “With charm.” “I have no charm. I have a face that makes people ask if I’ve just received bad news.” “Then give them bad news.” “What does that mean?” “Say the cake is on fire.” “The cake is not on fire.” “It will be a conversational fire, Molly. A metaphorical one.” “And while everyone panics?” “I roll the presents to the van.” “They’re wrapped.” “Even better. Festive camouflage.” “And if the kids ask where their gifts went?” Bill shrugged. “We tell them the birthday fairy is doing redistribution.” “Bill, that is not a thing.” “It is if you say it with enough confidence.” Molly looked back at the house, the streamers, the squealing children, the enormous cardboard castle by the lawn. “You know what? I hate how practical this sounds.” Bill grinned. “That’s the spirit.” “That is not spirit. That is a cry for help in a nice jacket.” “You coming or not?” Molly sighed, cracked her knuckles, and picked up the empty sack. “If I get glitter in my shoes, I’m blaming you forever.” “You already blame me forever.” “True. But this time it’ll be with purpose.” “Excellent,” Bill said, nudging the gate open. “Now remember: calm, polite, and invisible.” Molly looked at the clown, the magician, the parents, and the towering pile of presents. 
“Bill?” “Yes?” “We are absolutely going to be remembered.” Bill swallowed. “By name?” “By smell.” Then they stepped into the party together, smiling like seasoned guests and walking like two people who had never once made a wise decision in their lives.
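The scenario asks for 70% of the words to be dialogue, so the response has to be split into spoken and unspoken words before it can be scored. The following is a minimal sketch of one way to do that, assuming dialogue is exactly the text enclosed in curly double quotes and that a word is any whitespace-separated token. The function name `dialogue_word_stats` is hypothetical, and the benchmark's actual evaluator may tokenize differently.

```python
import re


def dialogue_word_stats(text: str) -> dict:
    """Count total words and words inside curly double quotes.

    Assumption: dialogue is the text between an opening “ and the next
    closing ”. Single quotes nested inside a speech stay part of it.
    """
    # Capture the contents of every “...” span, including spans that
    # cross line breaks (the character class matches newlines too).
    spoken_spans = re.findall(r"“([^”]*)”", text)

    # A "word" here is any whitespace-separated token.
    words_dialogue = sum(len(span.split()) for span in spoken_spans)
    words_total = len(text.split())

    ratio = words_dialogue / words_total if words_total else 0.0
    return {
        "words_total": words_total,
        "words_dialogue": words_dialogue,
        "ratio": ratio,
    }


sample = "“Easy job,” Bill said. “It is festive.”"
stats = dialogue_word_stats(sample)
print(stats["words_dialogue"], "of", stats["words_total"], "words are dialogue")
```

In the sample, five of the seven tokens sit inside quotes. Attribution such as "Bill said." counts toward the total but not toward the dialogue.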
created_at
"2026-03-18T10:51:10.838Z"
3/18/2026, 10:51:10 AM
model
"openai/gpt-5.4-mini-20260317"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
2475
moderation_latency
(null)
generation_time
5272
tokens_prompt
77
tokens_completion
1400
native_tokens_prompt
66
native_tokens_completion
978
native_tokens_completion_images
0
native_tokens_reasoning
32
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.0044505
router
(null)
provider_responses
0
endpoint_id
"9ee065b2-3d1c-43bc-bdd7-28af3b148282"
id
"resp_025efab6d2b92f070169ba839ee99c8190bce7bbcfe3587d42"
is_byok
false
latency
204
model_permaslug
"openai/gpt-5.4-mini-20260317"
provider_name
"OpenAI"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer
(null)
api_type
"completions"
id
"gen-1773831070-0jPG32vCQiIzAPTwVjo6"
upstream_id
"resp_025efab6d2b92f070169ba839ee99c8190bce7bbcfe3587d42"
total_cost
0.0044505
cache_discount
(null)
upstream_inference_cost
0
provider_name
"OpenAI"
Evaluation details
Result    Evaluator                       Details                            Meta Data
0.00%     Matches word count              n/a                                neededClean: false, words: 691
3.05%     Dialogue to Total Word Ratio    Ratio: 78.35%, Deviation: 8.35%    neededClean: false, wordsTotal: 693, wordsDialogue: 543
1.5246%   Overall
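The figures reported above can be cross-checked with a few lines of arithmetic. This sketch recomputes the dialogue ratio and its deviation from the counts on the page, using the 70% target and the 500-word target from the scenario. It also assumes that Overall is the unweighted mean of the two evaluator scores, which the page does not state. The mapping from deviation to the 3.05% score is not shown on the page and is not reconstructed here.

```python
# Counts and targets copied from the run page.
TARGET_RATIO = 0.70        # scenario: 70% dialogue
TARGET_WORDS = 500         # scenario: 500 words
words_total = 693          # "Dialogue to Total Word Ratio" evaluator
words_dialogue = 543
words_counted = 691        # "Matches word count" evaluator

# Dialogue share of all words and its distance from the target.
ratio = words_dialogue / words_total
deviation = abs(ratio - TARGET_RATIO)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 78.35%, Deviation: 8.35%"

# Relative overshoot of the requested length.
length_overshoot = (words_counted - TARGET_WORDS) / TARGET_WORDS
print(f"Length overshoot: {length_overshoot:.1%}")

# Assumption: Overall = mean(word-count score, dialogue-ratio score).
# Under that assumption, the unrounded dialogue score implied by the
# reported Overall of 1.5246% is:
word_count_score = 0.00
overall = 1.5246
implied_dialogue_score = 2 * overall - word_count_score
print(f"Implied dialogue score: {implied_dialogue_score:.4f}%")
```

The implied value rounds to the displayed 3.05%, which is consistent with the mean assumption but does not confirm it. The two evaluators also report slightly different totals (691 and 693 words), so they presumably count words in slightly different ways.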