NC Bench
Run: 2026-03-18-01KM07E35RKBJAB7R73K0KN8RZ
Run Details
Model: GPT-5.4 Mini
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 5.4s
Cost: $0.004338
Provider: OpenAI
Evaluation summary
Score | Evaluator
0.00% | Matches word count
0.00% | Dialogue to Total Word Ratio
0.0000% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly flattened herself behind the bakery van and peered at the rented marquee glowing at the end of the lane. Paper stars twitched in the breeze. A painted sign over the gate read HAPPY BIRTHDAY, TOMMY! in letters so cheerful they felt like an insult.
Bill unfolded a crumpled map with the solemnity of a general invading a duchy. “There it is,” he whispered. “The target.”
Molly snorted. “You say target like we’re robbing a bank, not a room full of sugar-addled six-year-olds.”
“A party is a bank,” Bill said. “A bank of gifts.”
“Awful,” she said. “You should be ashamed.”
“I am. Deeply. Now look here.” He jabbed a finger at the map. “Table of presents goes in the main hall, near the clown.”
“There’s a clown?” Molly leaned out farther. “That’s new information.”
“Apparently.”
“Bill, why is it always ‘apparently’ with you? You’re the planner.”
“I’m a planner, not a prophet.”
From inside the marquee came a burst of shrieking laughter, followed by the unmistakable squeal of a balloon meeting its maker.
Molly winced. “Any children screaming yet?”
“Not in pain, I hope,” said Bill.
“Don’t be precious. Children scream for everything. It’s their gift.”
He looked at her. “You really don’t like kids.”
“I like them fine from a distance. Like wolves. Or tax accountants.”
Bill folded the map and tucked it into his coat. “Right. We go in through the catering entrance, pose as entertainers, and you distract the adults while I wheel out the presents.”
Molly raised an eyebrow. “I distract adults?”
“You do your thing.”
“My thing?”
“You know.” He mimed an exaggerated flourish. “Charming, improvising, telling lies with confidence.”
“That is a horrible summary of my talents.”
“An accurate one, then.”
She peered toward the side of the marquee where a sleepy man in a waistcoat was unloading trays of sausage rolls. “What if they ask for identification?”
Bill blinked. “At a child’s birthday party?”
“People become suspicious when money is involved. And presents are expensive. These people will be guarding those boxes like crown jewels.”
He smiled. “That’s why we’re professionals.”
“We’re not professionals,” Molly said. “We’re amateurs with good shoes.”
“Good shoes are half the battle.”
She sighed and adjusted the satchel slung across her shoulder. “Fine. But if I’m pretending to be anything, I want a costume.”
Bill’s eyes brightened. “I brought options.”
“You did not.”
He reached into the van and produced a folded bundle of fabric covered in glittering stars.
Molly stared.
“It’s a magician’s cape,” he said. “Borrowed.”
“Borrowed from whom?”
“From a magician.”
“That doesn’t answer anything.”
“It answers enough.”
Molly took the cape, shook it once, and a shower of silver sequins hit the ground like tiny dishonest coins. “This is embarrassing.”
“It’s bold.”
“It’s humiliating.”
“It’s theatrical.”
“It’s going to get us arrested.”
Bill looked genuinely offended. “Arrested? Molly, please. We’re stealing presents from children, not the crown of England.”
“Some people would say that’s worse.”
He checked his pocket watch. “We have ten minutes before the birthday song. Once they’re all distracted by cake, we make our move.”
“And the presents?”
“Walk them out to the van. Quick, quiet, elegant.”
Molly tied the cape around her neck and looked down at herself. “Elegant is not the word.”
“What word would you use?”
She grinned, sharp and wicked. “Ridiculous.”
Bill nodded toward the marquee. “Good. Ridiculous people are never suspected.”
Together they moved toward the side entrance, where a cardboard cutout of a smiling bear stood guard beside a stack of balloons. Inside, voices rose and fell in bright waves, all frosting and chaos. Molly inhaled, squared her shoulders, and stepped into the light.
“Remember,” she murmured, “if anyone asks, we are absolutely supposed to be here.”
Bill gave her a thumbs-up. “Naturally.”
“And if this goes wrong?”
He glanced at the gift table visible through the opening, where ribbons gleamed beneath a cluster of fairy lights. “Then we improvise.”
Molly bared her teeth in a grin. “Now that,” she said, “sounds like my sort of crime.”
created_at: "2026-03-18T10:21:35.3Z" (3/18/2026, 10:21:35 AM)
model: "openai/gpt-5.4-mini-20260317"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 370
moderation_latency: (null)
generation_time: 5038
tokens_prompt: 77
tokens_completion: 1237
native_tokens_prompt: 66
native_tokens_completion: 953
native_tokens_completion_images: 0
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.004338
router: (null)
provider_responses:
  0:
    endpoint_id: "9ee065b2-3d1c-43bc-bdd7-28af3b148282"
    id: "resp_0903a52b2280b8f10169ba7caf5a9481968309f4c9672561f3"
    is_byok: false
    latency: 175
    model_permaslug: "openai/gpt-5.4-mini-20260317"
    provider_name: "OpenAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1773829295-2W2vCZPXbceKTggZjlnJ"
upstream_id: "resp_0903a52b2280b8f10169ba7caf5a9481968309f4c9672561f3"
total_cost: 0.004338
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
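The timing fields above can be cross-checked against the reported Duration of 5.4s. The sketch below assumes that `latency` and `generation_time` are in milliseconds. The page does not state the unit, but 370 + 5038 = 5408 ms is consistent with 5.4s. The function name `generation_stats` is hypothetical and not part of NC Bench.

```python
def generation_stats(latency_ms: int, generation_time_ms: int, completion_tokens: int):
    """Return (total seconds, completion tokens per second of generation).

    Assumes both timing fields are milliseconds (not confirmed by the page).
    """
    total_seconds = (latency_ms + generation_time_ms) / 1000
    tokens_per_second = completion_tokens / (generation_time_ms / 1000)
    return total_seconds, tokens_per_second


# Values copied from the run metadata: latency, generation_time,
# native_tokens_completion.
total_seconds, tokens_per_second = generation_stats(370, 5038, 953)
print(f"{total_seconds:.1f}s total, {tokens_per_second:.0f} tokens/s")
# prints "5.4s total, 189 tokens/s"
```

The throughput uses `native_tokens_completion` (953) rather than `tokens_completion` (1237). Which of the two counts is the better basis is a judgment call, since the page does not explain how they differ.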
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 673
0.00% | Dialogue to Total Word Ratio | Ratio: 53.18%, Deviation: 23.18% | neededClean: false, wordsTotal: 677, wordsDialogue: 360
0.0000% | Overall
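The reported ratio and deviation follow directly from the word counts: 360 dialogue words out of 677 total is 53.18%, which is 23.18 percentage points above the 30% target. The sketch below reproduces that arithmetic and shows one possible way to count dialogue words. It assumes dialogue is whatever sits between curly double quotes and that words are whitespace-separated. The evaluator's actual tokenization is not shown on this page, and the two evaluators report different totals (673 and 677), so neither is likely to match this sketch exactly. The function names are hypothetical, and the step from deviation to the 0.00% score is not documented here, so it is left out.

```python
import re


def count_dialogue_words(text: str) -> tuple[int, int]:
    """Return (dialogue words, total words) using whitespace splitting.

    Assumption: dialogue is any span between an opening and a closing
    curly double quote.
    """
    total = len(text.split())
    spoken = re.findall(r"“([^”]*)”", text)
    dialogue = sum(len(span.split()) for span in spoken)
    return dialogue, total


def ratio_from_counts(dialogue: int, total: int, target: float = 0.30) -> dict:
    """Dialogue share of all words and its absolute distance from the target."""
    ratio = dialogue / total if total else 0.0
    return {"ratio": ratio, "deviation": abs(ratio - target)}


# Short excerpt from the model response above.
sample = "Bill unfolded a map. “There it is,” he whispered. “The target.”"
dialogue_words, total_words = count_dialogue_words(sample)
result = ratio_from_counts(dialogue_words, total_words)

# Counts copied from the evaluation metadata (wordsDialogue, wordsTotal).
reported = ratio_from_counts(360, 677)
print(f"Ratio: {reported['ratio']:.2%}, Deviation: {reported['deviation']:.2%}")
# prints "Ratio: 53.18%, Deviation: 23.18%"
```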