NC Bench
Run: 2026-03-18-01KM0ARGK69PKXQH6Z3XA67VW3
Run Details
Model: GPT-5.4 Mini
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 5.4s
Cost: $0.004563
Provider: OpenAI
Evaluation summary
| Score | Evaluator |
| --- | --- |
| 0.00% | Matches word count |
| 8.34% | Dialogue to Total Word Ratio |
| 4.1723% | Overall |
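The overall score looks like the unweighted mean of the two evaluator scores. The page does not state this; it is an inference from the numbers. A quick check, using the scores as displayed (which are presumably rounded):

```python
# Scores as displayed in the evaluation summary, in percent.
scores = {
    "Matches word count": 0.00,
    "Dialogue to Total Word Ratio": 8.34,
}

# Assumption: the overall score is the unweighted mean of the evaluator scores.
overall = sum(scores.values()) / len(scores)
print(f"{overall:.2f}%")  # prints "4.17%"
```

This agrees with the reported 4.1723% to two decimal places. The remaining difference would be explained if the 8.34% shown is itself a rounded value.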
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
“Tell me again why the children’s party is the easiest job in town,” Molly whispered, peering through the alley window at the burst of balloons inside. “Because,” Bill said, tugging his gloves tighter, “everyone assumes the idiot with the cake is harmless.” “You are the idiot with the cake?” “I am tonight,” Bill said. “A baker, technically.” “You look like a man who lost a fight with a flour bin.” “That is called commitment. Besides, the disguise is excellent.” Molly squinted at him. “You’ve got a fake mustache made of frosting.” “It’s artisanal.” “It’s melting.” “So are my standards.” Inside the hall, someone shrieked with delight. A clown’s trumpet wheezed. Molly adjusted the strap of her satchel. “All right. We slip in, mingle, locate the gift table, and leave before anyone notices the presents are gone.” Bill nodded. “Clean, quiet, elegant.” “Why are you holding a toy sword?” Bill looked down. “It was in the costume bin.” “You’re a baker.” “I’m a festive baker.” “Why is there a toy sword in the cake box?” “I didn’t ask questions when the box said, ‘For special occasions.’” Molly rubbed her temple. “Focus. The plan.” “The plan,” Bill repeated, very solemnly, “is to exploit the fact that children cannot count.” “That’s not the plan.” “It is if they’re opening presents. One child gets distracted, another starts crying, adults begin negotiating with frosting, and then we take the lot.” “The lot,” Molly said, “being every single present in the room.” “Every single present,” Bill confirmed. “A mountain of wrapped joy. We’ll be rich.” “We’ll be chased.” “Semantics.” Molly leaned closer to the window. “There’s a table by the stage. I can see pirate paper, unicorn paper, and one box the size of a coffin.” “Probably a train set.” “Or a tiny horse.” “Even better. Horses are expensive.” From inside came the sound of a host calling, “Who wants cake?” followed by a roar of tiny voices. Bill brightened. 
“That’s our cue.” “That’s dessert.” “Yes, but dessert is what lures the adults away from the gifts.” “You’ve done this before.” “I’ve studied party behavior.” “While in prison?” “Especially while in prison.” Molly opened the back door a fraction. Warm air, sugar, and chaos spilled out. “Remember, we are invisible. We are caterers.” Bill glanced at her apron. “You’re wearing a black ski mask.” “Caterers wear whatever they want.” “That is not true.” “It is for this company.” He straightened the fake frosting mustache with one finger. “What’s our cover story if someone asks?” Molly gave him a flat look. “You are a man of mystery. I am your assistant. We have lost the dessert cart.” “What dessert cart?” “The one we just invented. Keep up.” A little voice from inside piped up, “Excuse me, are you the dessert cart people?” Molly froze. Bill froze harder. Through the crack in the door, a small girl in a princess cape stared up at them with suspiciously calm eyes. Bill smiled. “Yes?” The girl pointed at his cake box. “Is that for us?” Molly whispered, “Bill.” Bill whispered back, “I’m thinking.” The girl’s eyes narrowed. “You’re not thinking very fast.” Molly forced a laugh. “Children’s parties, eh? They keep you on your toes.” The girl crossed her arms. “Are you stealing the presents?” Bill said, “Absolutely not.” Molly said, “Absolutely maybe.” Bill hissed, “Molly.” “What? Honesty builds trust.” “With children?” “Especially children. They’re ruthless.” The girl nodded. “I knew it. My brother said thieves look weird.” Bill put a hand on his chest. “We do not look weird.” Molly glanced at his mustache. “Speak for yourself.” The girl backed away, then turned and shouted, “MUM! THE CAKE PEOPLE ARE THIEVES!” Bill grabbed Molly’s wrist. “Time to improvise.” Molly snatched the toy sword. 
“At last.” “Why are you smiling?” “Because,” she said, already moving through the door, “we are about to make a very noisy exit.” Behind them, the hall erupted in screams, laughter, and the patter of dozens of little feet. Bill followed her toward the gift table. “Molly,” he said, “on three?” “No.” “Why not?” “Because,” she said, scooping up a stack of presents, “I’m already on one.”
created_at: "2026-03-18T11:19:42.448Z" (3/18/2026, 11:19:42 AM)
model: "openai/gpt-5.4-mini-20260317"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 440
moderation_latency: (null)
generation_time: 4869
tokens_prompt: 77
tokens_completion: 1304
native_tokens_prompt: 66
native_tokens_completion: 1003
native_tokens_completion_images: 0
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.004563
router: (null)
provider_responses:
  0:
    endpoint_id: "9ee065b2-3d1c-43bc-bdd7-28af3b148282"
    id: "resp_05ac6486c131a5550169ba8a4e830c8190bd04996a812c7d48"
    is_byok: false
    latency: 209
    model_permaslug: "openai/gpt-5.4-mini-20260317"
    provider_name: "OpenAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1773832782-LrThz591JlYRkwdmvrQ2"
upstream_id: "resp_05ac6486c131a5550169ba8a4e830c8190bd04996a812c7d48"
total_cost: 0.004563
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
Evaluation details
| Result | Evaluator | Details | Meta Data |
| --- | --- | --- | --- |
| 0.00% | Matches word count | n/a | neededClean: false; words: 691 |
| 8.34% | Dialogue to Total Word Ratio | Ratio: 62.95%, Deviation: 7.05% | neededClean: false; wordsTotal: 691; wordsDialogue: 435 |
| 4.1723% | Overall | | |
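The ratio and deviation figures can be reproduced from the reported word counts. The sketch below is not the benchmark's evaluator, whose implementation is not shown on this page. It assumes that a "word" is a whitespace-separated token and that dialogue is any text inside curly double quotes. The helper names (`count_words`, `dialogue_stats`, `ratio_and_deviation`) are hypothetical.

```python
import re

TARGET_RATIO = 0.70  # from the scenario: 70% dialogue


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (an assumed definition of a word)."""
    return len(text.split())


def dialogue_stats(text: str) -> tuple[int, int]:
    """Return (dialogue_words, total_words).

    Assumption: everything between a curly opening quote and the next
    curly closing quote counts as dialogue.
    """
    spoken = re.findall(r"“([^”]*)”", text)
    dialogue_words = sum(count_words(s) for s in spoken)
    return dialogue_words, count_words(text)


def ratio_and_deviation(dialogue_words: int, total_words: int) -> tuple[float, float]:
    """Return the dialogue ratio and its absolute distance from the target."""
    ratio = dialogue_words / total_words
    return ratio, abs(TARGET_RATIO - ratio)


# Reproduce the reported figures from the counts in the Meta Data column.
ratio, deviation = ratio_and_deviation(435, 691)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 62.95%, Deviation: 7.05%"
```

How the evaluator maps a 7.05% deviation to a score of 8.34% is not shown in the run details, so the sketch stops at the deviation.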