NC Bench
Run: 2026-03-18-01KM08VYA2A1EFDMJXEX0M8TAK
Run Details
Model: GPT-5.4 Mini
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 6.7s
Cost: $0.004730
Provider: OpenAI
Evaluation summary
Score | Evaluator | Details Link
0.00% | Matches word count | (details)
39.92% | Dialogue to Total Word Ratio | (details)
19.9585% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
“Bill,” Molly whispered, staring at the balloon-drenched house across the street, “tell me again why we’re doing a heist at a children’s party.” “Because,” Bill whispered back, adjusting his black gloves with theatrical sorrow, “the neighborhood’s security is at its weakest when adults are distracted by sugar, chaos, and tiny humans screaming about cake.” “You say that like it’s noble.” “It is noble. In a deeply embarrassing way.” Molly peered through the hedge. A banner reading HAPPY BIRTHDAY, TILLY! sagged over the porch, flanked by two plastic unicorns wearing party hats. “There are clown shoes on the welcome mat, Bill.” “Yes,” Bill said. “That’s the real security system. Nobody with clown shoes is thinking clearly.” From inside came the muffled roar of children, the shrill crack of a balloon, and a parent calling, “No running with the cupcakes!” It sounded less like a party and more like a minor emergency in frosting. Molly folded her arms. “We’re stealing presents from a seven-year-old.” “Eight,” Bill corrected. “That makes it better?” “It makes the math more specific.” Molly turned to him. “You told me this was a professional job.” “It is professional,” he said, with hurt dignity. “I wore my quiet shoes.” “They squeak.” “They’re quiet in spirit.” A small boy sprinted past the window inside, dragging a cape made from aluminum foil. Bill ducked. “See? It’s already escalating. They’ve got a child dressed as a microwave.” “That’s a superhero,” Molly said. “Then why is he yelling, ‘I am the oven king’?” Molly sighed and checked the clipboard she’d stolen from their own apartment, because Bill believed a heist felt more legitimate with paperwork. “We slip in through the side gate, blend with the catering staff, lift the gifts, and leave before anyone notices.” “Exactly,” Bill said. “Simple, elegant, and morally wobblier than a shopping cart with one bad wheel.” Molly glanced at him. “You brought a folding cart?” “Of course. 
We are not carrying forty-seven wrapped objects by hand. I have standards.” “You also brought fake mustaches.” Bill touched the one perched under his nose. “These are not fake. They are aspirational.” The back gate clicked open with a gust of helium-scented air, and Molly froze. “Why did you do that?” Bill blinked. “I didn’t.” A little girl in a tutu poked her head out. “Are you the balloon people?” Bill straightened instantly. “Yes. Absolutely. We are the balloon people.” Molly stared at him. “Since when are we the balloon people?” “Since now,” he said without looking at her. “Excellent timing, team. We’re here to... inflate the, uh, gifts.” The girl squinted. “My mom said the balloon people were coming with a clown.” Bill spread his arms. “I am the clown.” Molly closed her eyes. “Bill.” “What?” he hissed. “She’s trusting. It’s beautiful.” Inside, another child shouted, “The pirate ate the jelly beans!” “That’s our cue,” Molly muttered. Bill nodded gravely. “Remember: confidence, speed, and absolutely no eye contact with the birthday cake.” “Why no eye contact with the cake?” “Because cakes know things.” They slipped into the kitchen, where a tower of wrapped boxes sat beside trays of juice boxes and a suspiciously large piñata. Molly tugged the cart into place. “All right. We load fast.” Bill leaned over the pile and whispered, reverent as a priest. “Hello, beautiful.” “Don’t flirt with the evidence.” “I’m not flirting. I’m appreciating craftsmanship. Look at that ribbon work.” Molly snatched up a box. “This one’s heavy.” “Educational toy, maybe.” She lifted another. “This one rattles.” “Loud toy.” “This one says To Tilly, Love Grandma.” Bill winced. “That one’s got emotional weight.” From the hall came footsteps and the voice of a parent. “Has anyone seen the balloon people?” Bill shoved a stack of gifts onto the cart. “Move, move, move.” Molly grabbed the handles. “If we get caught, I’m blaming your mustache.” “Fair,” Bill said, pushing beside her. 
“If we get away, I’m taking credit for the ambiance.” They rolled toward the side door just as the birthday girl shouted from the other room, “Wait! The clown left his bag!” Molly stopped dead. “Bill.” Bill looked back at the doorway, then at the cart piled with presents, then at the bag forgotten near the cake table. His eyes widened. “That is our bag.” “You brought a bag?” “For the presents.” “For which presents?” He grinned, already moving. “All of them, Molly. We’re professionals.”
created_at: "2026-03-18T10:46:37.646Z" (3/18/2026, 10:46:37 AM)
model: "openai/gpt-5.4-mini-20260317"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 612
moderation_latency: (null)
generation_time: 6268
tokens_prompt: 77
tokens_completion: 1335
native_tokens_prompt: 66
native_tokens_completion: 1040
native_tokens_completion_images: 0
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.0047295
router: (null)
provider_responses: 0
endpoint_id: "9ee065b2-3d1c-43bc-bdd7-28af3b148282"
id: "resp_0bfc4d55c41553530169ba828dd2f4819681bd038f328e1654"
is_byok: false
latency: 401
model_permaslug: "openai/gpt-5.4-mini-20260317"
provider_name: "OpenAI"
status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1773830797-Mmqwen1zuFHeCfGj38Go"
upstream_id: "resp_0bfc4d55c41553530169ba828dd2f4819681bd038f328e1654"
total_cost: 0.0047295
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false; words: 730
39.92% | Dialogue to Total Word Ratio | Ratio: 54.29%, Deviation: 4.29% | neededClean: false; wordsTotal: 735; wordsDialogue: 399
19.9585% | Overall
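The ratio and deviation reported for the dialogue evaluator are consistent with dividing wordsDialogue by wordsTotal and taking the absolute distance from the 50% target in the scenario. The sketch below reproduces that arithmetic from the values on this page. The function names are hypothetical and are not taken from the NC Bench code. The page does not say how a 4.29% deviation maps to a 39.92% score, so that step is not reproduced. The last part assumes the overall figure is the unweighted mean of the two evaluator scores, which matches the numbers shown but is not confirmed by the page.

```python
def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Share of all words that fall inside dialogue."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return words_dialogue / words_total


def deviation_from_target(ratio: float, target: float = 0.5) -> float:
    """Absolute distance between the measured ratio and the requested one."""
    return abs(ratio - target)


# Values copied from the Meta Data of the dialogue evaluator on this page.
ratio = dialogue_ratio(words_dialogue=399, words_total=735)
deviation = deviation_from_target(ratio)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# Assumption (not stated on the page): overall = mean of the two evaluator scores.
word_count_score = 0.0   # "Matches word count" result, in percent
overall = 19.9585        # overall result, in percent
implied_dialogue_score = 2 * overall - word_count_score
print(f"Implied dialogue score: {implied_dialogue_score:.2f}%")
```

Under the mean assumption, the implied dialogue score rounds to the 39.92% shown for the Dialogue to Total Word Ratio evaluator.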