NC Bench
Run: 2026-03-18-01KM0AEYWABRSGY15462KXBXSE
Run Details
Model: GPT-5.4 Mini
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 5.3s
Cost: $0.004230
Provider: OpenAI
Evaluation summary
Evaluator: Matches word count, Score: 0.00%
Evaluator: Dialogue to Total Word Ratio, Score: 16.47%
Overall: 8.2343%
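The page does not say how the Overall figure is derived from the two evaluator scores. The sketch below assumes it is their unweighted mean; the function name `overall_score` is hypothetical and not part of NC Bench.

```python
def overall_score(scores: list[float]) -> float:
    """Unweighted mean of evaluator scores (assumed aggregation, not confirmed by the page)."""
    if not scores:
        raise ValueError("at least one evaluator score is required")
    return sum(scores) / len(scores)


# Scores as displayed in the evaluation summary, in percent.
displayed_scores = [0.00, 16.47]
overall = overall_score(displayed_scores)  # about 8.235
```

With the displayed (rounded) scores the mean is about 8.235, which agrees with the reported 8.2343% to within rounding. If the assumption holds, the unrounded ratio score would be close to 16.4686%.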
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
“Tell me again why we’re robbing a children’s party,” Molly said, peering through the cracked blinds at the house across the street. Bill adjusted his black gloves. “Because it’s easy.” “Easy?” She turned. “There are balloon arches, a clown, and at least one parent with the reflexes of a startled hawk.” “A hawk with a cupcake,” Bill said. “They get soft around frosting.” Molly snorted. “You say that like you’ve done field research.” “I have done field research,” he said. “I attended my niece’s seventh birthday. I escaped with one shoe and a migraine, but I observed the defenses.” “The defenses,” Molly repeated. “At a child’s party.” “Exactly. Chaos is the best security system. Everyone assumes somebody else is watching the presents.” “Until somebody notices two professional thieves loading a wagon full of stuffed animals.” “Who said anything about a wagon?” Bill asked. Molly stared at him. “Bill. Please tell me you did not bring a wagon.” “No,” he said too quickly. “Bill.” “In my defense, it’s a collapsible wagon.” She pinched the bridge of her nose. “That is not a defense. That is a worse crime in a smaller package.” “It folds flat,” he said. “Very practical.” “For what? A parade? A gardening accident?” “For presents,” Bill said. “Obviously.” Molly looked back at the house. “How many presents are we talking?” Bill lifted a paper. “According to our intel, thirty-two gifts, a bounce castle, and one enormous cake shaped like a pirate ship.” “Which means the actual loot is all in one room.” “Precisely.” “And the room?” “Living room.” She blinked. “That’s it? No vault? No laser grid? No retired military dog?” “Just streamers and a sad little ‘Happy Birthday’ banner.” Molly leaned against the wall. “This feels like a trap.” “It feels like a Tuesday,” Bill said. “We go in, blend in, borrow the wagon, and wheel out before anyone realizes all the glittery envelopes are gone.” “Borrow the wagon,” Molly said. “From who? 
A six-year-old?” “If necessary.” She gave him a long look. “You’re joking.” “Mostly.” “I hate that you said ‘mostly.’” Bill checked his watch. “We’ve got twenty minutes before the singing starts.” “You timed the singing?” “I timed the entire event.” She pointed at him. “That’s the kind of thing that makes people suspicious, Bill.” “People don’t suspect the guy holding balloons,” he said. “You’re holding a clipboard.” “Even better. Authority.” Molly folded her arms. “And what exactly is my role in this masterpiece?” “You,” he said, “are Aunt Marjorie.” “I am not Aunt Marjorie.” “You are tonight. She’s ‘running late,’ which is why we’re bringing the gift labels.” Molly looked horrified. “You forged gift labels?” “I prefer ‘prepared.’” “You forged labels for stolen birthday presents.” “Technically, we are preserving the efficient redistribution of assets.” She laughed despite herself. “You’ve been reading my brother’s business books again.” “They’re inspiring.” “Bill, if I get tackled by a parent in khakis, I want it known I objected from the start.” “If you get tackled by a parent in khakis, I’ll distract them with the clown.” “There’s a clown?” “Of course there’s a clown.” “Why is there always a clown?” Bill shrugged. “To make the children less frightening.” Molly took a slow breath. “Fine. We go in. We smile. We act like respectable criminals.” “That’s the spirit.” “And when we leave with the presents?” “We do not run,” Bill said. “We stroll.” “With a collapsible wagon.” “With dignity.” She smirked. “You’re impossible.” “And yet,” he said, handing her a lopsided party hat, “you’re putting on the hat.” Molly stared at it, then at the house full of balloons and chaos and wrapped boxes. “If this goes wrong, I’m blaming your niece.” “Fair,” Bill said. “If it goes right, we’re getting cake.” “Deal,” she said, and together they stepped toward the children’s party, looking, if not innocent, then at least horribly committed.
created_at: "2026-03-18T11:14:29.4Z" (3/18/2026, 11:14:29 AM)
model: "openai/gpt-5.4-mini-20260317"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 385
moderation_latency: (null)
generation_time: 5109
tokens_prompt: 77
tokens_completion: 1234
native_tokens_prompt: 66
native_tokens_completion: 929
native_tokens_completion_images: 0
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.00423
router: (null)
provider_responses:
  0:
    endpoint_id: "9ee065b2-3d1c-43bc-bdd7-28af3b148282"
    id: "resp_005e35539d1911ff0169ba891574708190bb54a8909e8b1f7e"
    is_byok: false
    latency: 189
    model_permaslug: "openai/gpt-5.4-mini-20260317"
    provider_name: "OpenAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1773832469-nafzfLYNkXg9ol2XNOxW"
upstream_id: "resp_005e35539d1911ff0169ba891574708190bb54a8909e8b1f7e"
total_cost: 0.00423
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
Evaluation details

Evaluator: Matches word count
Result: 0.00%
Details: n/a
Meta Data: neededClean: false, words: 643

Evaluator: Dialogue to Total Word Ratio
Result: 16.47%
Details: Ratio: 76.01%, Deviation: 6.01%
Meta Data: neededClean: false, wordsTotal: 646, wordsDialogue: 491

Overall: 8.2343%
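The reported ratio and deviation can be reproduced from the metadata: 491 dialogue words out of 646 total gives 76.01%, which is 6.01 percentage points above the 70% target in the scenario. The sketch below shows that arithmetic, plus one plausible way to count dialogue words. The counting rule (whitespace-separated words inside curly double quotes) and the names `dialogue_ratio` and `deviation_from_target` are assumptions for illustration; the page does not describe the evaluator's actual tokenization, and the mapping from deviation to the 16.47% score is not stated either.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio).

    Assumed rule: a dialogue word is any whitespace-separated token
    inside a pair of curly double quotes.
    """
    total_words = len(text.split())
    quoted_spans = re.findall(r"“([^”]*)”", text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


def deviation_from_target(ratio: float, target: float = 0.70) -> float:
    """Absolute distance between the measured ratio and the target ratio."""
    return abs(ratio - target)


# Figures reported in the evaluation details above.
reported_ratio = 491 / 646
reported_ratio_pct = round(reported_ratio * 100, 2)
reported_deviation_pct = round(deviation_from_target(reported_ratio) * 100, 2)

# Small usage example on a two-line excerpt.
sample = "“Tell me again,” Molly said. “Because it’s easy.”"
sample_counts = dialogue_ratio(sample)
```

Note that the two evaluators report slightly different totals (643 and 646 words), so they may not split words the same way; a simple whitespace split like the one here should not be expected to match either count exactly.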