NC Bench
Run: 2026-03-09-01KKAC3HCDA05EQFMF6SYV2ZY6
Run Details
Model: GPT-5.4 (Reasoning, Low)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 22.8s
Cost: $0.015210
Provider: OpenAI
Evaluation summary
Score      Evaluator
0.00%      Matches word count
29.83%     Dialogue to Total Word Ratio
14.9163%   Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly crouched behind the privet hedge with the solemnity of a field marshal and the hat of a badly disguised aunt. The hat was enormous, mauve, and decorated with a stuffed flamingo whose expression suggested it had died of embarrassment. Beside her, Bill peered through the leaves in a blazer two sizes too small, clutching a clipboard as if paperwork itself could render him invisible. Beyond the garden gate, the children’s party was in full barbaric swing. Balloons bumped against one another like brightly colored planets. A magician in a glitter waistcoat appeared to be losing an argument with a rabbit. Somewhere near the cake table, a child screamed with the passion of an opera singer who had just discovered betrayal. “Tell me again,” Bill whispered, “why we’re robbing a birthday party instead of somewhere sensible, like a bank?” Molly adjusted the flamingo. “Because banks have guards, cameras, and adults who pay attention. This place has eight-year-olds full of sugar and one exhausted mother pretending this was her idea.” Bill glanced at the mountain of presents stacked beneath a banner reading HAPPY BIRTHDAY OLIVER in letters so cheerful they felt aggressive. “That is a lot of loot.” “Exactly,” said Molly. “Action figures, remote-control cars, those little robot dinosaurs that roar and then stop because the batteries are tragic. There’s resale value in modern parenting.” A paper plate sailed over the fence and landed at Bill’s feet with a smear of pink icing. He looked at it as though the party had fired a warning shot. “I still say children are unpredictable,” he muttered. “Which is why we have disguises.” Bill tugged at his blazer. “I look like a substitute geography teacher having a breakdown.” “You look official,” Molly said. “I look maternal. Together, we’re unstoppable.” In the garden, a clown on stilts bent to make a balloon giraffe and instead produced something that resembled a plumbing accident. The children cheered anyway. Bill shuddered. 
Molly unfolded the plan on the clipboard. It was drawn in colored pencil, because she believed criminal strategy should feel festive. “We enter through the side gate. You tell anyone who asks that you’re from Treasure Trail Activities.” “We don’t exist.” “Neither does half the entertainment industry. Then I create a diversion near the cake.” Bill frowned. “What sort of diversion?” Molly smiled. “I’m going to ask whether the cake is nut-free, gluten-free, dairy-free, sugar-free, and emotionally supportive. The parents will swarm.” Bill let out a low whistle. “Diabolical.” “While they’re busy apologizing for flour, you load the presents into the caterer’s trolley.” Bill looked over the hedge again. “What if a child sees me?” “Then tell them you’re upgrading the presents.” “Children don’t believe in upgrades. They believe in ownership.” As if to prove him right, a small boy in a pirate hat marched across the lawn and bit another child for standing too near a wrapped box. No adult noticed. They were all watching the magician accidentally set his own wand on fire. Bill swallowed. “Savages.” “Focused savages,” Molly said. “Respect that.” She stood, smoothed her floral dress, and marched toward the gate with the confidence of a woman who had once convinced a jeweler she was there to inspect his diamonds for sadness. Bill followed, pushing the trolley, trying to look as though he belonged at parties and not outside them, pressed against hedges like damp laundry. At the gate, a harried father intercepted them, wearing the thousand-yard stare of a man who had inflated ninety balloons and lost all faith. “Can I help you?” he asked. Bill’s mouth opened, but nothing came out except a noise like a goose reconsidering life. Molly swept in. “Treasure Trail Activities,” she said briskly. “We’re here for Phase Two.” The father blinked. “There’s a Phase Two?” “Oh, goodness,” Molly said, lowering her voice. “No one told you? That explains the balloons.” The father paled. 
“I knew Sharon forgot something.” Bill, inspired by panic, nodded furiously at the trolley. “Very advanced,” he said. “Health and safety.” The father stepped aside at once. They rolled into the garden, deep into enemy territory, toward the shimmering pile of gifts. Molly gave Bill a look of shining triumph. “See?” she murmured. Bill stared at the presents, the cake, the clown, the tiny pirate preparing another assault. “I’ll say this,” he whispered back. “If we survive, we deserve a pension.”
created_at
"2026-03-09T22:39:54.775Z"
3/9/2026, 10:39:54 PM
model
"openai/gpt-5.4-20260305"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
3444
moderation_latency
(null)
generation_time
22674
tokens_prompt
77
tokens_completion
1346
native_tokens_prompt
66
native_tokens_completion
1003
native_tokens_completion_images
0
native_tokens_reasoning
37
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.01521
router
(null)
provider_responses
0
endpoint_id
"9ff5625c-403f-4d7f-b895-58ac7295062c"
id
"resp_0de55ffc197213ca0169af4c3ada1c81968ac8be0d4a0c3ea9"
is_byok
false
latency
176
model_permaslug
"openai/gpt-5.4-20260305"
provider_name
"OpenAI"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer
(null)
api_type
"completions"
id
"gen-1773095994-cAJEBZuj8jnXdqJpyksz"
upstream_id
"resp_0de55ffc197213ca0169af4c3ada1c81968ac8be0d4a0c3ea9"
total_cost
0.01521
cache_discount
(null)
upstream_inference_cost
0
provider_name
"OpenAI"
Evaluation details
Result     Evaluator                      Details                           Meta Data
0.00%      Matches word count             n/a                               neededClean: false; words: 728
29.83%     Dialogue to Total Word Ratio   Ratio: 34.92%, Deviation: 4.92%   neededClean: false; wordsTotal: 736; wordsDialogue: 257
14.9163%   Overall
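
The ratio and deviation reported above can be reproduced from the listed metadata. The sketch below is a minimal illustration, not the benchmark's actual evaluator: the function names are hypothetical, the 30% target is taken from the scenario description, and treating the overall score as the plain mean of the two evaluator scores is an assumption that happens to agree with the reported figures. How the 29.83% score is derived from the 4.92% deviation is not stated on this page, so it is not modeled here.

```python
def dialogue_ratio_pct(words_dialogue: int, words_total: int) -> float:
    """Share of dialogue words in the response, as a percentage."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return 100.0 * words_dialogue / words_total


def deviation_pct(ratio_pct: float, target_pct: float) -> float:
    """Absolute distance between the measured ratio and the target, in percentage points."""
    return abs(ratio_pct - target_pct)


def overall_pct(scores: list[float]) -> float:
    """Assumed aggregation: unweighted mean of the evaluator scores."""
    if not scores:
        raise ValueError("scores must not be empty")
    return sum(scores) / len(scores)


# Values copied from the evaluation details of this run.
ratio = dialogue_ratio_pct(words_dialogue=257, words_total=736)
deviation = deviation_pct(ratio, target_pct=30.0)  # target assumed from the scenario text
overall = overall_pct([0.00, 29.83])               # rounded scores as displayed

print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
print(f"Overall (mean of displayed scores): {overall:.3f}%")
```

With the displayed, rounded scores the mean comes out at 14.915%, slightly below the reported 14.9163%, which suggests the page averages unrounded evaluator scores. Note also that the two evaluators report different word counts (728 and 736), so they presumably tokenize the response differently.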