NC Bench
Run: 2026-03-18-01KM0AE6XAV330YCBHJZH3ZD45
Run Details
Model:
GPT-5.4 Mini (Reasoning)
Test:
Dialogue tags
Scenario:
Write 500 words with 50% dialogue
Duration:
9.1s
Cost:
$0.006597
Provider:
OpenAI
Evaluation summary
Score | Evaluator
0.00% | Matches word count
16.70% | Dialogue to Total Word Ratio
8.3500% | Overall
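The overall figure is consistent with an unweighted mean of the two evaluator scores. The page does not state how the overall score is aggregated, so the sketch below is an assumption that happens to reproduce the reported value; the names `scores` and `overall_score` are hypothetical.

```python
# Evaluator scores as reported in the evaluation summary (in percent).
scores = {
    "Matches word count": 0.00,
    "Dialogue to Total Word Ratio": 16.70,
}


def overall_score(evaluator_scores):
    """Unweighted mean of the evaluator scores.

    Assumption: the benchmark's actual aggregation rule is not documented
    on this page; a plain average matches the reported 8.3500%.
    """
    return sum(evaluator_scores.values()) / len(evaluator_scores)


overall = overall_score(scores)
print(f"{overall:.4f}%")  # prints "8.3500%"
```

If the benchmark weights evaluators differently in other tests, this average would not carry over; it is only checked against this one run.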
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly crouched behind the hedges outside the community hall, peering through a window bright with balloon reflections. “Tell me again why we’re dressed like caterers.” Bill tugged at the white apron tied over his coat. “Because I lost the janitor uniforms.” “You said you had them.” “I said I had a feeling.” Molly stared at him. “Bill, feelings do not come with spare keys.” “Some do.” “Not yours.” Inside, children’s laughter bobbed through the glass like marbles in a tin. A banner reading HAPPY BIRTHDAY, ZOE! swayed above a table loaded with cupcakes, glittering juice boxes, and a mountain of wrapped presents. Bill squinted. “There they are. All the loot in one place.” Molly sighed. “You mean the gifts.” “I prefer the professional term.” “There is no professional term for stealing presents from a six-year-old’s party.” Bill adjusted the paper chef’s hat perched on his head at a crooked angle. “Then the whole industry has been too casual.” Molly pressed two fingers to her forehead. “What exactly was your plan?” “My plan was elegant.” “That’s never reassuring.” “We stroll in, look official, perhaps hold a clipboard. No one questions a clipboard.” “Bill, you can’t spell your own name without pausing for inspiration.” “I can spell ‘clipboard.’” “You spelled it wrong on the fruit basket invoice.” “I was under pressure.” A tiny hand appeared at the window, smearing frosting against the glass. Bill waved back before remembering himself. The hand vanished. A moment later, a little voice rang from inside, “Mom! The cake man is making faces!” Molly winced. “We are not the cake people.” “We could be. It’s a market with growth potential.” “The market with growth potential is prison.” Bill frowned at the presents. “All those ribbons. All that paper. It’s practically begging for a strategic redistribution.” Molly turned to him sharply. “Say ‘strategic redistribution’ to me again and I’ll redistribute your teeth.” He held up both hands. “Fine. Fine. I’m just saying it’s a waste. 
Kids get too many toys.” “Bill.” “Not all kids, obviously. Just the ones with parties.” “Bill.” He followed her gaze into the hall. A small girl in a sparkly hat was trying to tie a balloon to a stuffed giraffe while her father knelt beside her, patient and smiling. For a second, Bill’s expression softened into something almost sensible. “Well,” he muttered, “that is annoying.” Molly looked at him. “What is?” “All the happy.” She snorted. “You’re having a conscience. It’s ugly on you.” “I’m not saying I approve of joy.” He leaned closer to the window. “I’m saying I expected more alarms.” “There are children, not vaults.” “Children are worse. They ask questions.” “Exactly. Which is why we are leaving.” Bill glanced at the apron, then back at the room. “We came all this way for nothing.” “No,” Molly said. “We came all this way to prove, once again, that you have terrible ideas.” He sighed. “Could we maybe steal the cake?” “You are impossible.” “I’d settle for one cupcake.” Molly’s mouth twitched despite herself. “If you walk into that party and steal one cupcake, I’ll tell everyone you cried during a bank robbery.” “I did not cry.” “You whined emotionally.” “That’s slander.” “It’s memoir.” Bill rubbed his chin, considering the hall with the solemnity of a man weighing a jewel score and a moral awakening at the same time. “All right,” he said at last. “No presents.” Molly blinked. “Really?” “Really.” She narrowed her eyes. “What’s the catch?” He pointed to a stack of unopened paper plates by the doorway. “We take those.” Molly stared. Then she laughed, quick and helpless. “Paper plates?” “Bigger target. Less security.” “That’s not a heist. That’s housekeeping.” Bill grinned. “Exactly. Clean getaway.” From inside, the party erupted in a round of singing, off-key and earnest. Molly listened, then looked at Bill. “You know what?” “What?” “You’re still not coming in.” “That’s fair.” “And if anyone asks, we were never here.” Bill tipped his absurd chef’s hat. “Naturally. 
A true professional disappears with dignity.” Molly took his arm and steered him away from the window. “Come on, genius. We’re not robbing children.” “Speak for yourself,” he said. “I’m stealing a cupcake on the way out.”
created_at
"2026-03-18T11:14:04.912Z"
3/18/2026, 11:14:04 AM
model
"openai/gpt-5.4-mini-20260317"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
7059
moderation_latency
(null)
generation_time
8262
tokens_prompt
77
tokens_completion
1711
native_tokens_prompt
66
native_tokens_completion
1455
native_tokens_completion_images
0
native_tokens_reasoning
470
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.006597
router
(null)
provider_responses
0
endpoint_id
"9ee065b2-3d1c-43bc-bdd7-28af3b148282"
id
"resp_0f19bd2d96f4d9d00169ba88fcf6b88197a3b8d00bcac9e405"
is_byok
false
latency
175
model_permaslug
"openai/gpt-5.4-mini-20260317"
provider_name
"OpenAI"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer
(null)
api_type
"completions"
id
"gen-1773832444-U6tmqWPatRIpVbPGZtCu"
upstream_id
"resp_0f19bd2d96f4d9d00169ba88fcf6b88197a3b8d00bcac9e405"
total_cost
0.006597
cache_discount
(null)
upstream_inference_cost
0
provider_name
"OpenAI"
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false; words: 699
16.70% | Dialogue to Total Word Ratio | Ratio: 55.98%, Deviation: 5.98% | neededClean: false; wordsTotal: 702; wordsDialogue: 393
8.3500% | Overall
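The ratio and deviation reported for the dialogue evaluator can be recomputed from the word counts in its metadata. This is a minimal sketch, assuming the deviation is the absolute difference between the measured ratio and the 50% target from the scenario; `TARGET_RATIO` and `dialogue_ratio` are hypothetical names, and the mapping from deviation to the 16.70% score is not stated on the page, so it is not reproduced here.

```python
# Word counts taken from the evaluator metadata for this run.
WORDS_TOTAL = 702
WORDS_DIALOGUE = 393

# Target from the scenario ("50% dialogue"); treating the deviation as an
# absolute difference from this target is an assumption.
TARGET_RATIO = 0.50


def dialogue_ratio(words_dialogue, words_total):
    """Fraction of all words that appear inside dialogue."""
    return words_dialogue / words_total


ratio = dialogue_ratio(WORDS_DIALOGUE, WORDS_TOTAL)
deviation = abs(ratio - TARGET_RATIO)
summary = f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}"
print(summary)  # prints "Ratio: 55.98%, Deviation: 5.98%"
```

Note that the two evaluators report different totals (699 words versus 702), so the counts used here apply only to the dialogue ratio evaluator.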