NC Bench
Run: 2026-03-09-01KKAC0ZN38703JNN4K49EZXSQ
Run Details
Model:
GPT-5.4 (Reasoning, Low)
Test:
Dialogue tags
Scenario:
Write 500 words with 30% dialogue
Duration:
28.2s
Cost:
$0.015435
Provider:
OpenAI
Evaluation summary
Score      Evaluator
0.00%      Matches word count
39.08%     Dialogue to Total Word Ratio
19.5410%   Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly crouched behind the hydrangeas with the solemnity of a church gargoyle and the knees of a folding chair. Through the hedge she could see the children’s party in full swing: a bouncy castle swelling and sighing like an asthmatic whale, a magician losing an argument with a dove, and a long table piled high with presents in violent wrapping paper. Bill lay beside her on the damp grass, peering through opera glasses he’d stolen from a retired soprano and never stopped bragging about. He wore a black turtleneck, black gloves, and a party hat he had forgotten to remove after blending in at a neighboring barbecue. “Tell me again,” Molly whispered, “why we’re robbing a six-year-old?” Bill lowered the glasses. “We are not robbing a six-year-old. We are liberating consumer goods from a hostile environment.” “The hostile environment being balloons?” “And cake,” Bill said gravely. “Never trust an event with sheet cake.” From the garden came a shriek of delighted chaos. A clown on stilts waddled past the gift table carrying a tray of orange squash. The presents gleamed in the sun like a row of gaudy little hostages. Molly adjusted the silk scarf over her hair. “There must be thirty boxes.” “Thirty-two,” Bill said. “I counted the dinosaur one twice because it looked expensive.” “Do we know what’s in them?” “Not exactly. But statistically? At least one remote-control car, two educational disappointments, and something that requires batteries the size of torpedoes.” Molly sniffed. “I’m not risking prison for a plastic microscope.” Bill gave her an injured look. “Molly, please. We’re artists. We don’t steal for need, we steal for mystery.” “That may be the stupidest thing you’ve ever said, and I once heard you try to flirt in French.” A child in a pirate costume ran across the lawn howling. Three more followed, all waving foam swords. One struck the clown on the shin. The clown folded in half with the silent dignity of a failed politician. Bill studied the scene. 
“Security is collapsing.” “There is no security. It’s a trampoline and some fairy bread.” “Exactly,” Bill said. “Complacency. The deadliest guard of all.” Molly looked at the side gate. Beyond it, the gift table sat only twenty feet from the shrubbery. Between them stood a woman in a sunflower dress, a man wrestling with a barbecue lid, and a Labrador wearing a bow tie and the expression of a dog who knew too much. “The dog worries me,” she said. “The dog is in formalwear. He’s not here to work.” A gust of wind lifted a corner of the tablecloth. One present tumbled off and burst open on the grass. A little girl snatched up the contents—a box of glitter pens—and screamed as if she’d discovered buried treasure. Molly watched the children swarm around her. “You know, they do seem alarmingly happy.” Bill blinked. “That’s temporary. Childhood is basically bad judgment in pigtails.” “You don’t think this is a bit low?” He considered the mountain of gifts, the paper hats, the wobbling magician now trapped in his own curtain. Then his face softened in a way Molly disliked, because it usually meant conscience was about to spoil business. “Maybe,” he admitted. “Maybe it is.” Molly sat back on her heels. “Well. That’s inconvenient.” Bill pointed suddenly toward the buffet table. “Counterproposal.” She followed his finger. Beside the cake sat a stack of catering tins, and in them, like treasure under weak guard, were sausage rolls, mini quiches, and a regiment of iced cupcakes. Molly’s eyes widened. “You want to steal the food?” Bill grinned. “Nobody remembers who gave them a spelling game. Everybody remembers the lunatics who made off with the pastries.” From the lawn came the cry of a child discovering a piñata. The adults surged toward it in a distracted herd. Molly rose, brushing grass from her skirt. “All right. We take the sausage rolls, the cupcakes, and if fate is kind, that wheel of cheese.” Bill adjusted his party hat like a general donning a helmet. 
“Now this,” he said, “is a victimless crime.” The Labrador looked straight at them, barked once, and began to run.
created_at
"2026-03-09T22:38:31.086Z"
model
"openai/gpt-5.4-20260305"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
3129
moderation_latency
(null)
generation_time
28101
tokens_prompt
77
tokens_completion
1144
native_tokens_prompt
66
native_tokens_completion
1018
native_tokens_completion_images
0
native_tokens_reasoning
92
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.015435
router
(null)
provider_responses
0
endpoint_id
"9ff5625c-403f-4d7f-b895-58ac7295062c"
id
"resp_072336f9d2f650f80169af4be72b688194b12cbc1a0d36cb6e"
is_byok
false
latency
282
model_permaslug
"openai/gpt-5.4-20260305"
provider_name
"OpenAI"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer
(null)
api_type
"completions"
id
"gen-1773095911-lxmVw96bfGard51I6w2X"
upstream_id
"resp_072336f9d2f650f80169af4be72b688194b12cbc1a0d36cb6e"
total_cost
0.015435
cache_discount
(null)
upstream_inference_cost
0
provider_name
"OpenAI"
Evaluation details
Result     Evaluator                      Details                           Meta Data
0.00%      Matches word count             n/a                               neededClean: false; words: 694
39.08%     Dialogue to Total Word Ratio   Ratio: 34.33%, Deviation: 4.33%   neededClean: false; wordsTotal: 699; wordsDialogue: 240
19.5410%   Overall
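The ratio and deviation reported above follow directly from the word counts, and the overall figure is consistent with an unweighted mean of the two evaluator scores. That aggregation is an assumption, since the page does not state how the overall score is computed. The minimal Python sketch below reproduces the arithmetic. The variable names are hypothetical, and the 30% target is taken from the scenario description. How the 4.33% deviation maps to the 39.08% evaluator score is not stated on the page, so it is not reproduced here.

```python
# Figures copied from the evaluation details above.
words_dialogue = 240
words_total = 699
target_ratio = 0.30  # "30% dialogue" from the scenario

# Share of the response that is dialogue, and its distance from the target.
ratio = words_dialogue / words_total
deviation = abs(ratio - target_ratio)

# Assumption: the overall score is the unweighted mean of the evaluator scores.
# 39.08 is the rounded value shown on the page, so the mean matches the
# reported 19.5410% only to two decimal places.
evaluator_scores = [0.00, 39.08]
overall = sum(evaluator_scores) / len(evaluator_scores)

print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
print(f"Overall: {overall:.2f}%")
```

The word-count evaluator scores 0.00% because the response runs to roughly 700 words against a requested 500, which pulls the overall figure down to about half of the dialogue-ratio score.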