NC Bench
Run: 2026-03-06-01KK1JAPG2FQPDVGH9XRYNYXK5
Run Details
Model: GPT-5.4
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 32.6s
Cost: $0.017025
Provider: OpenAI
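The reported cost and duration can be sanity-checked against the native token counts listed in this run's metadata (66 prompt tokens, 1124 completion tokens, a generation_time of 32545). The sketch below is only an illustration: the per-million-token rates are assumptions chosen because they reproduce the reported total and do not appear on this page, and reading generation_time as milliseconds is likewise an assumption.

```python
# Values copied from this run's metadata.
NATIVE_PROMPT_TOKENS = 66
NATIVE_COMPLETION_TOKENS = 1124
GENERATION_TIME = 32545          # ASSUMPTION: unit is milliseconds
REPORTED_COST_USD = 0.017025

# ASSUMPTION: hypothetical rates in USD per million tokens (not stated on the page).
ASSUMED_INPUT_RATE = 2.50
ASSUMED_OUTPUT_RATE = 15.00


def estimated_cost(prompt_tokens, completion_tokens, input_rate, output_rate):
    """Cost in USD given token counts and per-million-token rates."""
    return (prompt_tokens * input_rate + completion_tokens * output_rate) / 1_000_000


def tokens_per_second(completion_tokens, generation_time_ms):
    """Average completion throughput over the generation window."""
    return completion_tokens / (generation_time_ms / 1000)


cost = estimated_cost(
    NATIVE_PROMPT_TOKENS, NATIVE_COMPLETION_TOKENS,
    ASSUMED_INPUT_RATE, ASSUMED_OUTPUT_RATE,
)
print(f"estimated cost: ${cost:.6f} (reported: ${REPORTED_COST_USD:.6f})")
print(f"throughput: {tokens_per_second(NATIVE_COMPLETION_TOKENS, GENERATION_TIME):.1f} tokens/s")
```

Under these assumed rates the estimate agrees with the reported cost, and throughput comes out at roughly 34.5 completion tokens per second.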
Evaluation summary
Score | Evaluator
0.00% | Matches word count
0.71% | Dialogue to Total Word Ratio
0.3535% | Overall
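The overall figure appears to be an unweighted mean of the two evaluator scores. That is an inference from the numbers, not something the page states: the mean of the displayed values is 0.355%, so the reported 0.3535% would imply the dialogue score is about 0.707% before rounding to 0.71%. A minimal sketch of that arithmetic, with hypothetical function names:

```python
def overall_score(scores):
    """Unweighted mean of evaluator scores (ASSUMPTION: this is how Overall is derived)."""
    return sum(scores) / len(scores)


def implied_second_score(overall, first_score):
    """Solve the two-score mean for the unrounded second score."""
    return 2 * overall - first_score


# Mean of the displayed (rounded) scores, in percent.
print(round(overall_score([0.00, 0.71]), 4))

# Unrounded dialogue score implied by the reported overall of 0.3535%.
print(round(implied_second_score(0.3535, 0.00), 4))
```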
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly and Bill crouched behind the municipal rosebushes, which had seen better summers and perhaps better criminals. Beyond the hedge, the children’s party blazed in the church hall garden with pastel bunting, shrieking games, and a trestle table stacked so high with wrapped presents it looked like a department store had crashed into a rainbow. Bill adjusted the tiny pair of binoculars he’d bought from a petrol station. “I’m just saying,” he whispered, “morally, this may be our most challenging job.” Molly peered over the hedge, expression cool and businesslike. “Bill, last month you stole a ceremonial goat from a mayor.” “That goat was corrupt.” “Presents can’t be corrupt.” “They can if they’re educational.” A clown on stilts wobbled past the gift table like an omen no sensible person would ignore. Molly consulted a hand-drawn map on the back of a takeaway menu. Her plan had arrows, timings, and one note that simply said Avoid magician. “Right,” she said. “The table is six metres from the gate. Cake station to the left, inflatable castle to the right. Parents concentrated near the prosecco tub. Children currently occupied with something involving a parachute and screaming.” Bill nodded solemnly. “Classic battlefield confusion.” “Exactly. At two-thirty, the entertainer starts balloon animals. That’s our window. You go in disguised as an uncle no one remembers. I’ll cut behind the herb garden, lift the lot into the laundry basket, and we’re out through the side alley.” Bill lowered the binoculars. “Have you noticed all the presents have names on them?” Molly looked at him. “Yes, Bill. That’s how presents work.” “It adds a layer of intimacy I’m not used to.” “Stealing silver candlesticks with monograms didn’t trouble you.” “Those were adults. Adults understand disappointment.” From the lawn came a shriek of delight as a woman in a mermaid costume produced bubbles the size of dinner plates. Bill watched them drift. “I had parties like this once,” he said wistfully. 
“Pass the Parcel. Jelly. An uncle who smelt of aquarium gravel.” Molly softened for nearly half a second. “Did you get nice presents?” “One year I got a detective kit. Took fingerprints off the cat for a month.” “And now look at you. Full circle.” He sighed. “Do you ever worry we’re products of our environment?” “I worry you’ll ruin this with your conscience.” A small boy in a paper crown ran to the gift table, added another parcel, and scampered off. Bill flinched. “That one looked hand-wrapped.” “All the better resale value.” “You can’t resell a hand-knitted dinosaur jumper.” “Watch me.” She folded the map and tucked it into her sleeve. The church hall doors opened and a harassed father emerged carrying three pizzas and the thousand-yard stare of a man who had organized musical chairs for twenty-seven six-year-olds. Molly tapped Bill’s shoulder. “Showtime. Chin up. Try to look like you’ve paid for snacks.” “What does that look like?” “Resentful.” Bill straightened his jacket and picked a stray leaf from his lapel. “If anyone asks, I’m Uncle Brian from Swindon.” “There is no Swindon energy in you. Less apologetic. More vague.” He took a breath. “Right. Uncle Brian. Vague, burdened, possibly divorced.” “Perfect.” They slipped from behind the hedge and crossed the lane with the brisk, invisible confidence of people who had no business being anywhere near a bouncy castle. At the gate, a grandmother in a cardigan shaped like a moral judgment smiled at Bill. “Oh good,” she said. “You must be Pippa’s father.” Bill froze. Molly answered instantly. “Separated,” she said, with a tragic nod. The grandmother’s face rearranged itself into welcome and gossip. “Say no more.” Inside the garden, chaos embraced them like family. Children thundered past with pirate hats and melted chocolate on their cheeks. The clown folded a balloon poodle that resembled a plumbing emergency. Bill leaned close. 
“I’ve got eyes on the target.” “Then stop narrating and move.” He drifted toward the present table, adopting the hollow stare of a man who had recently discussed school catchment areas. Molly veered toward the side path, seized an abandoned laundry basket full of plastic swords, tipped them neatly into a shrub, and slid into position. “Now,” she hissed. Bill reached the table, smiled at nobody, and began loading presents into the basket with alarming speed. A little girl appeared at his elbow. “What are you doing?” Bill blinked. “Quality control.” She considered this. “Mum says that’s important.” “Your mum is absolutely right.” Molly swept in, lifted the basket handles, and flashed the child a conspiratorial grin. “Special present inspection unit.” The girl gasped. “Are you from the council?” “Worse,” said Molly, and marched for the gate. For one miraculous second, it almost worked. Then the magician spotted them, pointed with a wand, and shouted, “Oi! That’s the presents!” Every adult turned. Bill broke into a run beside Molly. Behind them rose uproar, indignation, and one oddly supportive cheer from a child who clearly admired initiative.
created_at: "2026-03-06T12:35:30.953Z" (3/6/2026, 12:35:30 PM)
model: "openai/gpt-5.4-20260305"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 379
moderation_latency: (null)
generation_time: 32545
tokens_prompt: 77
tokens_completion: 1445
native_tokens_prompt: 66
native_tokens_completion: 1124
native_tokens_completion_images: 0
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.017025
router: (null)
provider_responses:
  0:
    endpoint_id: "9ff5625c-403f-4d7f-b895-58ac7295062c"
    id: "resp_0148df8d6b6e6eaa0169aaca13045c819480d070880879a24c"
    is_byok: false
    latency: 172
    model_permaslug: "openai/gpt-5.4-20260305"
    provider_name: "OpenAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1772800530-algffD3DPHDTRNmshGBI"
upstream_id: "resp_0148df8d6b6e6eaa0169aaca13045c819480d070880879a24c"
total_cost: 0.017025
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false; words: 823
0.71% | Dialogue to Total Word Ratio | Ratio: 39.95%, Deviation: 9.95% | neededClean: false; wordsTotal: 831; wordsDialogue: 332
0.3535% | Overall | |
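The ratio and deviation above follow directly from the reported counts: 332 dialogue words out of 831 total is 39.95%, which is 9.95 percentage points above the 30% target in the scenario. The sketch below reproduces that arithmetic and shows one plausible way to count dialogue words by treating text inside curly quotation marks as dialogue. The counting rules are an assumption; the benchmark's actual tokenization is not shown on this page, and the two evaluators report different totals (823 and 831), which suggests they do not count words identically.

```python
import re


def count_words(text):
    """Whitespace-delimited word count (ASSUMPTION: the benchmark may count differently)."""
    return len(text.split())


def count_dialogue_words(text):
    """Count words inside curly double quotes, treated here as dialogue."""
    spans = re.findall(r"“([^”]*)”", text)
    return sum(count_words(span) for span in spans)


def ratio_report(words_dialogue, words_total, target_pct):
    """Return (dialogue ratio in percent, absolute deviation from target), rounded to 2 places."""
    ratio = 100 * words_dialogue / words_total
    return round(ratio, 2), round(abs(ratio - target_pct), 2)


# Figures from the evaluation details, against the scenario's 30% target.
print(ratio_report(332, 831, 30.0))

# Tiny hand-checkable sample: 3 dialogue words out of 5 total.
sample = "“Hello there,” she said. “Go.”"
print(count_dialogue_words(sample), count_words(sample))
```

For comparison, the word-count evaluator's figure of 823 words is about 65% above the requested 500, consistent with its 0.00% score.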