NC Bench
Run: 2026-03-09-01KKAC2YG1SRFNFD1T393V8P7D
Run Details
Model: GPT-5.4 (Reasoning, Low)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 29.6s
Cost: $0.015810
Provider: OpenAI
Evaluation summary
Matches word count: 0.00%
Dialogue to Total Word Ratio: 20.72%
Overall: 10.3588%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly and Bill crouched behind the inflatable castle van, which was painted with smiling dragons and the words BOUNCE-TASTIC KINGDOM in letters so cheerful they felt personally insulting. From the church hall came the shrieks of sugar-drunk children and the muffled pulse of a novelty dance playlist. Every few seconds the door swung open, releasing a blast of warm air scented with pizza, icing, and the sort of sticky optimism Molly had spent her adult life avoiding. Bill consulted a folded hand-drawn map with the gravity of an admiral preparing for naval war. Unfortunately, the map was on the back of a takeaway menu, and one corner was greasy with garlic sauce. “The presents are stacked on the left of the stage,” he whispered. “Under the banner saying HAPPY BIRTHDAY, OSCAR!!! Three exclamation marks. Wealthy family.” Molly peered round the van. Through the window she could see a mountain of brightly wrapped gifts, glowing under disco lights like treasure designed by maniacs. “Children don’t receive gifts,” she muttered. “They receive evidence. Every toy in there makes a noise, leaks glitter, or requires batteries shaped like emotional blackmail.” Bill looked hurt. “A present is a present.” “A diamond necklace is a present. A remote-control dinosaur that roars in Spanish is a domestic incident.” He tapped the map. “Focus. We enter through the side kitchen. Cross behind the magician during the applause. Bag the loot. Exit through the car park.” Molly narrowed her eyes. “You said there’d be no magician.” “There wasn’t supposed to be.” Inside, a voice boomed, “And for my next trick, who wants to see a rabbit in a hat?” Twenty children screamed as though salvation had arrived wearing sequins. Bill winced. 
“Improvisation is the soul of crime.” “No,” said Molly, “improvisation is the reason you once stole a wax museum fire extinguisher.” “That was under pressure.” “That was under signage.” A little girl in fairy wings skipped past the van carrying a paper cup of orange squash. Molly and Bill froze until she vanished around the corner, humming to herself. Bill slowly exhaled. “We need disguises,” he said. “We’re standing behind a van full of inflatable castles. If this gets any more subtle, we’ll be arrested by irony.” Bill reached into a duffel bag and proudly produced two oversized mascot heads: one lion, one rabbit. Molly stared at the lion’s vacant stitched grin. “You robbed a theme park, didn’t you?” “Borrowed from a costume hire depot.” “You robbed a costume hire depot.” “Temporarily misallocated assets.” Molly snatched the lion head. “Fine. I’m not being the rabbit. Rabbits are suspicious.” “Since when?” “Since magicians. Keep up.” They pulled on the costumes. The bodies were too short in the arms and too broad in the hips, giving both thieves the shape of depressed toddlers. Bill, inside the rabbit, looked especially untrustworthy, which was impressive because he usually looked untrustworthy dressed as himself. They slipped through the side door into the kitchen, stepping around trays of crisps and a woman icing cupcakes with the dead-eyed speed of somebody who had given up on elegance two hours ago. She glanced at them once and merely said, “You’re early,” before piping another violent blue rose. “See?” Bill whispered. “Perfect cover.” “Incredible,” Molly said. “We’ve become background nonsense.” They edged into the hall. Onstage, the magician was extracting scarves from somewhere anatomically distressing. Children clapped. Parents filmed with the weary reverence of tourists at an active volcano. The presents sat ten yards away, unattended except for one boy in a paper crown who was poking a box and asking if it was alive. “Oscar,” said Bill. 
“Target adjacent.” “Never call a child a target.” “What do I call him?” “A complication with jelly on his face.” They shuffled closer, waving like hired entertainment. Two toddlers waved back with solemn awe. One hugged Molly’s lion leg. She patted his head in panic, then immediately checked her pocket to make sure he hadn’t pickpocketed her out of instinct. At the gift pile Bill opened the sack. “Go, go, go.” Molly began sweeping in parcels. “Who wraps a yo-yo in this much tape?” “People with trust issues.” A voice behind them said, “Mummy, why is the rabbit stealing my scooter?” Molly and Bill turned slowly. Half the party was looking at them now. The magician lowered his wand. Bill straightened, clutching three presents and a scooter with pink streamers. “Good news,” he whispered. “Diversion phase has begun.” “There was no diversion phase.” “There is now.” And then the rabbit ran for it.
created_at: "2026-03-09T22:39:35.436Z" (3/9/2026, 10:39:35 PM)
model: "openai/gpt-5.4-20260305"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 2841
moderation_latency: (null)
generation_time: 29474
tokens_prompt: 77
tokens_completion: 1424
native_tokens_prompt: 66
native_tokens_completion: 1043
native_tokens_completion_images: 0
native_tokens_reasoning: 33
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.01581
router: (null)
provider_responses: 0
endpoint_id: "9ff5625c-403f-4d7f-b895-58ac7295062c"
id: "resp_038e0a578e0ccf300169af4c2798f881938c094b5ad85ab61e"
is_byok: false
latency: 274
model_permaslug: "openai/gpt-5.4-20260305"
provider_name: "OpenAI"
status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1773095975-yW6fvHd8JCrKFdd2O9Xn"
upstream_id: "resp_038e0a578e0ccf300169af4c2798f881938c094b5ad85ab61e"
total_cost: 0.01581
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
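The timing and token fields above can be combined into a rough throughput figure. The following is a minimal sketch, not part of the benchmark itself: it assumes generation_time is reported in milliseconds (which is consistent with the 29.6s Duration shown in the run details) and that native_tokens_completion is the relevant output token count. The function name tokens_per_second is hypothetical.

```python
# Values copied from the run metadata above.
GENERATION_TIME_MS = 29474        # assumption: the unit is milliseconds
NATIVE_TOKENS_COMPLETION = 1043   # provider-side completion token count


def tokens_per_second(tokens: int, elapsed_ms: int) -> float:
    """Average generation throughput in tokens per second."""
    if elapsed_ms <= 0:
        raise ValueError("elapsed_ms must be positive")
    return tokens / (elapsed_ms / 1000.0)


throughput = tokens_per_second(NATIVE_TOKENS_COMPLETION, GENERATION_TIME_MS)
print(f"{throughput:.1f} tokens/s")
```

This is only an average over the whole generation; it says nothing about time to first token, which the separate latency field appears to cover.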
Evaluation details
Evaluator: Matches word count
Result: 0.00%
Details: n/a
Meta Data: neededClean = false, words = 755
Evaluator: Dialogue to Total Word Ratio
Result: 20.72%
Details: Ratio: 35.61%, Deviation: 5.61%
Meta Data: neededClean = false, wordsTotal = 761, wordsDialogue = 271
Overall: 10.3588%
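The reported ratio and deviation can be reproduced from the word counts in the evaluator metadata. The sketch below is an illustration under stated assumptions, not the evaluator's actual code: it assumes the ratio is wordsDialogue divided by wordsTotal, that the deviation is the absolute difference from the 30% target in the scenario, and that the overall figure is the plain mean of the two evaluator scores. How the 20.72% score is derived from the 5.61% deviation is not shown on the page, so it is not modelled here. All function names are hypothetical.

```python
def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Share of dialogue words, as a percentage of all words."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return 100.0 * words_dialogue / words_total


def deviation_from_target(ratio_pct: float, target_pct: float) -> float:
    """Absolute distance from the target ratio, in percentage points."""
    return abs(ratio_pct - target_pct)


# Word counts taken from the evaluator metadata; 30% is the scenario target.
ratio = dialogue_ratio(words_dialogue=271, words_total=761)
deviation = deviation_from_target(ratio, target_pct=30.0)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
# prints "Ratio: 35.61%, Deviation: 5.61%"

# Assumption: the overall score is the unweighted mean of the evaluator scores.
scores = [0.00, 20.72]
overall = sum(scores) / len(scores)
print(f"Overall (from rounded scores): {overall:.2f}%")
```

The mean of the rounded scores is 10.36%, which agrees with the reported 10.3588% to within rounding and suggests the page averages the unrounded values.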