NC Bench
Run: 2026-03-09-01KK9QFEJS8AX2MYHA6Z7EXQB9
Run Details
Model: GPT-5.4 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 37.7s
Cost: $0.017745
Provider: OpenAI
Evaluation summary
Matches word count: 0.00%
Dialogue to Total Word Ratio: 59.53%
Overall: 29.7648%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
AI
The van idled beneath a banner of municipal disapproval that read SLOW FOR CHILDREN, which Bill considered less a road sign than a tactical insult. Across the street, number eighteen blazed with balloons, paper streamers, and the sort of noisy joy that suggested cake had already entered the bloodstream.
Molly adjusted the tiny binoculars she’d bought from a museum gift shop, because nothing said professional criminal like equipment packaged for bird-loving retirees. Through the lace-curtained front window, she could see a mountain of presents stacked beside a table wearing a dinosaur cloth and several bowls of weaponized sugar.
“Bill,” she said, “tell me again why we’re robbing a six-year-old.”
“We are not robbing a six-year-old,” Bill said, offended. “We are intercepting unguarded luxury goods before they are smeared with jam.”
“Luxury goods?”
“Toy robots, remote-control cars, a chemistry set, three boxes the exact size of expensive electronics, and one suspiciously large gift bag with glitter tissue. Glitter is the wrapping paper of money.”
A child shrieked inside. Another answered in a tone usually reserved for air-raid sirens.
Molly lowered the binoculars. “I hate this already.”
“You hate all jobs before the money appears.”
“I liked the aquarium job.”
“You liked the gift shop.”
“It was soothing, Bill. Postcards. Plush octopuses. No one screamed because someone else got the blue plate.”
Bill unfolded the hand-drawn map on his knees. It was a masterpiece of bad information. The house looked less like architecture and more like a cow standing on a trampoline. “Entry point,” he said, tapping a blob marked KITCHEN???, “is the side gate. We mingle with the adults, reach the present pile, deploy the sacks, and retreat during pass-the-parcel.”
“Your plan depends on adults at a children’s party paying no attention.”
“Have you met adults at children’s parties? They are spiritually absent by the second balloon animal.”
Molly considered this. From the front garden, a man in a paper hat carried out a tray of orange drinks with the expression of someone serving the army during a siege. “All right,” she said. “Disguises?”
Bill held up two cone hats and a roll of stickers. “Festive infiltration.”
“That is not a disguise. That is surrender.”
“It’s context, Molly. If people see a cone hat, they think, Ah yes, a fellow idiot.”
She stared at him. “And what exactly am I?”
“The aunt from Leeds.”
“I don’t sound like I’m from Leeds.”
“You don’t need to. Adults never ask follow-up questions at parties. They only say things like, ‘Whose is this one?’ and ‘Don’t lick the magician.’”
Molly snorted despite herself. “And you?”
“I am Bill.”
“That’s not a role.”
“It’s my greatest performance. I look as if I belong nowhere and therefore everywhere.”
From inside the house came the muffled boom of a hired entertainer losing control of a room. A ripple of cheers followed, then the distinct clatter of something breakable becoming theoretical.
Molly took a breath. “Suppose we get cornered by a parent.”
“Easy,” Bill said. “We complain about parking.”
“Suppose we get cornered by a child.”
“We ask where the bathroom is and leave before they answer.”
“Suppose the magician sees us.”
Bill’s confidence faltered. “Then we run. Men in capes are unpredictable.”
She smiled, sharp and quick. “You know what the real problem is?”
“What?”
“The presents are all wrapped. We won’t know what’s worth taking.”
Bill leaned back, grinning like a saint of poor decisions. “Molly, if there is one universal truth, it is this: at a child’s party, every box is worth taking from somebody.”
Across the street, the front door opened. A woman in a crown of curls shouted, “Has anyone seen the spare candles?”
Bill slapped the cone hat onto his head. Molly sighed, reached for the stickers, and opened the van door. “Right,” she said. “Let’s go become somebody’s least favorite memory.”
created_at: "2026-03-09T16:39:25.027Z" (3/9/2026, 4:39:25 PM)
model: "openai/gpt-5.4-20260305"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 9702
moderation_latency: (null)
generation_time: 37568
tokens_prompt: 77
tokens_completion: 1254
native_tokens_prompt: 66
native_tokens_completion: 1172
native_tokens_completion_images: 0
native_tokens_reasoning: 296
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.017745
router: (null)
provider_responses:
  0:
    endpoint_id: "9ff5625c-403f-4d7f-b895-58ac7295062c"
    id: "resp_0a3978aef48ce2b10169aef7bd1cd8819689285d1db7ace792"
    is_byok: false
    latency: 181
    model_permaslug: "openai/gpt-5.4-20260305"
    provider_name: "OpenAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1773074365-Gv3I1cgb50prHRt0e1mv"
upstream_id: "resp_0a3978aef48ce2b10169aef7bd1cd8819689285d1db7ace792"
total_cost: 0.017745
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
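The timing and token fields in the metadata can be combined into a rough throughput figure. The sketch below is illustrative only: it assumes generation_time is in milliseconds (37568 lines up with the reported 37.7s duration), and the helper name tokens_per_second is made up for this example, not part of NC Bench or the provider API.

```python
# Values copied from the run metadata.
GENERATION_TIME_MS = 37568        # assumption: unit is milliseconds
NATIVE_TOKENS_COMPLETION = 1172


def tokens_per_second(tokens: int, elapsed_ms: int) -> float:
    """Average completion tokens generated per second of generation time."""
    if elapsed_ms <= 0:
        raise ValueError("elapsed_ms must be positive")
    return tokens / (elapsed_ms / 1000.0)


throughput = tokens_per_second(NATIVE_TOKENS_COMPLETION, GENERATION_TIME_MS)
print(f"{throughput:.1f} completion tokens per second")
```

This is an average over the whole generation window, so it says nothing about time to first token, which the separate latency field would cover.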
Evaluation details
Matches word count: 0.00%
  Details: n/a
  Meta Data: neededClean: false, words: 640
Dialogue to Total Word Ratio: 59.53%
  Details: Ratio: 53.22%, Deviation: 3.22%
  Meta Data: neededClean: false, wordsTotal: 652, wordsDialogue: 347
Overall: 29.7648%
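The reported ratio and deviation can be re-derived from the word counts in the evaluation metadata. The sketch below is a minimal check, not the evaluator's actual code: the function names are hypothetical, the 50% target is taken from the scenario text, and treating the overall score as an unweighted mean of the two evaluator scores is an assumption that happens to agree with the reported figure to within rounding. The page does not show how the 3.22% deviation is mapped to the 59.53% score, so that step is left out.

```python
def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Fraction of all words that fall inside dialogue."""
    return words_dialogue / words_total


def deviation_from_target(ratio: float, target: float = 0.5) -> float:
    """Absolute distance from the target ratio (50% in this scenario)."""
    return abs(ratio - target)


def unweighted_mean(scores: list[float]) -> float:
    """Assumed aggregation: plain average of the evaluator scores."""
    return sum(scores) / len(scores)


# Values copied from the evaluation details.
ratio = dialogue_ratio(words_dialogue=347, words_total=652)
deviation = deviation_from_target(ratio)
overall = unweighted_mean([0.00, 59.53])

print(f"ratio: {ratio:.2%}, deviation: {deviation:.2%}, overall: {overall:.3f}%")
```

Averaging the rounded 59.53% gives a value slightly different in the last digit from the reported 29.7648%, which is what one would expect if the page averages the unrounded score.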