NC Bench
Run: 2026-03-06-01KK1HP5TX8GAYZY887G5BXPHD
Run Details
Model: GPT-5.4 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 37.1s
Cost: $0.016785
Provider: OpenAI
Evaluation summary
Score | Evaluator | Details Link
0.00% | Matches word count | (details)
13.29% | Dialogue to Total Word Ratio | (details)
6.6473% | Overall |
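The Overall figure is consistent with an unweighted mean of the two evaluator scores. The page does not state how Overall is computed, so the averaging rule below is an assumption, and the inputs are the rounded percentages as displayed. A minimal Python check:

```python
from statistics import mean

# Evaluator scores exactly as displayed on the page, in percent.
scores = {
    "Matches word count": 0.00,
    "Dialogue to Total Word Ratio": 13.29,
}

# Assumption: Overall is the unweighted mean of the evaluator scores.
# The page does not document the aggregation rule.
overall = mean(list(scores.values()))

displayed_overall = 6.6473  # value shown on the page
print(f"mean of displayed scores: {overall:.4f}%")
print(f"difference from displayed Overall: {abs(overall - displayed_overall):.4f}")
```

The small remaining difference is what one would expect if the dialogue score is stored with more decimals than the two shown.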
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly crouched behind the hedge with the expression of a woman evaluating a bank vault, though the target tonight was a semi-detached house decorated with pink balloons, a cardboard unicorn, and a banner that read HAPPY 7TH, ELSIE! in letters so cheerful they felt personally insulting. Inside the back garden, a dozen children ricocheted between sugar highs and minor collisions. A man in a pirate hat was trying to organize a treasure hunt and visibly losing the will to live. On the patio, beneath a table loaded with sausage rolls and limp cucumber sticks, sat the true prize: a mountain of presents wrapped in metallic paper, stacked beside a paddling pool full of screaming toddlers.
Bill lowered the binoculars. “I’ll say this for rich people, they do know how to leave loot in plain sight.”
“They’re not rich,” Molly muttered. “They’re optimistic. Different disease.”
Bill adjusted his caterer’s jacket, which strained around his middle with the desperation of a last thread on a chandelier. “I still don’t understand why children need this many gifts. At seven I got a boiled sweet and a cautionary look.”
“At seven,” Molly said, checking her watch, “you probably tried to pawn the sweet.”
“I was entrepreneurial.”
“You were banned from a fête.”
He considered that. “Unfairly. One incident with the tombola and suddenly you’re a villain for life.”
Molly parted the hedge and studied the route again. Kitchen door propped open. Hallway visible. Gift pile six yards from the fence, three from the buffet, one exhausted spaniel asleep beneath it like an underpaid guard. The clown had arrived twenty minutes ago and was currently making balloon animals with the dead-eyed sadness of a man inflating his own regrets.
“We go in during cake,” Molly said. “Maximum chaos, minimum supervision. Parents all pointing cameras, children chanting for sugar, nobody notices two extra adults moving parcels.”
Bill frowned. “I hate the cake phase. There’s always singing.”
“You hate all phases.”
“I’m selective.”
A shriek rose from the garden as one child whacked another with a foam sword. The pirate host pretended not to see. Molly smiled professionally.
“Right,” she said. “You carry. I distract.”
Bill put a hand to his chest. “I carry? Molly, those presents could contain anything. Board games. Science kits. One of those little electric cars. I’m not ruining my back for a generation that can’t say thank you.”
“You’re not ruining your back,” she said. “You’re preserving my speed. Also, if there is an electric car in there, I call it.”
He looked offended. “On what grounds?”
“Superior taste.”
“Favouritism.”
“Experience.”
Bill sniffed. “I once stole a pianola during a fireworks display.”
“And dropped it in a duck pond.”
“It was dark.”
Molly pulled two paper party hats from her bag and handed him one. “Put it on.”
He stared at it. “I have a reputation.”
“You’re dressed as catering staff in a hedge.”
He put on the hat. “I feel ridiculous.”
“You look correct.”
From inside the house came the unmistakable clatter of someone bringing out the birthday cake. The garden changed temperature instantly. Parents converged. Children swarmed. The pirate shouted, “Everyone to the table!” with the desperation of a man begging for evacuation.
Molly’s eyes sharpened. “This is it.”
Bill peered through the hedge. “What about the dog?”
“The dog is asleep.”
“The dog may wake.”
“Then improvise.”
“I don’t like improvising with animals. They never respect the craft.”
Molly slipped through the side gate with the easy confidence of a woman entering a gala she intended to rob blind. Bill lumbered after her, balancing a silver tray for legitimacy. They passed the buffet just as the first warbling notes of Happy Birthday began, sung by thirty people in mutually hostile keys.
“Smile,” Molly murmured.
“I am smiling.”
“You look winded.”
“I am winded.”
They reached the gift table. Up close it was magnificent: dinosaurs, dolls, mystery boxes, suspiciously expensive electronics disguised in cartoon paper. Bill’s eyes shone with almost paternal pride.
“Good grief,” he whispered. “It’s Christmas for one very small tyrant.”
“Start loading.”
He swept parcels into a black catering sack hidden beneath the tray. Molly snatched the larger ones with ruthless efficiency, moving like a magpie with professional standards. The dog snored on.
“Blue one,” Bill said. “Definitely a console.”
“Take it.”
“Rocket-shaped one?”
“Take everything not nailed down.”
A little voice behind them said, “That one’s from Nana.”
Molly and Bill froze. A girl in a paper crown stood three feet away, cake knife in hand, frosting on her chin, regarding them with the calm curiosity of royalty catching servants pocketing silver.
Bill recovered first. “Quality control.”
Molly nodded briskly. “Checking for dangerous corners.”
The girl narrowed her eyes. “In the presents?”
“Especially in the presents,” Molly said.
From the table the singing reached its final howl. The adults burst into applause. Bill clutched the sack, Molly squared her shoulders, and both thieves prepared, with professional despair, to lie to a child.
created_at: "2026-03-06T12:24:18.532Z" (3/6/2026, 12:24:18 PM)
model: "openai/gpt-5.4-20260305"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 492
moderation_latency: (null)
generation_time: 37025
tokens_prompt: 77
tokens_completion: 1469
native_tokens_prompt: 66
native_tokens_completion: 1108
native_tokens_completion_images: 0
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.016785
router: (null)
provider_responses:
  0:
    endpoint_id: "9ff5625c-403f-4d7f-b895-58ac7295062c"
    id: "resp_0592606f89b477280169aac772977081909547d643e34bf1f1"
    is_byok: false
    latency: 154
    model_permaslug: "openai/gpt-5.4-20260305"
    provider_name: "OpenAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1772799858-b4ZXuBTaMUtLSUahTHjU"
upstream_id: "resp_0592606f89b477280169aac772977081909547d643e34bf1f1"
total_cost: 0.016785
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 826
13.29% | Dialogue to Total Word Ratio | Ratio: 36.35%, Deviation: 6.35% | neededClean: false, wordsTotal: 828, wordsDialogue: 301
6.6473% | Overall | |
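The reported ratio and deviation can be reproduced from the word counts in the Meta Data column. The sketch below is illustrative only: the function names, the whitespace tokenization, and the rule of counting words inside curly quotes are assumptions, not the evaluator's actual implementation. The two evaluators report 826 and 828 total words for the same response, so their counting rules evidently differ. The mapping from a 6.35-point deviation to the 13.29% score is not shown on the page and is not modeled here.

```python
import re

def count_words(text: str) -> int:
    """Count whitespace-separated tokens (assumed tokenization)."""
    return len(text.split())

def count_dialogue_words(text: str) -> int:
    """Count words inside curly double quotes (assumed dialogue rule)."""
    spans = re.findall(r"“([^”]*)”", text)
    return sum(count_words(span) for span in spans)

def ratio_and_deviation(dialogue: int, total: int, target_pct: float = 30.0):
    """Return (dialogue share in percent, absolute gap to the target)."""
    ratio = 100.0 * dialogue / total
    return ratio, abs(ratio - target_pct)

# Values taken from the evaluation details: wordsDialogue and wordsTotal.
ratio, deviation = ratio_and_deviation(301, 828)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")

# A short excerpt in the style of the response, to exercise the counters.
sample = "Bill lowered the binoculars. “This is it.” Molly nodded. “The dog is asleep.”"
print(count_dialogue_words(sample), count_words(sample))
```

With the page's counts, the first print matches the Details column (Ratio: 36.35%, Deviation: 6.35%) against the 30% target in the scenario.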