NC Bench
Run: 2026-03-09-01KK9Q7Y9D8HQ70KQEQ03HJB9F
Run Details
Model: GPT-5.4 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 29.6s
Cost: $0.016830
Provider: OpenAI
Evaluation summary
Score     Evaluator
0.00%     Matches word count
35.47%    Dialogue to Total Word Ratio
17.7352%  Overall
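The Overall figure is consistent with an unweighted mean of the two evaluator scores, although the page does not state the aggregation rule. The sketch below checks that reading; the function name `overall_score` and the averaging rule itself are assumptions, not something taken from NC Bench.

```python
def overall_score(scores):
    """Unweighted mean of evaluator scores in percent (assumed aggregation rule)."""
    if not scores:
        raise ValueError("at least one evaluator score is required")
    return sum(scores) / len(scores)


# Scores as displayed in the summary, already rounded to two decimals.
displayed_scores = [0.00, 35.47]
print(f"{overall_score(displayed_scores):.4f}%")
```

With the displayed values this gives about 17.735%, which agrees with the reported 17.7352% up to the rounding of the 35.47% input.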
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Rain polished the alley behind Mrs. Wimple’s Community Hall until the bins shone like guilty emeralds. Above them, paper lanterns bobbed in the upstairs windows, and every few seconds the glass flashed with the colors of a magician’s wand or a balloon sword duel gone terribly diplomatic. Somewhere inside, twenty children were shrieking in the key of sugar. Molly crouched beneath a sagging poster for tap-dancing lessons and peered at the kitchen door through opera glasses she’d stolen from a disappointed aunt in 2018. Bill stood beside her in a caterer’s jacket that fit like a pillowcase on a lamp. “This,” said Molly, lowering the glasses, “is either our finest hour or the exact moment we should have become florists.” Bill adjusted his paper hat. “Florists get up at dawn. Criminals have standards.” The wind carried the smell of icing, sausage rolls, and fresh wrapping paper. Across the alley, a cardboard pirate was taped to a window, smiling with all the confidence of a man who had never met actual thieves. Molly unfolded the plan from her pocket. It had once been a bus timetable and still contained, inconveniently, the 42 to Dagenham. She tapped it with a gloved finger. “Kitchen entrance here. Hallway here. Present table near the bouncy castle. We go in with the cake trolley, nod like professionals, fill the empty cake boxes with gifts, and leave before anyone notices that the magician has accidentally set fire to his own scarf again.” Bill frowned. “And if there isn’t a cake trolley?” “There is always a cake trolley. Civilisation depends on it.” From upstairs came a mighty chorus of children chanting for someone named Trevor to open the big one. Bill visibly blanched. “That’s our problem right there,” he whispered. “A party reaches present-opening and suddenly time matters.” Molly looked up at the lit windows, calculating. She had the expression of a woman who could estimate the resale value of a toy robot by the sound of its batteries. “Then we move fast. 
No dithering, no souvenirs, and absolutely no getting emotionally attached to anything plush.” Bill held up both hands. “One time. One time I rescued a stuffed giraffe from a skip.” “You named it Admiral Longneck and tried to teach it cards.” “He was a quick study.” She folded the plan and slipped it away as the kitchen door banged open. A harassed man in an apron stumbled out, carrying a tower of pizza boxes and muttering about napkins, children, and the end of patience as a concept. He propped the door with his foot, shouted for someone called Denise, then disappeared back inside, forgetting both the door and the boxes. Molly and Bill stared. Bill smiled first. “The universe has sent us room service.” “Don’t get spiritual on me,” Molly said, though she was already moving. They crossed the alley with the solemn speed of people pretending not to run. Inside the kitchen, steam clouded the windows and three adults were arguing over candles shaped like dinosaurs. Nobody looked at them. Molly slid the pizza boxes onto a rack, seized two empty cake cartons, and marched for the hall. Bill followed, carrying himself with the hunched confidence of a man who hoped invisibility might be contagious. The party exploded around them. Balloons drifted at ankle height. A clown on stilts drank tea in a corner with the grave expression of an exhausted bishop. At the far end of the room stood the present table, glittering beneath a banner that read HAPPY 7TH BIRTHDAY, OLIVE! and sagged on the Y like a warning. “Remember,” Molly murmured, smiling at a passing grandmother, “we are invisible professionals.” Bill smiled back at the grandmother so hard she recoiled. “I’m nailing it.” They reached the table. Molly flipped open a carton and began transferring parcels with crisp, efficient hands: dinosaurs, dolls, suspiciously expensive plastic bricks. Bill grabbed a gift shaped like a telescope and another that rattled ominously. Then a small voice piped up behind them. 
“That one’s mine,” said a little boy in a paper crown, pointing at Bill’s carton. “I know because I wrapped it badly.” Bill turned with the expression of a man caught burgling by a tax inspector. Molly, without missing a beat, bent to the boy’s level. “Excellent,” she said. “Quality control. You can never be too careful with wrapping.”
created_at: "2026-03-09T16:35:18.972Z" (3/9/2026, 4:35:18 PM)
model: "openai/gpt-5.4-20260305"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 4133
moderation_latency: (null)
generation_time: 29537
tokens_prompt: 77
tokens_completion: 1164
native_tokens_prompt: 66
native_tokens_completion: 1111
native_tokens_completion_images: 0
native_tokens_reasoning: 179
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.01683
router: (null)
provider_responses:
  0:
    endpoint_id: "9ff5625c-403f-4d7f-b895-58ac7295062c"
    id: "resp_0df0a386cfcfb5ee0169aef6c71fe48196ba349ed70f501a08"
    is_byok: false
    latency: 301
    model_permaslug: "openai/gpt-5.4-20260305"
    provider_name: "OpenAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1773074118-t9b0bOWBFSryNcphV1Fu"
upstream_id: "resp_0df0a386cfcfb5ee0169aef6c71fe48196ba349ed70f501a08"
total_cost: 0.01683
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
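The reported total_cost of 0.01683 can be sanity-checked against the native token counts. The metadata does not include per-token prices, so the rates in this sketch (2.50 USD per million prompt tokens, 15.00 USD per million completion tokens) are assumptions, chosen only because they reproduce the reported figure. It also assumes that the 179 reasoning tokens are already counted inside native_tokens_completion. The name `estimate_cost` is hypothetical.

```python
from decimal import Decimal

# ASSUMED rates, not taken from the run metadata.
INPUT_USD_PER_MILLION = Decimal("2.50")
OUTPUT_USD_PER_MILLION = Decimal("15.00")


def estimate_cost(native_prompt_tokens, native_completion_tokens):
    """Estimate request cost in USD from native token counts and assumed rates."""
    million = Decimal(1_000_000)
    prompt_cost = native_prompt_tokens * INPUT_USD_PER_MILLION / million
    completion_cost = native_completion_tokens * OUTPUT_USD_PER_MILLION / million
    return prompt_cost + completion_cost


# native_tokens_prompt = 66, native_tokens_completion = 1111
print(estimate_cost(66, 1111))
```

Under these assumed rates the prompt contributes 0.000165 USD and the completion 0.016665 USD, which sums to the reported 0.01683.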
Evaluation details
Result    Evaluator                     Details                          Meta Data
0.00%     Matches word count            n/a                              neededClean: false, words: 725
35.47%    Dialogue to Total Word Ratio  Ratio: 25.45%, Deviation: 4.55%  neededClean: false, wordsTotal: 727, wordsDialogue: 185
17.7352%  Overall
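The Ratio and Deviation values in the Details column follow from the meta data: 185 dialogue words out of 727 total words, compared with the 30% target from the scenario. The sketch below reproduces that arithmetic. The function names are hypothetical, and the mapping from a 4.55-point deviation to the 35.47% score is not shown on the page, so it is deliberately left out.

```python
def dialogue_ratio(words_dialogue, words_total):
    """Share of dialogue words in the response, in percent."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return 100.0 * words_dialogue / words_total


def deviation_from_target(ratio_pct, target_pct=30.0):
    """Absolute distance from the target ratio, in percentage points."""
    return abs(target_pct - ratio_pct)


ratio = dialogue_ratio(185, 727)          # wordsDialogue, wordsTotal
deviation = deviation_from_target(ratio)  # target taken from the scenario (30%)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
# prints "Ratio: 25.45%, Deviation: 4.55%"
```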