NC Bench
Run: 2026-03-06-01KK1H6A2N7VM7JNZKV5QNE1P5
Run Details
Model: GPT-5.4 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 22.5s
Cost: $0.012105
Provider: OpenAI
Evaluation summary
Score | Evaluator
0.00% | Matches word count
79.89% | Dialogue to Total Word Ratio
39.9437% | Overall
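The Overall figure looks consistent with an unweighted mean of the two evaluator scores. The page does not state how the overall score is aggregated, so the check below is only a sketch under that assumption, and the variable names are illustrative.

```python
# Evaluator scores as displayed in the summary, in percent.
scores = {
    "Matches word count": 0.00,
    "Dialogue to Total Word Ratio": 79.89,
}

# Assumption: the overall score is a plain, unweighted mean of the evaluators.
overall = sum(scores.values()) / len(scores)

# The displayed 79.89% is itself rounded, so only approximate agreement
# with the reported 39.9437% can be expected.
reported_overall = 39.9437
print(abs(overall - reported_overall) < 0.01)
```

If the aggregation were weighted, or if more evaluators contributed, this check would not hold; it only shows that a simple mean is compatible with the displayed numbers.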
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly and Bill crouched behind the drooping hedge of number 14, looking like two budget gargoyles posted outside a suburban palace of balloons. The house blazed with cheerful lights. Paper lanterns bobbed in the dusk, and from the back garden came the shriek of sugar-powered children ricocheting off a hired magician. Bill adjusted the black wool cap that made him resemble a burglar in a training video. “I still say this is morally complicated.” Molly, whose idea of camouflage was a raincoat the color of wet gravel, peered over the hedge with professional contempt. “Bill, we are not stealing from children. We are intercepting gifts in transition. Like luggage, but louder.” A clown-shaped piñata hung from the pergola, turning gently in the breeze as if it too distrusted the evening. Near the patio doors stood a long table stacked with brightly wrapped boxes, each one wearing an extravagant bow like a tiny hostage. Bill swallowed. “There’s a bouncy castle. Criminal law gets weird around inflatables.” “Focus,” said Molly. “Entry through the side gate. Cross the herb patch. Avoid the terrier. Bag the presents. Out before cake.” “That dog hates me on sight.” “That dog hates everyone on scent. Equality is beautiful.” Inside, tiny voices chanted for the magician. A puff of glitter shot above the fence, followed by applause and one suspicious cry of, “My eyebrows!” Molly unfolded a hand-drawn plan from her pocket. It had been sketched on the back of a takeaway menu and included arrows, timings, and a doodle of Bill looking confused. “The parents are all in the kitchen discussing schools and pretending prosecco is hydration. The children are occupied. The loot is unattended for exactly seven minutes during the rabbit finale.” Bill studied the map. “Why have you labeled the conservatory ‘zone of aunties’?” “Because no thief alive survives a concentrated auntie formation.” He nodded solemnly. This tracked with experience. 
Last Christmas he had lost a scarf and most of his confidence to an alert woman named Denise. Through the hedge they saw a boy in a paper crown sprint across the lawn, pursued by three smaller children and a girl dressed as a fairy who was carrying a traffic cone for reasons known only to the young. Bill sighed. “What if the presents are educational?” “Then we’re doing them a favor.” “What if one of them is a telescope? I’ve always wanted a telescope.” “Bill, you cannot start emotionally adopting stolen merchandise before we’ve stolen it.” The side gate clicked in the breeze. Molly’s eyes sharpened. This was the look she got before lockpicking, chess hustles, and once, a very successful church raffle. “Ready?” Bill flexed his fingers. “As I’ll ever be.” They slipped through the gate and into a jungle of lavender and overturned plastic cups. The air smelled of barbecue smoke, trampled grass, and that sticky, lawless sweetness unique to children’s parties. From somewhere ahead came the magician’s booming patter. “For my final trick,” he cried, “I need absolute silence!” The garden immediately became louder. Molly glided to the present table and flicked open her holdall. “Quickly.” Bill began scooping up parcels. “This one rattles.” “Take it.” “This one meows.” She paused. “Leave that one.” A shadow fell across them. Bill froze, clutching a dinosaur-wrapped box to his chest. Slowly, they looked up. A little girl in fairy wings stared at them, holding the traffic cone like a ceremonial spear. Molly smiled with dazzling calm. “Good evening. We’re from Present Security.” The girl narrowed her eyes. “That’s not a real thing.” Bill, to his own surprise, found his voice. “It is tonight.”
created_at: "2026-03-06T12:15:38.589Z" (3/6/2026, 12:15:38 PM)
model: "openai/gpt-5.4-20260305"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 404
moderation_latency: (null)
generation_time: 22461
tokens_prompt: 77
tokens_completion: 1019
native_tokens_prompt: 66
native_tokens_completion: 796
native_tokens_completion_images: 0
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.012105
router: (null)
provider_responses:
  0:
    endpoint_id: "9ff5625c-403f-4d7f-b895-58ac7295062c"
    id: "resp_032aee5abf5399960169aac56aa3048196afa97cdd33b3ac3b"
    is_byok: false
    latency: 145
    model_permaslug: "openai/gpt-5.4-20260305"
    provider_name: "OpenAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1772799338-9VRFCttZIj77ug1br6EN"
upstream_id: "resp_032aee5abf5399960169aac56aa3048196afa97cdd33b3ac3b"
total_cost: 0.012105
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
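The metadata above is enough to estimate generation throughput. The sketch below assumes generation_time is in milliseconds (22461 lines up with the 22.5s duration in the run details) and that native_tokens_completion is the provider-side output token count; neither unit is stated on the page.

```python
# Values copied from the run metadata.
generation_time_ms = 22461       # generation_time, assumed to be milliseconds
native_completion_tokens = 796   # native_tokens_completion

# Convert to seconds and derive output tokens per second.
generation_seconds = generation_time_ms / 1000
tokens_per_second = native_completion_tokens / generation_seconds

print(f"{generation_seconds:.1f}s, {tokens_per_second:.2f} tokens/s")
```

Using tokens_completion (1019) instead of the native count would give a different rate, so any comparison across runs should fix one of the two fields.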
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false; words: 600
79.89% | Dialogue to Total Word Ratio | Ratio: 32.12%, Deviation: 2.12% | neededClean: false; wordsTotal: 604; wordsDialogue: 194
39.9437% | Overall | |
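The reported ratio and deviation can be reproduced from wordsDialogue and wordsTotal against the 30% target in the scenario. The sketch below does that, and adds a hypothetical count_dialogue_words helper that treats text inside curly quotes as dialogue. The evaluator's actual counting rules and the formula that maps the deviation to the 79.89% score are not shown on the page, so neither is reproduced here.

```python
import re

TARGET_RATIO = 0.30  # from the scenario: 30% of words should be dialogue


def count_dialogue_words(text: str) -> tuple[int, int]:
    """Hypothetical counter, not the evaluator's own logic.

    Returns (dialogue_words, total_words), where words are
    whitespace-separated tokens and dialogue is any text between
    a curly opening quote and the next curly closing quote.
    """
    quoted_spans = re.findall(r"“([^”]*)”", text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    total_words = len(text.split())
    return dialogue_words, total_words


# Counts reported in the evaluation metadata.
words_dialogue = 194
words_total = 604

ratio = words_dialogue / words_total
deviation = abs(ratio - TARGET_RATIO)

print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 32.12%, Deviation: 2.12%"
```

For example, count_dialogue_words('Bill swallowed. “Focus,” said Molly.') returns (1, 5) under these rules. A different tokenizer would shift the counts slightly, which may be why the two evaluators report 600 and 604 total words for the same response.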