NC Bench
Run: 2026-03-09-01KK9PX865DK7R2AVK5HW7FW33
Run Details
Model: GPT-5.4 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 30.9s
Cost: $0.016365
Provider: OpenAI
Evaluation summary
Score     Evaluator
0.00%     Matches word count
0.00%     Dialogue to Total Word Ratio
0.0000%   Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly studied the house from behind a hedge clipped into the shape of an elephant, though the elephant looked alarmed, as if it too had heard there would be twenty-seven children inside and no adult supervision worth trusting. Balloons bobbed at the gate. A banner drooped across the porch, declaring HAPPY SEVENTH BIRTHDAY TO MAX in letters so cheerful they felt sarcastic. Somewhere beyond the windows, a man in a pirate hat was losing control of a game involving beanbags and screaming. Bill crouched beside her, wearing black gloves, a dark coat, and a paper party crown he had found in the gutter and insisted was “blending in.” “This,” Molly said, “is either our finest hour or a new low in a career already shaped like a cellar.” Bill adjusted the crown. “I prefer to think of it as a retail redistribution event with cake.” She opened her satchel and consulted a notebook filled with sketches, timings, and one angry doodle of a clown. Molly liked plans. Plans had labels. Plans did not improvise. Bill, by contrast, considered improvisation a lifestyle and instructions a form of pessimism. “Observe,” she murmured. “The presents are stacked under the dining room window. Bright paper, ribbons, at least three suspiciously expensive boxes. The children are occupied. The adults are distracted. One magician has clearly lost the room.” Bill peeked over the hedge. “I respect the ambition. Robbing a children’s party takes nerve.” “It takes precision.” “It takes a heart two sizes too small.” “Yours is tiny enough for both of us.” From inside came the shriek of a whistle, a crash, and a chorus of delighted little voices chanting something that sounded distressingly like “Again! Again!” Molly grimaced. Children, she felt, were essentially very short riots in formal shoes. Bill rubbed his hands together. “So what’s the approach? We mingle? I can mingle. I have a friendly face.” “You have a face people remember to warn others about.” He looked wounded. 
“That is deeply unfair and also, in several districts, accurate.” Molly tapped the notebook. “We go in as entertainers’ assistants. There’s already a pirate, a magician, and someone in a dragon costume sweating through the neck. Chaos is our accomplice.” Bill considered this. “Do I have to assist a magician?” “You have to carry a coil of rope and look incompetent. I promise this won’t strain you.” He gave her a thin smile. “And you?” “I’ll be a face painter.” Bill blinked. “You can paint faces?” “I can paint a convincing tiger and an even more convincing expression of authority.” A small boy ran past the front window wearing fairy wings and a mustache drawn in green marker. Molly pointed grimly. “That, for example, is the work of an amateur.” Bill leaned back against the hedge. “One concern.” “Only one? You’re growing.” “What do we do if the birthday child catches us with the loot?” Molly closed the notebook with a snap. “Then we congratulate him on his excellent taste, tell him the presents are being moved for dramatic effect, and leave before anyone asks questions.” Bill nodded solemnly. “And if there’s a bouncy castle?” “We avoid it.” “If there’s cake?” She hesitated, which Bill noticed at once. “Aha,” he said. “So we’re stealing presents and dessert.” “We are not stealing dessert.” “We’re borrowing joy.” “We are committing a limited, tasteful felony.” Bill rose, straightened his coat, and tilted the paper crown at a rakish angle. “Molly, when history remembers us, I hope it says we had standards.” Molly stood too, smoothing her sleeves, eyes fixed on the mountain of gift-wrapped temptation beyond the glass. “History,” she said, “will be lucky if it gets the ribbons right.” Then, with the solemnity of diplomats and the morals of raccoons, they strolled toward the gate, smiling as if they belonged there.
created_at: "2026-03-09T16:29:28.652Z" (3/9/2026, 4:29:28 PM)
model: "openai/gpt-5.4-20260305"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 10938
moderation_latency: (null)
generation_time: 30831
tokens_prompt: 77
tokens_completion: 1316
native_tokens_prompt: 66
native_tokens_completion: 1080
native_tokens_completion_images: 0
native_tokens_reasoning: 239
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.016365
router: (null)
provider_responses:
  0:
    endpoint_id: "9ff5625c-403f-4d7f-b895-58ac7295062c"
    id: "resp_0997213abbd85dfa0169aef568bab4819684a1fe96c51efe83"
    is_byok: false
    latency: 182
    model_permaslug: "openai/gpt-5.4-20260305"
    provider_name: "OpenAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1773073768-0CBUaZ6oMioSJDhaMv0K"
upstream_id: "resp_0997213abbd85dfa0169aef568bab4819684a1fe96c51efe83"
total_cost: 0.016365
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
Evaluation details
Result    Evaluator                      Details                            Meta Data
0.00%     Matches word count             n/a                                neededClean: false, words: 636
0.00%     Dialogue to Total Word Ratio   Ratio: 49.37%, Deviation: 19.37%   neededClean: false, wordsTotal: 638, wordsDialogue: 315
0.0000%   Overall
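The ratio and deviation reported in the evaluation details can be reproduced from the word counts listed there. The sketch below is only an illustration of that arithmetic: the function names are hypothetical, and treating the deviation as the absolute difference from the 30% target in the scenario is an inference from the reported numbers (49.37% against 30%), not a documented rule. How the evaluators turn a deviation into a 0.00% score is not shown on this page, so the sketch does not attempt it.

```python
# Minimal sketch, assuming the deviation is |actual - target| in percentage points.
# All function names are hypothetical; they are not part of NC Bench.

def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Return dialogue words as a percentage of all words."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return 100.0 * words_dialogue / words_total


def deviation_from_target(actual_pct: float, target_pct: float) -> float:
    """Return the absolute gap between actual and target, in percentage points."""
    return abs(actual_pct - target_pct)


def word_count_overshoot(words: int, target_words: int) -> float:
    """Return how far the word count is from the target, as a percentage of the target."""
    if target_words <= 0:
        raise ValueError("target_words must be positive")
    return 100.0 * (words - target_words) / target_words


# Values taken from the evaluation details: wordsDialogue 315, wordsTotal 638.
ratio = dialogue_ratio(315, 638)
deviation = deviation_from_target(ratio, 30.0)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
# prints "Ratio: 49.37%, Deviation: 19.37%"

# Values from the word-count evaluator: 636 words against the 500-word target.
print(f"Word count overshoot: {word_count_overshoot(636, 500):.1f}%")
# prints "Word count overshoot: 27.2%"
```

The two evaluators report slightly different totals (636 and 638 words), which suggests they tokenize the response differently; the sketch takes each count as given rather than recounting the text.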