NC Bench
Run: 2026-03-06-01KK1H4ASJVEXPVDD58MHKZY86
Run Details
Model: GPT-5.4
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 26.4s
Cost: $0.013860
Provider: OpenAI
Evaluation summary
Matches word count: 0.00%
Dialogue to Total Word Ratio: 7.51%
Overall: 3.7552%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
“Tell me again,” said Molly, squinting over the rim of a chipped teacup, “why we are robbing a birthday party instead of the mayor’s safe, the museum vault, or literally any location not decorated with a papier-mâché dragon?” Bill spread a crayon-scribbled map across the café table. “Because, my doubting nightingale, the dragon is a distraction. Everyone looks at the dragon. Nobody looks at the presents.” “The presents,” Molly repeated, “which are wrapped in shiny paper and guarded by six-year-olds with no impulse control and sticky hands.” “Exactly. Amateurs. Chaotic. Unpredictable. The perfect cover.” “You say that like jam fingerprints are an asset.” “They are if they’re not ours.” Molly leaned in. “Start from the top.” Bill tapped the map. “The party begins at two. By two-fifteen, there’s a magician.” “A bad one?” “The best kind. Balloon animals, card tricks, dramatic shouting. All eyes front. At two-twenty, the cake emerges.” “Ah,” said Molly. “The ceremonial lowering of vigilance.” “Correct. Adults get weepy over candles. Children become tiny arson enthusiasts. In the confusion, we enter through the side gate dressed as entertainers.” “Entertainers?” Bill grinned. “I’ve acquired costumes.” “If you say clown, I leave.” “I’m insulted. Clowns are chaos without discipline. No, you’ll be a princess.” Molly stared at him. “I beg your pardon.” “And I,” Bill said with dignity, “shall be a wizard.” “You look like a man who loses arguments with umbrellas.” “A wizard is mostly confidence and a sleeve.” Molly pinched the bridge of her nose. “Fine. We get in. Then what?” “Gift table by the garden wall. We load the presents into the big toy chest near the bouncy castle.” “The toy chest that belongs to the party?” “Yes, but only temporarily. We wheel it out to the van.” “Won’t anyone notice a toy chest full of presents rolling away?” Bill lifted one finger. “Not if I shout, ‘Stand back, emergency dragon procedure.’” Molly was silent a moment. 
“There’s no emergency dragon procedure.” “Which is why it commands respect.” A waiter passed, eyeing the map, which included a stick figure wearing a crown and several arrows labeled CAKE PANIC. Molly flipped it over. “Bill, children are observant in all the wrong ways. One of them will ask why the princess has lock picks.” “I thought we’d call them magic hairpins.” “One will ask why the wizard smells like engine grease.” “That,” Bill said, “is potion residue.” She sighed. “And suppose the parents intervene?” “We charm them.” “You can’t charm suburban fathers.” “Then we confuse them. Same result, slower route.” Molly looked out the window, where rain drizzled over the street in thin silver lines. “I used to have standards.” “You still do. They’re just more portable.” “What’s in these presents, anyway?” Bill’s eyes gleamed. “Remote-control cars. A games console. Possibly jewelry from inattentive aunts. At minimum, thirty percent of the haul will be batteries.” “That is the least romantic sentence you’ve ever said.” “It’s a practical crime, Molly.” “It’s a ridiculous crime.” “The best crimes are. Serious crimes get caught by serious people.” “And ridiculous crimes?” “Slip through because nobody wants to admit they were outwitted by a wizard and a princess stealing birthday loot.” Molly considered that, against her better judgment. “What if the child opens everything early?” “Then we improvise.” “With what?” “With my second plan.” “You have a second plan?” Bill folded his hands. “We win the pass-the-parcel and take the grand prize.” “That’s not a heist. That’s participation.” “Participation with intent.” Molly laughed despite herself, low and dangerous. “All right. I’ll be the princess. But if you make me wear a tiara, I’m taking your share.” Bill reached under the table and produced a glittering crown. “Counteroffer: matching tiaras.” She stared at it, then at him. 
“You know,” she said, smiling in a way that usually preceded felony, “emergency dragon procedure might just work.”
created_at: "2026-03-06T12:14:33.785Z" (3/6/2026, 12:14:33 PM)
model: "openai/gpt-5.4-20260305"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 481
moderation_latency: (null)
generation_time: 26298
tokens_prompt: 77
tokens_completion: 1208
native_tokens_prompt: 66
native_tokens_completion: 913
native_tokens_completion_images: 0
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.01386
router: (null)
provider_responses:
  0:
    endpoint_id: "9ff5625c-403f-4d7f-b895-58ac7295062c"
    id: "resp_029964cc205f722f0169aac529d8c8819486f301ed9151fc25"
    is_byok: false
    latency: 183
    model_permaslug: "openai/gpt-5.4-20260305"
    provider_name: "OpenAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1772799273-BYHPX9kus94sv42dGPrE"
upstream_id: "resp_029964cc205f722f0169aac529d8c8819486f301ed9151fc25"
total_cost: 0.01386
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
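The metadata fields above can be combined into a couple of derived figures. The following is a minimal sketch, assuming that generation_time is in milliseconds and that the native token counts are the ones relevant to throughput and cost; neither assumption is stated on the page, and the variable names are illustrative only.

```python
# Values copied from the run metadata above.
generation_time_ms = 26298        # assumed to be milliseconds
native_tokens_prompt = 66
native_tokens_completion = 913
total_cost = 0.01386              # USD, as reported

# Completion throughput: native completion tokens per second of generation.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

# Blended cost per 1,000 native tokens (prompt and completion together).
native_tokens_total = native_tokens_prompt + native_tokens_completion
cost_per_1k_tokens = total_cost / native_tokens_total * 1000

print(f"throughput: {tokens_per_second:.2f} tokens/s")
print(f"blended cost: ${cost_per_1k_tokens:.5f} per 1k native tokens")
```

The blended cost mixes prompt and completion tokens, so it is only a rough comparison figure across runs, not a per-token price.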
Evaluation details
Matches word count: 0.00%
  Details: n/a
  neededClean: false
  words: 641
Dialogue to Total Word Ratio: 7.51%
  Details: Ratio: 77.20%, Deviation: 7.20%
  neededClean: false
  wordsTotal: 649
  wordsDialogue: 501
Overall: 3.7552%
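The reported ratio and deviation follow directly from the word counts above: 501 dialogue words out of 649 total, measured against the 70% target in the scenario. The sketch below reproduces that arithmetic. The count_words helper is a hypothetical illustration of how dialogue words might be counted (whitespace-separated words inside curly quotes); it is not the benchmark's actual evaluator, whose counting rules are not shown on the page. The overall score appears consistent with a plain average of the two evaluator scores, but that is an inference from the numbers rather than something the page states.

```python
import re

TARGET_RATIO = 0.70  # from the scenario: "70% dialogue"

def count_words(text):
    """Hypothetical counter: returns (dialogue_words, total_words).

    Words are whitespace-separated tokens; dialogue is any span
    enclosed in curly double quotes. The real evaluator may differ.
    """
    total = len(text.split())
    spans = re.findall(r"“[^”]*”", text)
    dialogue = sum(len(span.split()) for span in spans)
    return dialogue, total

# Reproduce the reported figures from the evaluator's own counts.
words_dialogue, words_total = 501, 649
ratio = words_dialogue / words_total
deviation = abs(ratio - TARGET_RATIO)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# Assumption: overall is the mean of the two evaluator scores (in percent).
overall = (0.00 + 7.51) / 2
print(f"Overall (from rounded scores): {overall:.3f}%")

# Small usage example for the hypothetical counter.
sample = "“Start from the top,” said Molly. “Fine.”"
print(count_words(sample))
```

The two evaluators report different totals (641 and 649 words), which suggests they tokenize the response differently; a whitespace split like the one above is only one of several plausible conventions.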