NC Bench
Run: 2026-04-24-01KQ0N78TSF82V7JF59GXQM067
Run Details
Model: GPT-5.5
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 26.7s
Cost: $0.030690
Provider: OpenAI
Evaluation summary
Score    Evaluator
0.00%    Matches word count
0.57%    Dialogue to Total Word Ratio
0.2828%  Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
AI
Molly unfolded the napkin-map across the café table with the solemnity of a general invading a bouncy castle. “Target is 14 Sprinkles Lane,” she said. “Time: Saturday, fourteen hundred hours. Occasion: six-year-old’s birthday. Presents: approximately thirty-seven, plus one suspiciously large box shaped like a pony.” Bill squinted. “You think there’s a pony in the box?” “I think there’s opportunity in the box.” “Molly, we’re thieves, not monsters. You want to steal birthday presents from children?” “Temporarily relocate,” Molly said. “We’ll ransom them back for cake.” “That is worse.” “It is innovative.” Bill pushed his tea away. “No. Absolutely not. I draw the line at party hats.” “You wore a party hat at the bank job.” “That was a disguise.” “It said ‘Birthday Princess.’” “And nobody suspected me.” Molly tapped the napkin. “Listen. We infiltrate through the side gate. You distract the entertainer.” “What entertainer?” “Captain Giggles.” Bill went pale. “A clown?” “A budget clown.” “That’s worse. A rich clown has standards. A budget clown has nothing to lose.” “You’re not fighting him. You’re distracting him.” “With what?” “Balloon animal challenge.” “I can only make a snake.” “That’s perfect. Make fifty snakes. He’ll be threatened artistically.” Bill leaned back. “And while I’m locked in mortal balloon combat, you sneak inside and steal from toddlers?” “They’re six. Practically adults.” “Molly.” “What?” “No.” She sighed. “Fine. We don’t steal all the presents.” “Good.” “We steal the best presents.” “No.” “The pony box only?” “No.” “The cake?” Bill paused. “What kind of cake?” “Chocolate volcano with raspberry lava.” He closed his eyes. “You should have led with that.” “So we steal the cake.” “We borrow a slice.” “We liberate several slices.” “Molly.” “All right. We attend as guests.” “We weren’t invited.” “We wear name tags.” “That’s not how invitations work.” “It’s how networking works.” Bill rubbed his temples. 
“Why are we even doing this?” Molly lowered her voice. “Because inside that house is the prize.” “The pony box?” “The Silver Squirrel.” Bill blinked. “The antique figurine?” “Hidden among the gifts. The father collects novelty trophies. Our client says it’ll be wrapped in dinosaur paper.” “So we’re not stealing toys?” “We’re stealing one ugly squirrel.” “From a child’s party.” “From a mantelpiece, technically. Wrapped by mistake, allegedly.” Bill considered this. “Less evil.” “Practically noble.” “Still involves Captain Giggles.” “Every profession has hazards.” At the next table, a toddler stared at them while chewing a biscuit into paste. Bill whispered, “We’re being watched.” Molly smiled sweetly at the child. “Afternoon, civilian.” The toddler raised one damp fist and said, “Mine.” Bill shuddered. “They know.” Molly folded the napkin. “Then we move quickly. We arrive with a gift.” “What gift?” “Empty box, unicorn paper. Classic cover.” “That’s sad.” “It’s tactical.” “We put an actual gift in it.” “Bill, that defeats the—” “A nice one. Blocks or something.” Molly studied him. “You want to bring a present to the party we’re burgling?” “Yes. Manners cost nothing.” “They cost blocks.” “I’m buying blocks.” She groaned. “Fine. We bring blocks. We find dinosaur paper. We swap the squirrel with the blocks.” “That leaves the child with blocks instead of a silver squirrel.” “An improvement by any measure.” “And we do not touch the other presents.” “Not even the pony box?” “No.” “Can we peek?” “No.” “Shake it?” “No.” “Ask it questions?” Bill stood, grabbing his coat. “We are going to a toy shop, then to a children’s party, then we are stealing one squirrel and leaving.” Molly followed, eyes bright. “And cake?” “One slice each.” “Two if Captain Giggles cries?” “If Captain Giggles cries, we run.” Outside, rain glittered on the pavement like spilled jewels nobody had bothered to fence. Molly linked her arm through Bill’s. 
“You know, for a thief, you’re very ethical.” “And for a mastermind, you’re alarmingly interested in ponies.” “That box is too large, Bill.” “It is not a pony.” “It could be a small pony.” “It’s not.” “Miniature horse?” “No.” “Emotional support donkey?” Bill stopped walking. “If there is a donkey in that box, we steal the donkey before the children can name it.” Molly grinned. “Now you’re thinking like a professional.”
created_at
"2026-04-24T21:09:37.251Z"
4/24/2026, 9:09:37 PM
model
"openai/gpt-5.5-20260423"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
510
moderation_latency
(null)
generation_time
26673
tokens_prompt
77
tokens_completion
1450
native_tokens_prompt
66
native_tokens_completion
1012
native_tokens_completion_images
0
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.03069
router
(null)
provider_responses
0
endpoint_id
"58e5b336-423e-430b-a2ab-8bc353f0c51b"
id
"resp_07b2801bc2c510c40169ebdc1151808190aabdc745b70407bc"
is_byok
false
latency
219
model_permaslug
"openai/gpt-5.5-20260423"
provider_name
"OpenAI"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1777064977-mghPMa4QnA4L37M84M0z"
session_id
(null)
api_type
"completions"
id
"gen-1777064977-nT1aH31MFSttssQhabSb"
upstream_id
"resp_07b2801bc2c510c40169ebdc1151808190aabdc745b70407bc"
total_cost
0.03069
cache_discount
(null)
upstream_inference_cost
0
provider_name
"OpenAI"
response_cache_source_id
(null)
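The Duration and Cost shown under Run Details appear to be derived from the metadata fields above. The following is a minimal sketch of that relationship, assuming `generation_time` is in milliseconds and `usage` is in US dollars (neither unit is stated on the page). The `summarize` helper and the tokens-per-second figure are illustrative additions, not part of NC Bench.

```python
# Values copied from the metadata above.
meta = {
    "generation_time": 26673,          # assumed to be milliseconds
    "native_tokens_completion": 1012,
    "usage": 0.03069,                  # assumed to be US dollars
}


def summarize(meta: dict) -> dict:
    """Derive display values from raw run metadata (hypothetical helper)."""
    seconds = meta["generation_time"] / 1000.0
    return {
        "duration_s": round(seconds, 1),
        "cost_usd": f"${meta['usage']:.6f}",
        # Illustrative throughput: completion tokens per second of generation.
        "completion_tokens_per_s": meta["native_tokens_completion"] / seconds,
    }


summary = summarize(meta)
print(summary["duration_s"], summary["cost_usd"])
```

Under these assumptions the derived duration and cost agree with the 26.7s and $0.030690 reported in Run Details.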
Evaluation details
Result   Evaluator                     Details                           Meta Data
0.00%    Matches word count            n/a                               neededClean: false, words: 687
0.57%    Dialogue to Total Word Ratio  Ratio: 80.17%, Deviation: 10.17%  neededClean: false, wordsTotal: 691, wordsDialogue: 554
0.2828%  Overall
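The Ratio and Deviation reported for the Dialogue to Total Word Ratio evaluator can be reproduced from `wordsDialogue` and `wordsTotal`. The sketch below assumes the deviation is the absolute difference from the 70% target stated in the scenario; the function names are hypothetical, and how the evaluator turns the deviation into its 0.57% score is not shown on the page, so it is not modeled here.

```python
def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Return the share of dialogue words as a percentage."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return 100.0 * words_dialogue / words_total


def deviation_from_target(ratio_pct: float, target_pct: float) -> float:
    """Absolute distance from the target ratio, in percentage points (assumed)."""
    return abs(ratio_pct - target_pct)


# Values copied from the evaluation metadata; 70% is the scenario's target.
ratio = dialogue_ratio(words_dialogue=554, words_total=691)
deviation = deviation_from_target(ratio, target_pct=70.0)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
```

With the recorded counts this yields the same rounded figures as the Details column. Note that the two evaluators report different totals (687 and 691 words), which suggests they count words slightly differently.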