NC Bench
Run: 2026-02-16-01KHKA6PT4FAPT8DMMCGRJG9FA
Run Details
Model: GPT-5.2
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 29.7s
Cost: $0.016397
Provider: OpenAI
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly unfolded the crinkled flyer on the hood of Bill’s car like it was a sacred map. Glitter stuck to her thumb. The minivan across the street pulsed with a balloon bouquet that looked armed and ready. “This is our score?” Bill said. “A birthday party?” “It’s not a birthday party,” Molly said. “It’s a children’s party. There’s a difference.” Bill squinted at the flyer. “It says ‘Harper turns six.’ That’s a birthday.” Molly jabbed a finger at the fine print. “Right there. ‘No gifts necessary, but they are appreciated.’ That’s code.” “Code for what?” “Code for ‘There will be gifts.’” Bill leaned back, chewing on an imaginary toothpick he hadn’t had since 2019. “I prefer adults. Adults have wallets.” “Adults have suspicion,” Molly said. “Kids have piñatas. Kids have distraction. Kids have their faces painted into tigers. We walk in with a sheet cake and a confident stride and we’re basically invisible.” Bill looked through the windshield at the yard. A banner sagged between two trees: HARPERSAURUS. “Confident stride,” he repeated. “I’m wearing soft-sole shoes. I can’t stride. I glide like a nervous manatee.” “Then glide,” Molly said. “Step one: infiltration. Step two: acquisition. Step three: exfiltration.” Bill nodded. “And step four: do we feel bad?” Molly folded the flyer with surgical precision. “No. We’re not stealing from children. We’re stealing from a system that has normalized gift registries for people who can’t read.” Bill pointed. “There’s a balloon arch. That balloon arch has a bouncer vibe.” “That’s decor,” Molly said. “Balloons are nature’s least intimidating security measure. Unless you’re afraid of clowns. Are you afraid of clowns?” “I’m afraid of anything that honks,” Bill said. “Geese. Cars. Party noses. My own laugh.” Molly slipped a canvas tote from the backseat and unzipped it. Inside were two clip-on bow ties, a roll of wrapping paper, and a laminated card that read PARTY HELPER in bubble letters. Bill stared. 
“You made badges?” “I didn’t make badges,” Molly said. “I made credentials. We are professional party logistics.” “We are thieves,” Bill said. “We contain multitudes,” Molly replied. Bill took the bow tie and held it up like it might bite him. “What’s my role?” “You’re the ‘uncle.’” Bill blinked. “Which uncle?” “Any uncle. The nice one. The one who doesn’t ask questions and always offers to carry things. People love an uncle who carries things.” Bill practiced a smile that looked like he’d just been told about taxes. “Like this?” “Like you’re happy to be alive,” Molly said. “I am not,” Bill said, then tried again. “There. Alive-ish.” Molly peered toward the front door. A woman in an apron hauled a tray of cupcakes outside, the frosting piled high like pastel crime. “Okay. Presents will be near the table. We go in when the games start. Sack race, musical chairs—chaos window.” Bill tapped the steering wheel. “And the exit?” “Side gate,” Molly said. “We load into the tote. We don’t take the big ones.” “Why not the big ones?” Bill asked. “Because big ones are for amateurs and bodybuilders,” Molly said. “We take high value, low volume. Gift cards. Tiny electronics. Anything that sounds expensive when you shake it.” Bill frowned. “What if it’s a pet?” “Then we don’t steal it,” Molly said. “We have standards.” Bill sighed. “How do we explain being there?” Molly leaned closer, her voice dropping like a curtain. “You say, ‘I’m here to help.’ No one argues with help. Help is a magic word. Especially at a children’s party. Parents are one spilled juice box away from joining a monastery.” Bill considered that. “What about the kids?” “We avoid eye contact,” Molly said. “Kids can smell fear and dishonesty.” “I can smell dishonesty,” Bill said. “That’s because you’re dishonest,” Molly replied. Bill adjusted the bow tie with trembling fingers. “Okay. In, out, presents, no pets.” Molly tucked the credentials into her pocket and opened the car door. 
“Also: if you see a clown, you take point.” Bill froze. “Excuse me?” Molly grinned. “You said you were afraid. Fear makes you fast.” Bill groaned. “If I die at Harper’s sixth birthday, I want it on my tombstone.” “What, ‘Loved by none, defeated by balloons’?” Molly asked. Bill swallowed. “Just promise me we won’t steal anything with glitter.” Molly shut the car door softly. “Bill, we’re already covered in glitter. That’s the real heist.”
created_at
"2026-02-16T13:28:57.92892+00:00"
2/16/2026, 1:28:57 PM
model
"openai/gpt-5.2-20251211"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
3050
moderation_latency
(null)
generation_time
29616
tokens_prompt
77
tokens_completion
1421
native_tokens_prompt
66
native_tokens_completion
1163
native_tokens_completion_images
0
native_tokens_reasoning
102
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.0163975
router
(null)
provider_responses
0
id
"resp_044f6a68517f76310169931b7bd25c8197aeeaefd8daa5e746"
status
200
is_byok
false
latency
155
endpoint_id
"f00142c2-6a93-49ce-9e36-5593b904ce3b"
provider_name
"OpenAI"
model_permaslug
"openai/gpt-5.2-20251211"
api_type
"completions"
id
"gen-1771248507-Jgl0IbDYzRIeG1NvQxPK"
upstream_id
"resp_044f6a68517f76310169931b7bd25c8197aeeaefd8daa5e746"
total_cost
0.0163975
cache_discount
(null)
upstream_inference_cost
0
provider_name
"OpenAI"
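The metadata above can be turned into a couple of derived figures. The following is a minimal sketch, not part of the benchmark itself: it assumes that `generation_time` is in milliseconds (which is consistent with the reported duration of 29.7s) and that `total_cost` is in US dollars. The helper names `per_second` and `cost_per_1k_tokens` are hypothetical.

```python
# Values copied from the run metadata above.
generation_time_ms = 29616        # assumed to be milliseconds
native_tokens_prompt = 66
native_tokens_completion = 1163
total_cost_usd = 0.0163975        # assumed to be US dollars


def per_second(count: int, duration_ms: int) -> float:
    """Return a rate per second given a count and a duration in milliseconds."""
    return count / (duration_ms / 1000.0)


def cost_per_1k_tokens(cost: float, tokens: int) -> float:
    """Return a blended cost per 1,000 tokens (prompt and completion combined)."""
    return cost / tokens * 1000.0


tokens_per_second = per_second(native_tokens_completion, generation_time_ms)
blended_cost = cost_per_1k_tokens(
    total_cost_usd, native_tokens_prompt + native_tokens_completion
)

print(f"Completion throughput: {tokens_per_second:.2f} tokens/s")
print(f"Blended cost: ${blended_cost:.5f} per 1k native tokens")
```

The blended cost figure ignores any difference between prompt and completion pricing, so it is only a rough per-run summary.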
Evaluation details
Result | Evaluator | Details | Meta Data
0.0000% | Matches word count | n/a | neededClean: false; words: 730
0.0592% | Dialogue to Total Word Ratio | Ratio: 62.19%, Deviation: 12.19% | neededClean: false; wordsTotal: 730; wordsDialogue: 454
0.0296%
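The ratio and deviation reported for the dialogue evaluator follow directly from the word counts in its metadata. The sketch below reproduces that arithmetic, assuming the 50% target and the 500-word target from the scenario. It also includes one possible way to count dialogue words, by collecting text between curly quotation marks; this is an assumption for illustration, and the evaluator's actual counting method is not shown in this report. All function names are hypothetical.

```python
import re


def dialogue_ratio(words_dialogue: int, words_total: int, target: float = 0.5):
    """Return (ratio, absolute deviation from the target ratio)."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


def word_count_overshoot(words: int, target: int = 500) -> float:
    """Return how far the word count is from the target, as a fraction of the target."""
    return (words - target) / target


def count_dialogue_words(text: str):
    """Count (dialogue words, total words), treating text inside curly quotes as dialogue.

    Assumption: this is an illustrative counting rule, not the evaluator's own.
    """
    quoted = re.findall(r"“([^”]*)”", text)
    words_dialogue = sum(len(span.split()) for span in quoted)
    return words_dialogue, len(text.split())


# Figures taken from the evaluation metadata above.
ratio, deviation = dialogue_ratio(454, 730)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
print(f"Word count overshoot: {word_count_overshoot(730):.0%}")

# A short excerpt from the response, to show the counting rule on real text.
sample = "Bill nodded. “And step four: do we feel bad?”"
print(count_dialogue_words(sample))
```

With 730 words against a 500-word target, the response overshoots by 46%, which is consistent with the 0.0000% result for the word count evaluator.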