NC Bench
Run: 2026-02-16-01KHK9ZMXGJ9X6TZ7GS3W3MKDR
Run Details
Model: GPT-5.2
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 31.0s
Cost: $0.017265
Provider: OpenAI
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly crouched behind the hedge like a tactical garden gnome, binoculars pressed to her eyes. Across the street, the backyard blazed with balloon arches and a bouncy castle that looked like it had been inflated by pure sugar.
Bill shuffled up beside her, wearing a black beanie and, inexplicably, a party hat on top of it. He held a clipboard like it was an emotional support animal.
“We are not doing this,” Molly whispered. “Tell me you didn’t bring the clipboard again.”
“It’s for professionalism,” Bill whispered back. “Criminals need admin. Also, the beanie is for stealth and the party hat is for—”
“Blending in,” Molly finished, deadpan. “Yes, because nothing says ‘normal adult’ like a burglary accessory under festive conical shame.”
Bill peeked through the hedge. “Look at the loot table. I count twelve gift bags, three big boxes, and one suspiciously lumpy present that screams ‘remote-controlled dinosaur.’”
“We’re here for quantity, not quality,” Molly said. “The big boxes go first. They’re probably bikes. Bikes are currency.”
Bill tapped his clipboard. “I made a list of entry vectors. Option A: hop the fence by the birdbath. Option B: pose as entertainers. Option C: pretend we’re the kid’s relatives with mild confusion.”
Molly lowered the binoculars. “Option D: you stop saying ‘entry vectors’ like we’re storming a tiny fortress.”
“It is a tiny fortress,” Bill argued. “They have a cotton-candy machine. That’s basically artillery.”
Molly watched a clown wobble past the patio. “Okay, entertainer angle is dead. I refuse to compete with a man who can make balloon poodles while maintaining eye contact.”
Bill nodded gravely. “He’s armed with whimsy. Unpredictable.”
Molly adjusted her gloves. “Fence. We go in, we scoop presents, we exit. Thirty seconds.”
“Thirty seconds?” Bill hissed. “Molly, this is a children’s party. They have the attention span of hummingbirds and the scream volume of air raid sirens.”
“That’s why we move fast,” Molly said. “Also, see the banner? ‘NO ADULTS IN THE GIFT ZONE.’ That’s basically an invitation.”
Bill squinted. “It’s probably for parents.”
“Bill,” Molly said, “we are not parents.”
“I could be,” Bill said, wounded. “I have paternal energy.”
“You have ‘lost at IKEA’ energy,” Molly replied. “Now, what are we wearing?”
Bill looked down at his black hoodie. “Black. Classic. Like ninjas.”
Molly glanced at the bright yard. “In a sea of primary colors, black reads ‘suspicious bat.’ We need camouflage. We need… clown-proofing.”
Bill’s eyes lit up. “We dress as parents!”
Molly grimaced. “What does a parent look like?”
“Stressed. Hydrated. Carrying a tote bag,” Bill said. “Also, sunglasses indoors, because they haven’t slept since 2019.”
Molly’s mouth twitched. “Fine. I’ll do ‘mysterious aunt.’ You do ‘dad who forgot the RSVP.’”
Bill nodded and produced two items from his backpack: a wrinkled tote bag that said LIVE LAUGH LASAGNA and a juice box.
“Why do you have a juice box?” Molly asked.
Bill held it up like a badge. “It’s social currency. If anyone questions us, I offer it and say, ‘We’re with the birthday kid.’ Boom. Instant access.”
Molly stared. “That’s… actually smart.”
Bill preened. “Thank you. Finally, recognition.”
Molly pointed to the gift table visible through the open gate. “Route: gate, table, tote bag, out. No detours.”
“What about a cupcake?” Bill asked softly.
“No,” Molly said.
“One cupcake,” Bill pleaded. “For morale.”
Molly leaned in. “Bill, our morale is stealing presents from toddlers. It doesn’t need frosting.”
Bill sighed. “Okay. But if I see a piñata, I’m not responsible for my choices.”
They rose from the hedge, attempting casual parent posture. Molly walked with purposeful confusion; Bill clutched the tote bag and juice box like he’d been born to make small talk.
“Remember,” Molly muttered, “if anyone asks who we are, we say we’re from… the school.”
Bill nodded. “Which school?”
Molly’s eyes narrowed. “The one with children, Bill. Just smile and say ‘Wow, they’re getting so big.’”
Bill took a breath. “I can do this.”
They approached the gate. A blast of laughter and bubble-machine foam drifted over them like a warning.
Bill whispered, “If we pull this off, we retire.”
Molly whispered back, “If we pull this off, we move. Far. Somewhere without balloon arches.”
Bill lifted the juice box. “For the record, I think we’re terrible people.”
Molly tightened her grip on the tote bag. “Agreed. Now go be a dad.”
created_at: "2026-02-16T13:25:07.88911+00:00" (2/16/2026, 1:25:07 PM)
model: "openai/gpt-5.2-20251211"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 3917
moderation_latency: (null)
generation_time: 30986
tokens_prompt: 77
tokens_completion: 1305
native_tokens_prompt: 66
native_tokens_completion: 1225
native_tokens_completion_images: 0
native_tokens_reasoning: 142
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.0172655
router: (null)
provider_responses:
  0:
    id: "resp_030f0c35a5212f290169931a94792c81959c50ed1368ab9ceb"
    status: 200
    is_byok: false
    latency: 175
    endpoint_id: "f00142c2-6a93-49ce-9e36-5593b904ce3b"
    provider_name: "OpenAI"
    model_permaslug: "openai/gpt-5.2-20251211"
api_type: "completions"
id: "gen-1771248276-OZdQkdzafnLkH2fxLWTF"
upstream_id: "resp_030f0c35a5212f290169931a94792c81959c50ed1368ab9ceb"
total_cost: 0.0172655
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
Evaluation details
Evaluator: Matches word count
Result: 0.0000%
Details: n/a
Meta Data: neededClean: false, words: 728
Evaluator: Dialogue to Total Word Ratio
Result: 0.1096%
Details: Ratio: 61.68%, Deviation: 11.68%
Meta Data: neededClean: false, wordsTotal: 728, wordsDialogue: 449
0.0548%
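The reported ratio and deviation can be reproduced from the word counts in the evaluation metadata. The following is a minimal sketch, not the evaluator's actual code: it assumes dialogue is counted as the words inside curly double quotes and that deviation is the absolute distance from the 50% target named in the scenario. The function names are hypothetical, and how the deviation maps to the Result percentages is not shown on this page.

```python
import re


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (assumed counting rule)."""
    return len(text.split())


def count_dialogue_words(text: str) -> int:
    """Count words inside curly double quotes (assumed definition of dialogue)."""
    spans = re.findall(r"“([^”]*)”", text)
    return sum(count_words(span) for span in spans)


def ratio_and_deviation(words_dialogue: int, words_total: int, target: float = 0.5):
    """Return the dialogue share and its absolute distance from the target share."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


# Word counts taken from the evaluation metadata of this run.
ratio, deviation = ratio_and_deviation(words_dialogue=449, words_total=728)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 61.68%, Deviation: 11.68%"
```

Passing the two counts rather than the raw text keeps the arithmetic check independent of whichever tokenization the evaluator really uses; `count_dialogue_words` is included only to illustrate one plausible counting rule.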