NC Bench
Run: 2026-02-16-01KHK9YCTCB4YT1078E55TDW9A
Run Details
Model: GPT-5.2
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 38.2s
Cost: $0.021256
Provider: OpenAI
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
“Tell me again why we’re robbing a children’s party,” Molly said, peering through the van’s windshield at the suburban backyard bursting with streamers.
“Because it’s the only place in town where people hand over valuables already wrapped,” Bill said. “It’s basically self-serve capitalism with frosting.”
“You’re calling a gift table ‘self-serve’ like it’s a salad bar.”
“It is. Except the croutons are toy robots and the lettuce is an iPad.”
Molly adjusted her beanie. “And you’re sure the parents won’t notice two adults carrying out an armful of presents?”
Bill tapped the clipboard on his lap. “That’s why we have the clipboard. Nobody questions a clipboard. A clipboard is like an invisibility cloak for idiots.”
“I’m the idiot?”
“No, the clipboard is. It distracts them. Look official, move briskly, say a word like ‘inventory,’ and people start nodding like they understand math.”
Molly watched a clown stumble past the patio doors. “There’s a clown. I hate clowns.”
“Clowns are our cover,” Bill said. “We walk in with the clown energy, walk out with the wrapped goods.”
“I don’t have clown energy. I have ‘woman at DMV’ energy.”
“Perfect. DMV energy is scarier than clowns. Parents will part like the Red Sea.”
Molly pulled a pink gift bag from the back seat and shook it. “Explain the decoy bag plan again. I don’t want to improvise in front of a piñata.”
“We go in as the Gift Relocation Specialists,” Bill said. “You carry this decoy bag. I carry the laundry hamper. We say, ‘Just consolidating the presents to avoid tripping hazards.’”
Molly squinted. “A hamper?”
“Trust me. A hamper says, ‘I’m helping,’ and also, ‘I’m leaving.’ It’s the Swiss Army basket.”
“What if someone asks who hired us?”
Bill lifted his chin. “We say, ‘Karen did.’ There’s always a Karen. If there isn’t, someone will volunteer to become one.”
Molly listened to shrieking laughter. “I can hear them doing that party game where you pin something on something.”
“Pin the Tail on the Donkey,” Bill said.
“Right. They’re already practicing being lied to while blindfolded. Great. It’s like training for adulthood.”
Bill checked his watch. “We have a twenty-minute window. Cake time. Adults are distracted by buttercream. Kids are distracted by sugar. It’s a society of tiny drunks.”
“And then,” Molly said, “we just… take everything.”
“Not everything,” Bill corrected. “We have standards. We leave the homemade stuff. Anything knitted, anything with a child’s name in glitter glue, and any gift that makes a noise when you tilt it.”
“Why the noisy ones?”
“Because last time, the talking dinosaur narrated my escape. ‘I’M A T-REX!’ Yeah, thanks, buddy, tell the neighbors.”
Molly leaned closer. “What about the big presents? Bikes?”
“We don’t take bikes,” Bill said. “Bikes are too visible. Also, I don’t want to be chased by a dad in cargo shorts with a righteous jog.”
Molly nodded. “Okay, hamper for small-to-medium loot. What’s our exit?”
Bill pointed at the side gate. “We say, ‘We’re moving the gifts to the car for safekeeping.’ Then we move them to our car. For keeping.”
“Safekeeping,” Molly repeated. “We are so technically honest it’s almost rude.”
Bill opened the van door a crack. “Remember: brisk steps, neutral face, clipboard confidence.”
Molly took a deep breath. “If anyone stops me, I’m saying I’m with the clown.”
Bill grimaced. “Don’t say you’re with the clown. That sounds like a personal crisis.”
“Fine,” Molly said. “I’ll say I’m with ‘present logistics.’”
“And if they ask for ID,” Bill said, “you show them the sticker sheet.”
Molly held up a sheet of gold stars. “This is your plan for security clearance?”
Bill smiled. “Gold stars have ended more arguments than the legal system.”
Molly tucked the stars into her pocket. “Alright. Operation Tiny Wrapping Paper is a go.”
Bill grabbed the hamper. “For the record, we’re monsters.”
“Yep,” Molly said, stepping out. “Efficient, clipboard-wielding monsters with excellent posture.”
created_at: "2026-02-16T13:24:34.050008+00:00" (2/16/2026, 1:24:34 PM)
model: "openai/gpt-5.2-20251211"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 5632
moderation_latency: (null)
generation_time: 38114
tokens_prompt: 77
tokens_completion: 1573
native_tokens_prompt: 66
native_tokens_completion: 1510
native_tokens_completion_images: 0
native_tokens_reasoning: 554
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.0212555
router: (null)
provider_responses:
  0:
    id: "resp_0265cd4991d9fdac0169931a6b76c48196a1627186470a4043"
    status: 200
    is_byok: false
    latency: 420
    endpoint_id: "f00142c2-6a93-49ce-9e36-5593b904ce3b"
    provider_name: "OpenAI"
    model_permaslug: "openai/gpt-5.2-20251211"
api_type: "completions"
id: "gen-1771248235-rIPQyrU8CINhb2CcySrP"
upstream_id: "resp_0265cd4991d9fdac0169931a6b76c48196a1627186470a4043"
total_cost: 0.0212555
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
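The reported cost can be sanity-checked against the native token counts. The sketch below is illustrative only: the two per-token rates are assumptions that do not appear anywhere in this run record, chosen because they reproduce the reported total, and the function name `estimate_cost` is hypothetical.

```python
# Assumed per-token rates in USD. These are NOT part of the run record;
# they are illustrative values that happen to reproduce the reported total.
ASSUMED_INPUT_USD_PER_TOKEN = 1.75e-6
ASSUMED_OUTPUT_USD_PER_TOKEN = 14e-6


def estimate_cost(native_tokens_prompt: int, native_tokens_completion: int) -> float:
    """Estimate request cost from native prompt and completion token counts."""
    return (
        native_tokens_prompt * ASSUMED_INPUT_USD_PER_TOKEN
        + native_tokens_completion * ASSUMED_OUTPUT_USD_PER_TOKEN
    )


# Token counts taken from the metadata above.
cost = estimate_cost(native_tokens_prompt=66, native_tokens_completion=1510)
print(f"{cost:.7f}")
```

Under those assumed rates the estimate agrees with the `usage` and `total_cost` values of 0.0212555, which suggests the cost is billed on the native token counts (66 and 1510) rather than on `tokens_prompt` and `tokens_completion`.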
Evaluation details
Result | Evaluator | Details | Meta Data
0.0000% | Matches word count | n/a | neededClean: false; words: 656
0.7606% | Dialogue to Total Word Ratio | Ratio: 79.88%, Deviation: 9.88% | neededClean: false; wordsTotal: 656; wordsDialogue: 524
0.3803%
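The ratio and deviation figures in the evaluation follow directly from the two word counts. The sketch below shows one plausible way to compute them, assuming dialogue means text inside curly double quotes and words are whitespace-separated tokens. The benchmark's actual tokenization, quote handling, and scoring formula are not shown on this page, so the function names `dialogue_ratio` and `deviation_from_target` are hypothetical and the result scores are not reproduced here.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue words, total words, ratio) for a passage.

    Assumption: dialogue is whatever sits between curly double quotes,
    and a word is any whitespace-separated token.
    """
    words_total = len(text.split())
    quoted_spans = re.findall(r"“([^”]*)”", text)
    words_dialogue = sum(len(span.split()) for span in quoted_spans)
    ratio = words_dialogue / words_total if words_total else 0.0
    return words_dialogue, words_total, ratio


def deviation_from_target(words_dialogue: int, words_total: int, target: float) -> float:
    """Absolute gap between the observed dialogue ratio and the target ratio."""
    return abs(words_dialogue / words_total - target)


# Counts reported by the evaluator, checked against the 70% target.
print(f"Ratio: {524 / 656:.2%}")
print(f"Deviation: {deviation_from_target(524, 656, 0.70):.2%}")
```

With the reported counts of 524 dialogue words out of 656, this gives a ratio of 79.88% and a deviation of 9.88 percentage points from the 70% target, matching the Details column. The 656-word total also exceeds the requested 500 words by about 31%.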