NC Bench
Run: 2026-02-16-01KHKAHNQ70NVF6WVZDGGVEWZ6
Run Details
Model: GPT-5.2
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 1m 15s
Cost: $0.060665
Provider: OpenAI
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
AI
Bill parked the battered ice-cream van across from the bungalow where balloons trembled like nervous jellyfish. Through the windshield Molly watched parents ferrying paper plates and children into the yard, a glittery banner screaming HAPPY EIGHTH, LUCY! The air smelled of hot dogs and sanitizer. “We’re really doing this,” Bill whispered, adjusting the fake moustache taped under his nose. “We’re not stealing from children,” Molly said. “We’re stealing from their optimistic relatives. Different demographic.” Bill squinted at the driveway. “There’s a clown.” “Clowns are just burglars with better marketing,” Molly replied, flipping open her notebook. It contained a sketch labeled PRESENT PILE, surrounded by arrows and a doodle of Bill falling into a kiddie pool. “Focus. We go in as entertainers, locate the stash, and exit with maximum loot and minimum emotional scarring.” Bill tugged at the zipper of his oversized mascot suit, which made him look like a depressed squirrel. “My tail keeps drooping. It’s unprofessional.” “Your whole face is unprofessional,” Molly said. “Remember the plan: we mingle, we charm, we acquire.” Bill cleared his throat. “What if someone asks for balloon animals?” “Then you make them,” Molly said. “We are adaptable criminals.” He stared at her. “I can’t even make a regular animal.” “Make a balloon regret. Kids love abstract art.” Bill winced at the shrieking in the yard. “Too many witnesses.” “All witnesses are short,” Molly said. “They have terrible sightlines. Plus, they’re fueled by cake. We’ll be invisible.” Bill opened the glove compartment and produced two party hats and a gift bag that said IT’S YOUR DAY! in cheerful font. “Props.” Molly nodded approvingly. “Good. Carry the bag like you mean it. Confidence is ninety percent of theft.” They climbed out, and the van door squealed like a guilty conscience. Molly smoothed her polka-dot dress—thrifted, festive, and strategically roomy. 
Bill waddled beside her, the squirrel head under one arm, sweating sincerity. At the gate, a woman with a clipboard smiled. “Entertainment?” Molly beamed. “Yes, ma’am. I’m Molly the Magnificent and this is Bill… the Nut-Based Hero.” Bill saluted with the squirrel paw. “Chitter.” They were waved in. Music thumped. A kid chased another kid with a foam sword, and the clown did a backflip. Molly leaned close. “Locate the presents.” Bill scanned. “By the table. Under the tarp with unicorns.” “Perfect,” Molly murmured. “We wait for the piñata. Everyone looks up. We look down. Society’s flaw.” Bill shifted the gift bag. “How many can we carry?” “All of them,” Molly said, watching a toddler toddle into the pile and emerge wearing a bow like a crown. “Greed is a muscle. We’re exercising.” Bill hesitated. “This feels… mean.” Molly softened, just a fraction. “Bill, we’re professionals. We take the goods, not the joy. They’ll still have cake, and trauma builds character.” The father shouted, “Piñata time!” Children surged toward the tree. Adults lifted phones. Molly’s eyes sharpened. “Showtime.” Bill swallowed. “If we get caught, I’m blaming the squirrel.” “You are the squirrel,” Molly said, grabbing the tarp edge. “Now smile. We’re making memories.”
created_at: "2026-02-16T13:35:43.150261+00:00" (2/16/2026, 1:35:43 PM)
model: "openai/gpt-5.2-20251211"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 6528
moderation_latency: (null)
generation_time: 75232
tokens_prompt: 77
tokens_completion: 1383
native_tokens_prompt: 66
native_tokens_completion: 4325
native_tokens_completion_images: 0
native_tokens_reasoning: 3583
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.0606655
router: (null)
provider_responses:
0:
id: "resp_006ccb107a3a63690169931ce322488193b2007ad3a5f00d3a"
status: 200
is_byok: false
latency: 172
endpoint_id: "f00142c2-6a93-49ce-9e36-5593b904ce3b"
provider_name: "OpenAI"
model_permaslug: "openai/gpt-5.2-20251211"
api_type: "completions"
id: "gen-1771248867-8moTy94glSmsncMqs90g"
upstream_id: "resp_006ccb107a3a63690169931ce322488193b2007ad3a5f00d3a"
total_cost: 0.0606655
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
Evaluation details
| Result | Evaluator | Details | Meta Data |
| --- | --- | --- | --- |
| 90.4837% | Matches word count | n/a | neededClean: false; words: 510 |
| 0.0047% | Dialogue to Total Word Ratio | Ratio: 44.12%, Deviation: 14.12% | neededClean: false; wordsTotal: 510; wordsDialogue: 225 |
45.2442%
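The ratio and deviation in the evaluation details above follow directly from the reported word counts. This is a minimal sketch, assuming the deviation is the measured ratio minus the 30% target from the scenario, and assuming the final 45.2442% is the unweighted mean of the two evaluator results (the page does not label it). How each evaluator maps a deviation to its percentage result is not shown on the page, so the sketch does not attempt it. All names are illustrative.

```python
# Counts reported in the evaluation metadata.
words_total = 510
words_dialogue = 225

# Target taken from the scenario: 30% dialogue.
target_ratio = 0.30


def dialogue_ratio(dialogue_words: int, total_words: int) -> float:
    """Share of words that are dialogue, as a fraction between 0 and 1."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return dialogue_words / total_words


ratio = dialogue_ratio(words_dialogue, words_total)

# ASSUMPTION: deviation is the measured ratio minus the target ratio.
deviation = ratio - target_ratio

ratio_pct = round(ratio * 100, 2)          # reported as 44.12%
deviation_pct = round(deviation * 100, 2)  # reported as 14.12%

# ASSUMPTION: the final figure is the plain mean of the evaluator results.
results_pct = [90.4837, 0.0047]
overall_pct = round(sum(results_pct) / len(results_pct), 4)
```

Both reproduced values agree with the page, and the mean of the two results agrees with the final 45.2442%, which is consistent with the dialogue-ratio miss pulling down an otherwise close word count.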