NC Bench
Run: 2026-02-16-01KHK95SHDSY75M1WKFTYSCZ1G
Run Details
Model: GPT-5.1
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 27.7s
Cost: $0.012223
Provider: OpenAI
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly peered through the hedge at the explosion of color on the lawn across the street. A bouncy castle billowed like a technicolor lung, children ricocheting inside it with the manic energy of popcorn kernels. Balloons bobbed, a clown honked, and a folding table sagged under the weight of wrapped boxes.
“Those are the targets,” Molly said, squinting. “Gift-wrapped greed in rectangular form.”
Bill adjusted his cheap party-store tiara, which sat askew on his bald head. “You sure this is morally acceptable theft?”
“Absolutely,” she said. “They’re children. They don’t remember anything before age eight. Science.”
“Pretty sure they remember presents,” Bill muttered. “I still remember the year I got socks.”
“That’s because you’re fueled by resentment, not memory,” she said. “Focus. Step one: entry. Step two: chaos. Step three: extraction.”
“Step four: outrun a mob of sugar-crazed toddlers,” Bill said. “You’re skipping the cardio.”
From their vantage point, they could see the parents clustered near the grill, red cups in hand, attention split between smoke, small talk, and occasional shrieks from the inflatable battlefield.
“We blend in,” Molly said. “We are enthusiastic relatives no one recognizes, but no one wants to admit they forgot to invite.”
Bill frowned at his tinsel-covered vest. “I don’t look like a relative. I look like a fire hazard.”
“That’s festive uncle energy,” she said. “You distract with balloon animals. I sweep the table.”
“I don’t know how to make balloon animals,” Bill said. “I can make balloon existential crises. They start as a dog, end as a noodle.”
“Children don’t care,” Molly said. “You twist latex and say it’s a unicorn. They’ll believe anything if there’s frosting nearby.”
He glanced at the cake, a three-tiered monstrosity frosted with a cartoon dinosaur mid-roar. “What about the clown?”
“What about the clown?” she asked.
“The clown looks… professional,” Bill said. “He’s got that haunted, seen-too-much gaze. That’s security.”
Molly followed his gaze. The clown did, in fact, radiate the weary menace of a man who had inflated a thousand balloons and buried his dreams under glitter.
“If the clown comes near you,” she said, “cry.”
“Cry?”
“Adults panic when a grown man cries at a children’s party,” she said. “Instant distraction.”
“I don’t cry on command,” Bill said. “I barely cried at my own divorce.”
“Think about the socks,” she said.
He paused. “All right, that might work.”
She pulled two lanyards from her pocket, each with a laminated badge reading PARTY SUPPORT STAFF in a font that looked like ransom notes in rainbow.
“Credentials,” she said. “No one questions a lanyard.”
“Where did you get these?” Bill asked.
“Same place I got your tiara,” she said. “The internet. Where morality goes to nap.”
They stepped out from behind the hedge and crossed the street, assuming the natural swagger of people who belonged wherever they said they did. Children zoomed past like small missiles, leaving sugar and grass stains in their wake.
“Remember,” Molly murmured, “big smiles, bigger confidence.”
“My smile terrifies people,” Bill said. “Last time I smiled at a neighbor, she offered me a pamphlet.”
“Just look delighted by helium,” she said. “Parents understand that.”
As they approached the yard, a distracted father glanced up, saw the badges, and visibly relaxed.
“Perfect,” Molly whispered. “We’re invisible.”
Bill eyed the pyramid of presents, each one wrapped in bright paper and possibility. “You know,” he said, “somewhere in there is a kid’s entire year of happiness.”
“Exactly,” Molly said. “We’re doing them a favor. Teaching them life’s hardest lesson early.”
“What lesson is that?” he asked.
She smiled, eyes on the loot. “Always label your stuff.”
created_at
"2026-02-16T13:10:57.188271+00:00"
2/16/2026, 1:10:57 PM
model
"openai/gpt-5.1-20251113"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
8654
moderation_latency
(null)
generation_time
27193
tokens_prompt
77
tokens_completion
1086
native_tokens_prompt
66
native_tokens_completion
1214
native_tokens_completion_images
0
native_tokens_reasoning
359
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.0122225
router
(null)
provider_responses
0
id
"resp_0423d34abe2bec5a0169931745af2481949b7e5678775a4fbe"
status
200
is_byok
false
latency
191
endpoint_id
"764eb97f-8bab-4326-b29b-7a8799b00a70"
provider_name
"OpenAI"
model_permaslug
"openai/gpt-5.1-20251113"
api_type
"completions"
id
"gen-1771247429-ZxzM7hN8d9VLRn8hr9L7"
upstream_id
"resp_0423d34abe2bec5a0169931745af2481949b7e5678775a4fbe"
total_cost
0.0122225
cache_discount
(null)
upstream_inference_cost
0
provider_name
"OpenAI"
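The reported total_cost can be sanity-checked against the native token counts above. The sketch below is only an illustration: the per-million-token rates (1.25 USD input, 10.00 USD output) are assumptions that are not stated on this page, and the function name `estimate_cost` is hypothetical.

```python
# Assumed USD rates per million tokens; these are NOT given on this page.
ASSUMED_INPUT_RATE = 1.25
ASSUMED_OUTPUT_RATE = 10.00


def estimate_cost(prompt_tokens: int, completion_tokens: int,
                  input_rate: float = ASSUMED_INPUT_RATE,
                  output_rate: float = ASSUMED_OUTPUT_RATE) -> float:
    """Estimate the USD cost of one request from its token counts."""
    return (prompt_tokens * input_rate
            + completion_tokens * output_rate) / 1_000_000


# native_tokens_prompt = 66, native_tokens_completion = 1214 (from the metadata above)
cost = estimate_cost(66, 1214)
print(f"{cost:.7f}")
```

Under those assumed rates the estimate comes to 0.0122225, which agrees with the total_cost field. That suggests, but does not prove, that billing is based on the native token counts rather than the normalized tokens_prompt and tokens_completion values.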
Evaluation details
Result | Evaluator | Details | Meta Data
0.0000% | Matches word count | n/a | neededClean: false, words: 608
0.0000% | Dialogue to Total Word Ratio | Ratio: 50.82%, Deviation: 20.82% | neededClean: false, wordsTotal: 608, wordsDialogue: 309
0.0000%
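The ratio and deviation figures above follow from the reported word counts: 309 / 608 ≈ 50.82%, which is 20.82 percentage points above the 30% target. The sketch below shows one way such a measurement could be made. It is not the benchmark's actual evaluator: the quote-matching rule, the whitespace tokenization, and the names `dialogue_ratio` and `deviation_points` are all assumptions for illustration.

```python
import re

# Text between a pair of curly or straight double quotes counts as dialogue (assumed rule).
QUOTED = re.compile(r'[“"]([^”"]*)[”"]')


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) using whitespace tokenization."""
    total = len(text.split())
    dialogue = sum(len(span.split()) for span in QUOTED.findall(text))
    ratio = (dialogue / total) if total else 0.0
    return dialogue, total, ratio


def deviation_points(dialogue: int, total: int, target: float) -> float:
    """Absolute gap between the observed ratio and the target, in percentage points."""
    return round(abs(dialogue / total - target) * 100, 2)


sample = 'Molly nodded. “Go now,” she said. “Run.”'
print(dialogue_ratio(sample))             # counts for the short sample
print(deviation_points(309, 608, 0.30))   # counts reported in the table above
```

A different tokenizer or quote rule would give slightly different counts on the full response, so the 309 and 608 figures should be taken from the evaluator's own metadata rather than recomputed this way.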