NC Bench
Run: 2026-02-16-01KHKAPA6K67BSCG4CMNC3FZR8
Run Details
Model: GPT-5.1
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 41.4s
Cost: $0.016792
Provider: OpenAI
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
“Tell me again why we’re robbing a birthday party, Bill.” “Because jewelry stores have alarms and toddlers only have frosting.” “Frosting is sticky, and so is prison, probably.” “They’re five-year-olds, Molly. Their most advanced security system is nap time.” “Nap time is when adults drink coffee and notice missing things.” “They won’t notice presents gone. They’ll blame a magician, or Santa, or capitalism.” “It’s June. Santa has an alibi.” They sat in Bill’s sputtering hatchback across from a suburban house vibrating with sugar and shrieks. Balloons bobbed on the mailbox like tiny pastel surveillance drones. Molly narrowed her eyes. “You see that inflatable unicorn in the yard?” “Yeah. What about it?” “It’s judging me, Bill. With its rainbow disapproval.” “That’s just the air pump failing. Look, timetable: clown at two, cake at two-thirty, present opening at three. We hit between clown and cake.” “So we’re robbing children in the gap between horror and diabetes.” “Exactly. It’s the perfect crime window.” Inside, a chorus of kids screamed at a volume normally reserved for plane engines. Molly winced. “Sounds like they’re already being punished. Maybe we should rob someone less tragic, like a bank.” “The host mom hired a professional face painter,” Bill said, consulting his crumpled flyer. “Professional. That means money. Money means expensive presents.” “Or just really committed glitter.” “Either way, we’re walking out with something wrapped.” Molly drummed her fingers on the dashboard. “What about parents with camera phones? I do not consent to being captured mid-heist in 4K.” “You’ll be in costume.” “Disguise is not a force field, Bill.” “We blend in, grab the presents, roll the wagon out, disappear.” “Back up. Wagon?” Bill bent into the backseat and hauled up a folded red plastic wagon. One wheel squeaked. “You expect me to flee the scene of the crime in a Fisher-Price getaway car?” “It’s a children’s party, Molly. 
We can’t wheel out a steel dolly like we’re moving corpses.” “I’d prefer something that doesn’t say ‘My First Felony’ on the side.” Bill shrugged. “It says ‘Adventure Express.’ Same thing, different font.” A sudden shriek of “PINATA!” drifted from the backyard. Molly stared at the fence. “You hear that? That’s the sound of moral boundaries shattering.” “Focus. Costumes in the trunk. Yours is in the glittery bag.” She opened it, then froze. “Bill. Why is there a tutu in here?” “You’re Princess Sparkle.” “I’m a career criminal.” “Who is, today, Princess Sparkle, the birthday assistant.” “And you?” Bill pulled on a sagging dog costume head. “I’m Buster the Birthday Pup.” Molly pinched the bridge of her nose. “I knocked off a casino in Reno. A casino. Now I’m emotional support for a pinata.” “Crime is about adaptation.” “Crime is about dignity.” He looked down at the furry paws. “That ship sailed when I bought the tail.” They changed in the cramped car, elbows jabbing, static in the tulle. When Molly emerged, the tutu glowed with weaponized pink. Bill grinned from inside the dog head. “You look… inconspicuous.” “I look like I lost a bet with a unicorn.” “They’ll never suspect you.” “They’ll never respect me.” Music thumped from the backyard—an upbeat song about friendship and sharing. Molly squinted toward the noise. “Irony’s really laying it on thick today.” “Okay,” Bill said. “We ring the bell. You say you’re the backup entertainment.” “And if they ask what happened to the main entertainment?” “Budget cuts.” “Then we mingle, assess the present table, locate the most expensive-looking wrapping, and…” She mimed scooping. “Redistribute to the needy.” “We’re the needy.” “Details.” They stepped out of the car. Balloons bobbed, kids shrieked, a bubble machine exhaled sticky joy across the lawn. Molly hesitated at the walkway. “Bill?” “Yeah?” “If a kid cries while we’re stealing their gifts, I reserve the right to abort.” “Define ‘cries.’ These are children. 
Their baseline is wailing.” “Like, the special kind. The ‘learned about mortality’ cry.” He considered. “Fair. No existential tears.” She nodded, then squared her glittery shoulders. “All right. Let’s go disappoint some tiny capitalists.”
created_at: "2026-02-16T13:37:41.217452+00:00" (2/16/2026, 1:37:41 PM)
model: "openai/gpt-5.1-20251113"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 9118
moderation_latency: (null)
generation_time: 41345
tokens_prompt: 77
tokens_completion: 1654
native_tokens_prompt: 66
native_tokens_completion: 1671
native_tokens_completion_images: 0
native_tokens_reasoning: 690
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.0167925
router: (null)
provider_responses:
  0:
    id: "resp_080e2ea95ba5c14d0169931d7b2dd88196bd42d172adad0d95"
    status: 200
    is_byok: false
    latency: 154
    endpoint_id: "764eb97f-8bab-4326-b29b-7a8799b00a70"
    provider_name: "OpenAI"
    model_permaslug: "openai/gpt-5.1-20251113"
    api_type: "completions"
id: "gen-1771249019-6IoFNLiIMTE9dDFuVVoh"
upstream_id: "resp_080e2ea95ba5c14d0169931d7b2dd88196bd42d172adad0d95"
total_cost: 0.0167925
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
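The reported cost and duration can be cross-checked against the token counts in the metadata. The sketch below is illustrative only: the per-million-token rates (1.25 USD for input, 10.00 USD for output) are assumptions that do not appear on this page, chosen because they reproduce the reported `total_cost` from `native_tokens_prompt` and `native_tokens_completion`. It also assumes `generation_time` is in milliseconds, which is consistent with the 41.4s duration shown in the run details. The function names are hypothetical.

```python
# Assumed USD rates per million tokens; NOT stated anywhere on the run page.
ASSUMED_INPUT_RATE_PER_M = 1.25
ASSUMED_OUTPUT_RATE_PER_M = 10.00


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the run cost in USD from native token counts."""
    return (
        prompt_tokens * ASSUMED_INPUT_RATE_PER_M
        + completion_tokens * ASSUMED_OUTPUT_RATE_PER_M
    ) / 1_000_000


def tokens_per_second(completion_tokens: int, generation_time_ms: int) -> float:
    """Average output speed, assuming generation_time is in milliseconds."""
    return completion_tokens / (generation_time_ms / 1000)


# Values copied from the metadata above.
cost = estimate_cost(prompt_tokens=66, completion_tokens=1671)
rate = tokens_per_second(completion_tokens=1671, generation_time_ms=41345)

print(f"estimated cost: {cost:.7f} USD")
print(f"output speed: {rate:.1f} tokens/s")
```

Under these assumed rates the estimate equals the reported `total_cost` of 0.0167925, which suggests the 690 reasoning tokens are billed as part of the 1671 completion tokens rather than separately.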
Evaluation details
Result | Evaluator | Details | Meta Data
0.0000% | Matches word count | n/a | neededClean: false; words: 676
99.1459% | Dialogue to Total Word Ratio | Ratio: 70.41%, Deviation: 0.41% | neededClean: false; wordsTotal: 676; wordsDialogue: 476
49.5729%
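The ratio and deviation reported for the Dialogue to Total Word Ratio evaluator follow directly from the two word counts: 476 / 676 is about 70.41%, which is 0.41 percentage points above the 70% target. The sketch below shows one way such counts could be produced and checked. It is not the benchmark's evaluator: the whitespace tokenization, the curly-quote matching, and the function names are all assumptions, and the formula that maps a deviation to the percentage score is not shown on this page, so it is not reproduced here.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio_percent).

    Assumption: dialogue is any text between curly double quotes, and
    words are whitespace-separated tokens. The real evaluator may differ.
    """
    total_words = len(text.split())
    spoken = re.findall(r"“([^”]*)”", text)
    dialogue_words = sum(len(segment.split()) for segment in spoken)
    ratio = 100.0 * dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


def deviation_from_target(ratio_percent: float, target_percent: float = 70.0) -> float:
    """Absolute deviation from the target ratio, in percentage points."""
    return abs(ratio_percent - target_percent)


def word_count_overshoot(words: int, target: int = 500) -> float:
    """Relative difference from the requested length, in percent."""
    return 100.0 * (words - target) / target


# Cross-check using the counts reported in the evaluation details.
reported_ratio = 100.0 * 476 / 676
reported_deviation = deviation_from_target(reported_ratio)
overshoot = word_count_overshoot(676)

print(round(reported_ratio, 2), round(reported_deviation, 2), round(overshoot, 1))
```

The same counts also show why the word count check fails: 676 words against a 500-word request is an overshoot of 35.2%.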