NC Bench
Run: 2026-02-16-01KHKADVD5BNR9EZF3FZDGVX8Q
Run Details
Model: GPT-5 Mini
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 50.0s
Cost: $0.008474
Provider: OpenAI
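The Cost figure can be cross-checked against the native token counts reported in the response metadata further down this page. The sketch below is only a plausibility check: the two per-million-token rates are assumptions (they are not stated anywhere on this page), and it also assumes that `native_tokens_completion` already includes the reasoning tokens.

```python
# Values copied from the response metadata on this page.
native_tokens_prompt = 66
native_tokens_completion = 4229
native_tokens_reasoning = 3584
reported_cost = 0.0084745  # the "usage" / "total_cost" field

# ASSUMPTION: USD rates per one million tokens. Not stated on this page.
ASSUMED_INPUT_RATE = 0.25
ASSUMED_OUTPUT_RATE = 2.00


def estimate_cost(prompt_tokens, completion_tokens, input_rate, output_rate):
    """Return an estimated cost in USD for the given token counts and rates."""
    return (prompt_tokens * input_rate + completion_tokens * output_rate) / 1_000_000


estimated_cost = estimate_cost(
    native_tokens_prompt,
    native_tokens_completion,
    ASSUMED_INPUT_RATE,
    ASSUMED_OUTPUT_RATE,
)

# ASSUMPTION: reasoning tokens are a subset of the completion tokens,
# so the remainder is what was spent on the visible scene.
visible_tokens = native_tokens_completion - native_tokens_reasoning
reasoning_share = native_tokens_reasoning / native_tokens_completion

print(f"estimated cost: ${estimated_cost:.7f} (reported: ${reported_cost:.7f})")
print(f"visible completion tokens: {visible_tokens}")
print(f"reasoning share of completion: {reasoning_share:.1%}")
```

Under these assumptions the estimate lands on the reported figure, and most of the billed completion tokens are reasoning rather than the scene itself.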
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly sat on a dusty folding chair under the glow of a single motel lamp while Bill traced a plan on a napkin with the tip of a crooked pencil. They spoke in the soft conspiratorial tones of professionals who had learned to be absurdly calm about criminal nonsense. Outside the window children shrieked and a party horn sounded like an alarm in a different life. They were going to the house on Maple, Molly said, they would be unseen and exceptionally polite. Politeness had always been part of their routine, a costume that fit the sticky fingers of a nursery. Bill nodded and adjusted a ridiculous bow tie that was not part of the outfit but made him feel theatrical. They planned the timing like bakers: a quick window between cake and Santa, chaos in miniature and the perfect cover. Molly checked a list of possible exits, routes disguised by hedges and a neighbor's tall fence, and smiled at the practical poetry of it. It was not only about presents; it was an absurd moral experiment about generosity and who actually read the fine print on invitations. They practiced whispers like a duet and rehearsed the clumsy empathy they would display if caught with a pink teddy. Outside, a dog barked, then silence; inside, the cake timer would buzz and the little humans would roar. Molly produced a tiny camera and winked at the absurd ethics of documenting their own mischief. Bill humored her, tapping his cuffs with solemnity. 
Molly We go in at eight right now Bill Eight is cake cutting chaos ideal Molly I'll bring polite empathy and a floral napkin Bill I will be the inconspicuous magician with a limp hat and a very sad smile Molly What about the car We need a getaway with a trunk big enough for inflatables Bill My van is camouflaged as a baby gear delivery complete with ribbon and a suspiciously squeaky horn Molly We'll stage a decoy—Balloon boy stumbles we perform a rescue outcome presents in our arms Bill Parents applauding always improves the moral optics which matters to us deeply for reasons I won't explain Molly What about witnesses Cameras are everywhere especially on tiny phones wielded by tyrannical toddlers Bill We will wear party masks not creepy ones tasteful animal faces slightly absurd to redirect suspicion Molly And the gifts We sort by size then by potential re-gift value then by how loudly a robot dog will scream Bill Small AV equipment first then shiny boxed toys avoid anything labeled 'fragile' unless we can feign sentimental mishap Molly I can cry on cue I took an online class called Weeping for Theft Bill Perfect Also you are in charge of sweets Sugar confuses everyone including security guards Molly What if a toddler insists a present is theirs We return a sticker and a heartfelt apology Bill Stickers are the currency of remorse Make it glittery They forgive glitter immediately Molly Then we go home very quietly
created_at
"2026-02-16T13:33:12.581934+00:00"
2/16/2026, 1:33:12 PM
model
"openai/gpt-5-mini-2025-08-07"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
2388
moderation_latency
(null)
generation_time
49694
tokens_prompt
77
tokens_completion
1644
native_tokens_prompt
66
native_tokens_completion
4229
native_tokens_completion_images
0
native_tokens_reasoning
3584
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.0084745
router
(null)
provider_responses
0
id
"resp_019a4f7258afc60f0169931c65e54081948e7ca7d3d5e08a72"
status
200
is_byok
false
latency
197
endpoint_id
"c4f66d01-20b0-4c27-a225-438ea22fda43"
provider_name
"OpenAI"
model_permaslug
"openai/gpt-5-mini-2025-08-07"
api_type
"completions"
id
"gen-1771248741-R8xHgo634psSoVmxd02J"
upstream_id
"resp_019a4f7258afc60f0169931c65e54081948e7ca7d3d5e08a72"
total_cost
0.0084745
cache_discount
(null)
upstream_inference_cost
0
provider_name
"OpenAI"
Evaluation details
Result | Evaluator | Details | Meta Data
99.9990% | Matches word count | n/a | neededClean: false; words: 501
0.0000% | Dialogue to Total Word Ratio | Ratio: 15.17%, Deviation: 34.83% | neededClean: false; wordsTotal: 501; wordsDialogue: 76
49.9995%
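The ratio and deviation reported by the dialogue evaluator follow directly from the `wordsTotal` and `wordsDialogue` counts. The sketch below reproduces that arithmetic, taking the 50% target from the scenario description. Reading the final percentage as the plain mean of the two evaluator results is an assumption that happens to fit the numbers; the page does not document how the evaluators score or how the results are combined.

```python
# Counts copied from the evaluation details above.
words_total = 501
words_dialogue = 76
target_ratio = 0.50  # from the scenario: "500 words with 50% dialogue"

# Share of dialogue words and its distance from the target.
ratio = words_dialogue / words_total
deviation = abs(target_ratio - ratio)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# ASSUMPTION: the final figure is the unweighted mean of the evaluator results.
evaluator_results = [99.9990, 0.0000]
overall = sum(evaluator_results) / len(evaluator_results)
print(f"Overall: {overall:.4f}%")
```

The response contains no quotation marks around the spoken lines, which likely explains why only 76 of the 501 words were counted as dialogue.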