NC Bench
Run: 2026-02-16-01KHK9TTTEBGSBMHBD0H7AYFEJ
Run Details
Model:
GPT-5 Mini
Test:
Dialogue tags
Scenario:
Write 500 words with 50% dialogue
Duration:
1m 13s
Cost:
$0.010308
Provider:
OpenAI
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Night wrapped the cul de sac in suspiciously cheerful lamplight. Molly crouched behind a plastic flamingo, clutching a roll of duct tape and a mime's mask she had found online and insisted was tactical. Bill tapped his watch like a bad metronome, eyes on the house with streamers. They smelled candles and sugar through cracked windows. This was ridiculous, thrilling, and possibly career ending. They rehearsed whispers theatrically, mostly badly. "Okay Bill, listen, we are professionals of a sort, and tonight we will execute a flawless midnight redistribution of youth-owned goods. We will enter through the window with the broken latch that is always conveniently left ajar because parents are tired, not because they are plotting. You will be the distraction with a squeaky toy and a terrible impression of a grandpa lost at the IKEA. I will move through the living room with velvet gloves and a borrowed Santa sack, accepting glitter as collateral damage. We take only presents, not opinions, we replace ribbons with subtle notes of apology, and we leave a tiny chocolate bar as a ransom comic souvenir. Understood? Also silence is mandatory, except for strategic giggling. And wear shoe covers." They leaned close enough to swap deodorant and survival tips. Bill ran a finger along his pocket watch, then produced a candy cane like a conductor's baton, which made Molly sigh with a mixture of admiration and irritation. They studied the guest list scribbled on a napkin, the names mostly Dora, Ethan, and an enigmatic "Tiny Henry" that made them both grimace. Outside, a dog barked once, twice, then went back to sleeping. Molly checked her phone for the babysitter’s hashtag updates, grinned at a photo of a unicorn cake, and said nothing because that would have been morally complicated. "Right, Molly, tactical points. We will time entry with the cupcake march between games. You will neutralize the elf on the shelf with a persuasive story about how his shelf has been upgraded. 
I will bag presents as if I am wrapping them for absentee relatives, with recycled paper and a fake signature. If confronted by a toddler, we will both adopt slow blinking and issue soft apologies in whalesong. If a grownup appears, blame me because I have a respectable moustache and a plausible accent. Escape route is the garage through the door with the bicycle helmet hanging like a warning. Whatever happens, keep the glitter contained and the moral panic theatrical. Also retrieve any stray helium balloons for later negotiation. And two cupcakes." They nodded at one another like conspirators in a small absurd empire. Molly adjusted her mime mask, slit a hole for an eye, preferring dramatic optics. Bill wrestled the Santa sack into a duffel and whispered a pep talk to a rubber duck he had named Commander Quacks. The truck of neighborhood laundry rumbled by, oblivious. Behind them a weather vane spun as if pointing at the inevitable pile of wrapping paper and bad decisions they were about to produce.
created_at
"2026-02-16T13:23:12.033854+00:00"
model
"openai/gpt-5-mini-2025-08-07"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
2348
moderation_latency
(null)
generation_time
71530
tokens_prompt
77
tokens_completion
2384
native_tokens_prompt
66
native_tokens_completion
5146
native_tokens_completion_images
0
native_tokens_reasoning
4480
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.0103085
router
(null)
provider_responses
0
id
"resp_036bcaf9e7872b6a01699319f6ae9081948395cd8fea0fbeb3"
status
200
is_byok
false
latency
122
endpoint_id
"c4f66d01-20b0-4c27-a225-438ea22fda43"
provider_name
"OpenAI"
model_permaslug
"openai/gpt-5-mini-2025-08-07"
api_type
"completions"
id
"gen-1771248118-Cxtl8ZaN9w32mxaFGXjW"
upstream_id
"resp_036bcaf9e7872b6a01699319f6ae9081948395cd8fea0fbeb3"
total_cost
0.0103085
cache_discount
(null)
upstream_inference_cost
0
provider_name
"OpenAI"
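The timing and token fields above can be cross-checked with a little arithmetic. The sketch below assumes that `latency` and `generation_time` are in milliseconds (the page does not state units) and that `native_tokens_reasoning` is counted inside `native_tokens_completion`. The function name `summarize_run` is hypothetical and not part of any API shown here.

```python
def summarize_run(meta: dict) -> dict:
    """Derive a few summary figures from the run metadata.

    Assumes millisecond timings and that reasoning tokens are a
    subset of the native completion tokens (both are assumptions).
    """
    total_ms = meta["latency"] + meta["generation_time"]
    completion = meta["native_tokens_completion"]
    reasoning = meta["native_tokens_reasoning"]
    return {
        # Time to first token plus generation time, in seconds.
        "total_seconds": total_ms / 1000,
        # Completion tokens that were not spent on reasoning.
        "visible_tokens": completion - reasoning,
        # Fraction of completion tokens spent on reasoning.
        "reasoning_share": reasoning / completion,
        # Native completion tokens produced per second of generation.
        "tokens_per_second": completion / (meta["generation_time"] / 1000),
    }


# Values copied from the metadata listed above.
run_meta = {
    "latency": 2348,
    "generation_time": 71530,
    "native_tokens_completion": 5146,
    "native_tokens_reasoning": 4480,
}

summary = summarize_run(run_meta)
print(summary)
```

Under these assumptions the total comes to about 73.9 seconds, which is in line with the reported duration of 1m 13s, and 666 of the 5146 native completion tokens are non-reasoning output.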
Evaluation details
Result | Evaluator | Details | Meta Data
99.9990% | Matches word count | n/a | neededClean: false, words: 501
98.7627% | Dialogue to Total Word Ratio | Ratio: 50.50%, Deviation: 0.50% | neededClean: false, wordsTotal: 501, wordsDialogue: 253
99.3809%
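The reported ratio is consistent with dividing wordsDialogue by wordsTotal (253 / 501, about 50.50%) and measuring the distance from the 50% target. A minimal sketch of such a check follows, assuming that dialogue means text inside double quotes and that words are whitespace-separated tokens. The evaluator's actual tokenization and its mapping from deviation to a percentage score are not shown on this page, so `dialogue_stats` is a hypothetical helper and may not reproduce the exact counts.

```python
import re


def dialogue_stats(text: str, target: float = 0.5) -> dict:
    """Count total and quoted words and compare the ratio to a target.

    Assumption: dialogue is any text between double quotes; curly
    quotes are normalized to straight quotes first.
    """
    normalized = text.replace("\u201c", '"').replace("\u201d", '"')
    words_total = len(normalized.split())
    # Each match is the content between one pair of double quotes.
    quoted_spans = re.findall(r'"([^"]*)"', normalized)
    words_dialogue = sum(len(span.split()) for span in quoted_spans)
    ratio = words_dialogue / words_total if words_total else 0.0
    return {
        "words_total": words_total,
        "words_dialogue": words_dialogue,
        "ratio": ratio,
        "deviation": abs(ratio - target),
    }


# Small illustrative input, not taken from the model response.
sample = 'Molly nodded. "Grab the sack now," she said.'
stats = dialogue_stats(sample)
print(stats)

# Cross-check of the figures reported in the evaluation details.
reported_ratio = 253 / 501
print(round(reported_ratio * 100, 2), round(abs(reported_ratio - 0.5) * 100, 2))
```

Counting quoted spans with a regular expression keeps the sketch short, but it would miscount text with unbalanced quotes or nested quotations, so a production evaluator would likely need stricter parsing.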