NC Bench
Run: 2026-02-16-01KHK9FX718EQ12QG86GBZY82Y
Run Details
Model: GPT-5 Mini
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 49.8s
Cost: $0.010677
Provider: OpenAI
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly flattened her gloved hand against the van's fogged window, eyes reflecting the party lights like a cat considering a particularly lit-up tree. Bill was riffling through a bag of inappropriate tools — glitter snippers, a folding pink poodle costume, and a tape dispenser that had seen better lives. They had rehearsed this one in the mirror until the mirror started to protest. Outside, a backyard chorus of small voices and helium-rope music made their target feel alarmingly alive. Molly showed Bill the blueprint drawn on a paper plate, an absurd plan with arrows and a doodle of a goat for reasons she would not explain. They both knew theft etiquette: never touch the birthday crown, never spill a juice box, and always leave a single balloon for the grieving aunt. Molly tapped the side of her nose as if activating a brain-burglary switch. She peered toward the back porch where a table sagged under a pyramid of presents like a sleepy paper mountain. Bill chewed a crumpled napkin and squinted, doing the math of ribbons and probable sticky fingers. He was the practical one; Molly was the one who considered whether wrapping paper had rights. A moth bumped a party light like a tiny, apologetic burglar. Molly whispered something that wasn't on the plate; Bill nodded as if struck by a good pun. Molly said "We hit the party at seven when cake sugar makes everyone predictable." She tapped the plate. Bill replied "Predictable isn't the issue—it's the glitter; it latches onto your shoes and your soul." He adjusted his poodle ears. Molly said "We need a distraction. Balloon animal catastrophe, maybe. You can be the rogue poodle." She grinned like someone who enjoyed chaos. Bill said "I'll practice my sad clown face. Do I honk when burglary calls for tragedy?" He made a face in the window. Molly said "No honking. We bring a magician who vanishes but only in the forty-second corner." Her finger hovered over the napkin stash. Bill murmured "Forty-second corner? 
That's a shame; I only have a routine for the thirty-first." He crumpled the napkin into the shape of a tiny balloon. Molly said "Listen, Bill, use the cupcake fog. Kids will assume it's theatrical—then you nab the pile." She wagged an eyebrow in theatrical approval. Bill asked "Cupcake fog? You mean powdered sugar grenades?" He looked genuinely concerned for the pastries. Molly said "Yes. And for the grand exit: confetti parachutes disguised as napkins." She folded a napkin into a secret map. Bill growled "You always overdo the theatrical. I prefer a clean sweep: bags, masks, and silence." He made a list with a crayon. Molly announced "Silence isn't funny. We need a theme: 'Return of the Presents' — everyone applauds, then we applaud louder." They both pretended to be solemn party guests. Bill whispered "And the glitter, boss?" The moth hit the light again, comically punctuating the plan. They synchronized watches, smiled, and slid into the night like mischievous confetti, softly together.
created_at
"2026-02-16T13:16:51.328348+00:00"
2/16/2026, 1:16:51 PM
model
"openai/gpt-5-mini-2025-08-07"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
13783
moderation_latency
(null)
generation_time
49604
tokens_prompt
77
tokens_completion
1641
native_tokens_prompt
66
native_tokens_completion
5330
native_tokens_completion_images
0
native_tokens_reasoning
4608
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.0106765
router
(null)
provider_responses
0
id
"resp_00b4de9cb63fb3170169931890b17081959a9481db7d2851ff"
status
200
is_byok
false
latency
124
endpoint_id
"c4f66d01-20b0-4c27-a225-438ea22fda43"
provider_name
"OpenAI"
model_permaslug
"openai/gpt-5-mini-2025-08-07"
api_type
"completions"
id
"gen-1771247760-BRLmevaiXmy3liE6vvGR"
upstream_id
"resp_00b4de9cb63fb3170169931890b17081959a9481db7d2851ff"
total_cost
0.0106765
cache_discount
(null)
upstream_inference_cost
0
provider_name
"OpenAI"
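The token and cost fields above can be cross-checked with a little arithmetic. The sketch below is only illustrative: the per-million-token rates are an assumption (they do not appear on this page), as is the idea that reasoning tokens are billed at the output rate. Under those assumptions the estimate happens to agree with the reported total_cost of 0.0106765, and it also shows how few of the completion tokens ended up in the visible response.

```python
# Assumed rates in USD per million tokens. These are NOT taken from this page;
# they are hypothetical inputs chosen for illustration.
INPUT_USD_PER_MTOK = 0.25
OUTPUT_USD_PER_MTOK = 2.00


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate request cost from native token counts under the assumed rates."""
    total_micro_usd = (
        prompt_tokens * INPUT_USD_PER_MTOK
        + completion_tokens * OUTPUT_USD_PER_MTOK
    )
    return total_micro_usd / 1_000_000


# Values copied from the metadata above.
native_tokens_prompt = 66
native_tokens_completion = 5330
native_tokens_reasoning = 4608

estimated = estimate_cost(native_tokens_prompt, native_tokens_completion)

# Completion tokens that were not spent on reasoning.
visible_tokens = native_tokens_completion - native_tokens_reasoning

print(f"estimated cost: ${estimated:.7f}")
print(f"non-reasoning completion tokens: {visible_tokens}")
```

If the assumed rates are wrong for this provider or date, only the two constants need to change; the token counts come straight from the run metadata.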
Evaluation details
Result | Evaluator | Details | Meta Data
98.7124% | Matches word count | n/a | neededClean: false; words: 506
99.0593% | Dialogue to Total Word Ratio | Ratio: 30.43%, Deviation: 0.43% | neededClean: false; wordsTotal: 506; wordsDialogue: 154
98.8858%
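The ratio and deviation reported for the Dialogue to Total Word Ratio evaluator follow from the two word counts (154 dialogue words out of 506). The sketch below reproduces that arithmetic and shows one plausible way to count dialogue words. The quote-matching rule and whitespace tokenization are assumptions, not the benchmark's actual evaluator, and the function names are hypothetical. How the percentage scores themselves are derived is not shown on the page, so that step is left out.

```python
import re

TARGET_DIALOGUE_PERCENT = 30.0  # from the scenario: 30% dialogue


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (an assumed tokenization)."""
    return len(text.split())


def count_dialogue_words(text: str) -> int:
    """Count words inside straight or curly double quotes (assumed rule)."""
    spans = re.findall(r'["\u201c]([^"\u201d]*)["\u201d]', text)
    return sum(count_words(span) for span in spans)


def dialogue_percent(words_dialogue: int, words_total: int) -> float:
    """Dialogue share of the total word count, as a percentage."""
    return 100.0 * words_dialogue / words_total


# Figures copied from the evaluation details above.
ratio = dialogue_percent(154, 506)
deviation = abs(ratio - TARGET_DIALOGUE_PERCENT)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")

# Small usage example on a made-up sentence.
sample = 'Molly said "No honking." Bill sighed.'
print(count_dialogue_words(sample), "of", count_words(sample), "words are dialogue")
```

A real evaluator would also need to decide how to treat apostrophes, nested quotes, and dialogue tags, which may explain small differences from a naive count.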