NC Bench
Run: 2026-02-16-01KHKAJV8ZTNFWXGKHE65KW5HW
Run Details
Model: GPT-5 Mini
Test: Dialogue tags
Scenario: Write 200 words with 10% dialogue
Duration: 1m 3s
Cost: $0.012351
Provider: OpenAI
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 10% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly and Bill crouched behind the inflatable dinosaur at the back of the park, scanning a circle of tiny chairs and glittering wrapping. They smelled sugar and gum and the faint iron tang of helium. Bill adjusted his black cap until it looked like a dad's shade. Molly checked her list — not of names, but of irresistible packages: robots, sparkly unicorns, a suspiciously heavy box labeled "For Big Brother." They rehearsed whispers like stage directions: swift, unnoticed, charming. The playground lights blinked; a balloon escaped into the night. Children shrieked with glee and a dog barked approval. Molly tapped Bill's sleeve, eyes bright as party candles. They understood the rules of modern grand larceny: be quick, be ridiculous, never steal the cake (too many witnesses). Their plan had elegance: a diversion involving a rogue piñata, a borrowed kazoo, and synchronized confetti flurries. They practiced the confetti toss and hid pliers in a balloon bouquet. The timing depended on one thing — the moment when attention turned entirely to sugar and song. They smiled like crime scene decorators. Molly: "Ready? On three: cake diversion, kazoo, confetti, then sweep every gift." Bill: "One, two, three messy loud fast glitter exit now."
created_at: "2026-02-16T13:36:09.153662+00:00" (2/16/2026, 1:36:09 PM)
model: "openai/gpt-5-mini-2025-08-07"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 8187
moderation_latency: (null)
generation_time: 62376
tokens_prompt: 77
tokens_completion: 1345
native_tokens_prompt: 66
native_tokens_completion: 6167
native_tokens_completion_images: 0
native_tokens_reasoning: 5888
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.0123505
router: (null)
provider_responses:
  0:
    id: "resp_065d79e59bcf02880169931d0993f4819688630e86c26d1b34"
    status: 200
    is_byok: false
    latency: 139
    endpoint_id: "c4f66d01-20b0-4c27-a225-438ea22fda43"
    provider_name: "OpenAI"
    model_permaslug: "openai/gpt-5-mini-2025-08-07"
api_type: "completions"
id: "gen-1771248905-l8hkBF7sNDfrHF6vr6NP"
upstream_id: "resp_065d79e59bcf02880169931d0993f4819688630e86c26d1b34"
total_cost: 0.0123505
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
Evaluation details
Evaluator: Matches word count
Result: 99.9840%
Details: n/a
Meta Data: neededClean: false, words: 198

Evaluator: Dialogue to Total Word Ratio
Result: 99.9318%
Details: Ratio: 11.62%, Deviation: 1.62%
Meta Data: neededClean: false, wordsTotal: 198, wordsDialogue: 23
99.9579%
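The ratio and deviation reported above follow from the two word counts. The sketch below shows one way such a check could be written. It is not the benchmark's evaluator: the function names, the whitespace tokenization, and the rule that every span in straight double quotes counts as dialogue are all assumptions, and the mapping from deviation to the percentage score is not reproduced because this page does not give it.

```python
import re


def count_words(text):
    """Count whitespace-separated tokens (ASSUMPTION: the real tokenizer may differ)."""
    return len(text.split())


def count_dialogue_words(text):
    """Count words inside straight double quotes (ASSUMPTION: this is what 'dialogue' means)."""
    quoted_spans = re.findall(r'"([^"]*)"', text)
    return sum(count_words(span) for span in quoted_spans)


def ratio_and_deviation(words_dialogue, words_total, target_percent=10.0):
    """Return (dialogue ratio, absolute deviation from target), both in percent."""
    ratio = 100.0 * words_dialogue / words_total
    return ratio, abs(ratio - target_percent)


# Reproduce the reported figures from the counts in the Meta Data above.
ratio, deviation = ratio_and_deviation(words_dialogue=23, words_total=198)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")

# Small made-up sample to exercise the counting helpers.
sample = 'Molly whispered, "Go now." Bill nodded.'
print(count_dialogue_words(sample), count_words(sample))
```

With the reported counts this gives a ratio of 11.62% and a deviation of 1.62%, matching the Details line. Counting by hand, the 23 dialogue words in the response are reached only if the quoted gift label "For Big Brother." is included alongside the two spoken lines, which is consistent with a quote-based rule.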