NC Bench
Run: 2026-02-16-01KHKAAD98HEXGKJ6S5G810YW5
Run Details
Model: GPT-5 Mini
Test: Dialogue tags
Scenario: Write 200 words with 10% dialogue
Duration: 45.5s
Cost: $0.008660
Provider: OpenAI
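The cost reported for this run can be sanity-checked against the native token counts recorded in the response metadata (66 prompt tokens, 4322 completion tokens, of which 4032 are reasoning tokens). The sketch below assumes per-token rates of $0.25 per million input tokens and $2.00 per million output tokens; these rates are not stated on this page and are an assumption, as are the function names. Under that assumption the arithmetic reproduces the reported total of 0.0086605.

```python
# Assumed rates (NOT taken from this page): USD per one million tokens.
ASSUMED_INPUT_USD_PER_MTOK = 0.25
ASSUMED_OUTPUT_USD_PER_MTOK = 2.00


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the run cost in USD from native token counts.

    Reasoning tokens are treated as part of completion_tokens,
    so they are billed at the output rate.
    """
    input_cost = prompt_tokens * ASSUMED_INPUT_USD_PER_MTOK
    output_cost = completion_tokens * ASSUMED_OUTPUT_USD_PER_MTOK
    return (input_cost + output_cost) / 1_000_000


def visible_completion_tokens(completion_tokens: int, reasoning_tokens: int) -> int:
    """Completion tokens left over once reasoning tokens are subtracted."""
    return completion_tokens - reasoning_tokens


if __name__ == "__main__":
    # Values copied from the metadata of this run.
    print(f"{estimate_cost(66, 4322):.7f}")       # 0.0086605
    print(visible_completion_tokens(4322, 4032))  # 290
```

If the assumed rates hold, nearly all of the cost comes from completion tokens, and most of those were spent on reasoning rather than on the roughly 200-word scene itself.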
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 10% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly and Bill crouched behind an inflatable unicorn, whispering like botched magicians. They had brochures, a laser pointer, and a notebook full of snack schedules. The children's party sounded upstairs, a perfect chaos of sugar and unguarded gifts. Molly drew a map with crayon landmarks and circled the dessert table with dramatized glee. Bill practiced his nonchalant step, imagining a trench coat flapping over tiny backpacks. They argued about costumes: a clown was obvious, but everyone knew actual clowns were booked. Each plan had an absurd staple — confetti cannons, a practiced sneeze, an emergency balloon animal. They rehearsed timing down to the cupcake icing wobble and synchronized watches using the tick of a party musician. Molly traced routes between pinatas and party hats while Bill balanced a stack of empty wrapping paper tubes like a captain counting cans. They smiled wickedly at the prospect of becoming legends among felons of frivolity. They compared notes on escape routes, practiced bagging gifts into oversized cake boxes, and rehearsed pity smiles for suspicious parents while humming a lullaby that sounded like sirens. "Cake, gifts, and the big gold dragon." "Distract with confetti, I'll palm gifts while you mime the magician and wink."
created_at: "2026-02-16T13:31:15.112746+00:00"
model: "openai/gpt-5-mini-2025-08-07"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 6855
moderation_latency: (null)
generation_time: 45271
tokens_prompt: 77
tokens_completion: 1269
native_tokens_prompt: 66
native_tokens_completion: 4322
native_tokens_completion_images: 0
native_tokens_reasoning: 4032
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.0086605
router: (null)
provider_responses:
  0:
    id: "resp_0fb8edc2afe8d5e30169931bf521148194a7eb1d271af09ccd"
    status: 200
    is_byok: false
    latency: 134
    endpoint_id: "c4f66d01-20b0-4c27-a225-438ea22fda43"
    provider_name: "OpenAI"
    model_permaslug: "openai/gpt-5-mini-2025-08-07"
api_type: "completions"
id: "gen-1771248629-HtBOQFZAEY4zulnc8864"
upstream_id: "resp_0fb8edc2afe8d5e30169931bf521148194a7eb1d271af09ccd"
total_cost: 0.0086605
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
Evaluation details
Result | Evaluator | Details | Meta Data
99.9990% | Matches word count | n/a | neededClean: false, words: 199
100.0000% | Dialogue to Total Word Ratio | Ratio: 10.05%, Deviation: 0.05% | neededClean: false, wordsTotal: 199, wordsDialogue: 20
99.9995%
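The dialogue ratio reported above (20 dialogue words out of 199, or 10.05%, a 0.05 point deviation from the 10% target) can be reproduced with a short sketch. The evaluator's actual tokenization and quote handling are not shown on this page, so the code below makes two assumptions: words are whitespace-separated tokens, and dialogue is any text between straight double quotes. The function names are hypothetical.

```python
import re


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (assumed word definition)."""
    return len(text.split())


def dialogue_stats(text: str, target_ratio: float = 0.10) -> dict:
    """Measure how much of a text sits inside straight double quotes."""
    spoken = re.findall(r'"([^"]*)"', text)
    words_dialogue = sum(count_words(segment) for segment in spoken)
    words_total = count_words(text)
    ratio = words_dialogue / words_total if words_total else 0.0
    return {
        "wordsTotal": words_total,
        "wordsDialogue": words_dialogue,
        "ratio": ratio,
        "deviation": abs(ratio - target_ratio),
    }


def ratio_from_counts(words_dialogue: int, words_total: int,
                      target_percent: float = 10.0) -> tuple:
    """Return (ratio %, deviation in percentage points), rounded to 2 places."""
    percent = 100.0 * words_dialogue / words_total
    return round(percent, 2), round(abs(percent - target_percent), 2)


if __name__ == "__main__":
    # Counts reported for this run: 20 dialogue words out of 199 total.
    print(ratio_from_counts(20, 199))  # (10.05, 0.05)
```

Applied to the two quoted lines in the model response, this counting rule gives 7 + 13 = 20 dialogue words, which matches the reported wordsDialogue value. Curly quotes or dialogue split across paragraphs would need extra handling.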