NC Bench
Run: 2026-02-16-01KHKADHY8QEY325DEQYQTZG8B
Run Details
Model: GPT-5 Mini
Test: Dialogue tags
Scenario: Write 200 words with 10% dialogue
Duration: 46.2s
Cost: $0.006149
Provider: OpenAI
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 10% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly crouched beside a cake-splattered table, blueprint napkins trembling in her hands. Bill balanced a taxidermied teddy on his shoulder like a ridiculous mascot. They whispered like librarians on a caffeine shortage, plotting a gentle chaos: hats, balloons, a sleight of hand disguised by party horns and bad singing. Molly traced the guest list with a finger and a grin, naming victims whose parents would be text-blamed for every missing toy. They rehearsed exits by candlelight, practiced alarm-avoidance with the solemnity of choirboys, and tried on goofy accents to confuse anyone who peered into doorways. Outside, the night smelled of sugar and spray paint; inside, a chorus of tiny shrieks occasionally applauded their mood. Between whispered logistics they paused, savoring the ridiculousness of burgling tinsel. Then Molly announced strategy. "We sneak in, snag every present, leave no crumb today." Bill grinned, a crooked moon of mischief. He adjusted his pocket full of streamers and offered one piece of advice. "And remember, glitter is our friend; dogs are unreliable accomplices." They high-fived with mittened hands and melted into the party like two clean-handed ghosts, already tallying loot in alphabetized lists and humming the birthday song off-key. They grinned, triumphant and ridiculous.
created_at: "2026-02-16T13:32:58.744023+00:00" (2/16/2026, 1:32:58 PM)
model: "openai/gpt-5-mini-2025-08-07"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 2656
moderation_latency: (null)
generation_time: 46092
tokens_prompt: 77
tokens_completion: 1026
native_tokens_prompt: 66
native_tokens_completion: 3066
native_tokens_completion_images: 0
native_tokens_reasoning: 2752
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.0061485
router: (null)
provider_responses:
  0:
    id: "resp_0e030c053bad24ab0169931c5c2e848197a78acb3b565f74bb"
    status: 200
    is_byok: false
    latency: 125
    endpoint_id: "c4f66d01-20b0-4c27-a225-438ea22fda43"
    provider_name: "OpenAI"
    model_permaslug: "openai/gpt-5-mini-2025-08-07"
api_type: "completions"
id: "gen-1771248732-s5AmD1NV2mao2cZRid2C"
upstream_id: "resp_0e030c053bad24ab0169931c5c2e848197a78acb3b565f74bb"
total_cost: 0.0061485
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
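The usage figure in the metadata can be cross-checked against the native token counts. The following is a minimal sketch, not the billing logic of the provider or the benchmark. The per-million-token rates ($0.25 input, $2.00 output) are an assumption that does not appear anywhere on this page; they are used here only because they reproduce the reported total. The function name `estimate_cost` is hypothetical.

```python
# Values copied from the run metadata above.
NATIVE_TOKENS_PROMPT = 66
NATIVE_TOKENS_COMPLETION = 3066   # includes reasoning tokens
NATIVE_TOKENS_REASONING = 2752
GENERATION_TIME_MS = 46092
REPORTED_USAGE_USD = 0.0061485

# ASSUMPTION: per-million-token prices; these are not stated in the run report.
ASSUMED_INPUT_USD_PER_M = 0.25
ASSUMED_OUTPUT_USD_PER_M = 2.00


def estimate_cost(prompt_tokens, completion_tokens,
                  input_rate=ASSUMED_INPUT_USD_PER_M,
                  output_rate=ASSUMED_OUTPUT_USD_PER_M):
    """Estimate cost in USD from token counts and per-million-token rates."""
    return (prompt_tokens * input_rate + completion_tokens * output_rate) / 1_000_000


estimated = estimate_cost(NATIVE_TOKENS_PROMPT, NATIVE_TOKENS_COMPLETION)

# Completion tokens that are not reasoning tokens, i.e. the visible answer.
visible_tokens = NATIVE_TOKENS_COMPLETION - NATIVE_TOKENS_REASONING

# Share of the completion spent on reasoning, and overall generation throughput.
reasoning_share = NATIVE_TOKENS_REASONING / NATIVE_TOKENS_COMPLETION
tokens_per_second = NATIVE_TOKENS_COMPLETION / (GENERATION_TIME_MS / 1000)

print(f"estimated cost:    ${estimated:.7f} (reported ${REPORTED_USAGE_USD:.7f})")
print(f"visible tokens:    {visible_tokens}")
print(f"reasoning share:   {reasoning_share:.1%}")
print(f"tokens per second: {tokens_per_second:.1f}")
```

Under these assumed rates, most of the cost comes from reasoning tokens rather than from the roughly 200-word scene itself.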
Evaluation details
Result | Evaluator | Details | Meta Data
98.7124% | Matches word count | n/a | neededClean: false; words: 206
99.9999% | Dialogue to Total Word Ratio | Ratio: 9.71%, Deviation: 0.29% | neededClean: false; wordsTotal: 206; wordsDialogue: 20
99.3561%
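The ratio and deviation reported by the "Dialogue to Total Word Ratio" evaluator follow from the two word counts. The sketch below shows one way such a measurement could be made. It assumes that words are whitespace-separated tokens and that dialogue is any text between double quotes; the evaluator's actual tokenization and scoring formula are not documented on this page, and the helper names are hypothetical.

```python
import re


def count_words(text):
    """Count whitespace-separated tokens (an assumed definition of 'word')."""
    return len(text.split())


def dialogue_words(text):
    """Count words inside straight or curly double quotes."""
    spans = re.findall(r'"([^"]*)"|“([^”]*)”', text)
    return sum(count_words(straight or curly) for straight, curly in spans)


def dialogue_ratio(text):
    """Fraction of all words that appear inside quoted dialogue."""
    total = count_words(text)
    return dialogue_words(text) / total if total else 0.0


# Reproduce the reported figures from the counts in the evaluation details.
WORDS_TOTAL = 206
WORDS_DIALOGUE = 20
TARGET_RATIO = 0.10  # the scenario asks for 10% dialogue

reported_ratio = WORDS_DIALOGUE / WORDS_TOTAL
deviation = abs(TARGET_RATIO - reported_ratio)

print(f"ratio: {reported_ratio:.2%}, deviation: {deviation:.2%}")

# Small illustrative sample (not taken from the run).
sample = 'Molly grinned. "We sneak in," she said. "Then we run."'
print(count_words(sample), dialogue_words(sample), dialogue_ratio(sample))
```

Counting by quotation marks is a simplification: it would miscount unbalanced quotes or quoted words that are not speech.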