NC Bench
Run: 2026-02-16-01KHKAQ6Y31CR1JCSHMVTFPKGF
Run Details
Model:
GPT-5 Mini
Test:
Dialogue tags
Scenario:
Write 200 words with 90% dialogue
Duration:
45.6s
Cost:
$0.008726
Provider:
OpenAI
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
"Bill, tonight's the cake, the clowns, and the chaos. We go in, pockets out, mission: every present." "Molly, I have a plan involving a balloon, a distraction, and your ability to look like a bewildered aunt." "Bewildered aunt is my brand. But the toddler with sticky fingers—he terrifies me. What's the exit?" "Window by the cupcakes, sash locked, opens with a cheese grater and the confidence of a magician." "You intimidate with tools, I'll charm the guests. You hum 'Happy Birthday' poorly; it creates a lull." "And the cake?" "Cake is diversionary. We stage a tragic but dignified frosting mishap; crumbs in the hallway buy us time." "Crumbs bring the dog and the chaos. Dogs are unpredictable; we need disguises." "I can wear a tutu and a party hat. You can be my balloon anchoring assistant and support." "Balloon anchoring assistant is a promotion. Do we take the giant stuffed unicorn?" "Only if it folds into a backpack. Otherwise it's emotional baggage." "Do we have an extraction route to the minivan?" "Yes. Driveaway code is 'piñata.' I brought confetti for cover." They tiptoed past paper crowns, the glitter of kleptomania in their eyes, van idling, laughter rehearsed and ready and hungry.
created_at
"2026-02-16T13:38:14.945029+00:00"
2/16/2026, 1:38:14 PM
model
"openai/gpt-5-mini-2025-08-07"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
4137
moderation_latency
(null)
generation_time
45355
tokens_prompt
77
tokens_completion
1280
native_tokens_prompt
66
native_tokens_completion
4355
native_tokens_completion_images
0
native_tokens_reasoning
4032
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.0087265
router
(null)
provider_responses
0
id
"resp_02d9b7acd589955d0169931d98985c8194ac3f5058614fef11"
status
200
is_byok
false
latency
149
endpoint_id
"c4f66d01-20b0-4c27-a225-438ea22fda43"
provider_name
"OpenAI"
model_permaslug
"openai/gpt-5-mini-2025-08-07"
api_type
"completions"
id
"gen-1771249048-ZPGfxLYGkVVsGWQILGS7"
upstream_id
"resp_02d9b7acd589955d0169931d98985c8194ac3f5058614fef11"
total_cost
0.0087265
cache_discount
(null)
upstream_inference_cost
0
provider_name
"OpenAI"
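The reported total_cost can be cross-checked against the native token counts listed above. The following is a minimal sketch, not the billing logic of any provider: the per-million-token rates are assumptions that do not appear on this page (they were chosen because they reproduce the reported figure), and `estimate_cost` is a hypothetical helper name.

```python
from decimal import Decimal

# ASSUMED rates in USD per one million tokens; not stated in this run report.
ASSUMED_INPUT_RATE = Decimal("0.25")
ASSUMED_OUTPUT_RATE = Decimal("2.00")


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> Decimal:
    """Estimate the cost in USD from native token counts (hypothetical helper)."""
    million = Decimal(1_000_000)
    total = prompt_tokens * ASSUMED_INPUT_RATE + completion_tokens * ASSUMED_OUTPUT_RATE
    return total / million


# Values copied from the metadata above:
# native_tokens_prompt = 66, native_tokens_completion = 4355,
# native_tokens_reasoning = 4032.
cost = estimate_cost(prompt_tokens=66, completion_tokens=4355)

# Completion tokens left over once reasoning tokens are subtracted.
visible_tokens = 4355 - 4032

print(cost)            # numerically equal to the reported total_cost, 0.0087265
print(visible_tokens)  # 323
```

Under these assumed rates, most of the cost comes from the 4032 reasoning tokens rather than from the roughly 200-word scene itself.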
Evaluation details
Result | Evaluator | Details | Meta Data
99.9840% | Matches word count | n/a | neededClean: false; words: 202
100.0000% | Dialogue to Total Word Ratio | Ratio: 90.10%, Deviation: 0.10% | neededClean: false; wordsTotal: 202; wordsDialogue: 182
99.9920%
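The dialogue ratio reported above follows from the two word counts: 182 dialogue words out of 202 total. The sketch below shows one way such a ratio could be computed. It is an assumption about the approach, not the evaluator's actual implementation: it treats any text between straight double quotes as dialogue and splits words on whitespace, and `dialogue_ratio` is a hypothetical name.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for the given text.

    Assumes dialogue is delimited by straight double quotes and that
    words are separated by whitespace.
    """
    total_words = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


# A short illustrative sample, not the full model response.
sample = '"Do we take the unicorn?" "Only if it folds." They tiptoed out.'
d, t, r = dialogue_ratio(sample)
print(d, t, r)  # 9 12 0.75

# Reproducing the reported figures from the counts in the table.
reported_ratio = round(100 * 182 / 202, 2)      # ratio in percent
reported_deviation = round(abs(reported_ratio - 90), 2)  # distance from the 90% target
print(reported_ratio, reported_deviation)
```

Curly quotes, nested quotations, or apostrophes used as quote marks would need extra handling that this sketch omits.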