NC Bench
Run: 2026-02-16-01KHKABSGE9ZTX1C8BSEJ9AHG1
Run Details
Model: GPT-5 Mini
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 59.9s
Cost: $0.011598
Provider: OpenAI
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly and Bill sat in the idling van with party balloons ticking against the roof like nervous confetti. Night varnished the cul-de-sac; the house across the street resembled a lit-up gingerbread diorama where an army of children had been briefly victorious. They had planned other capers—museums with velvet ropes, banks with snooty potted plants—but tonight felt absurd even to them, a candy-coated moral slip. They were not cruel; they were ridiculous. Molly checked a crumpled list and pretended to squint at the trashy superhero napkins as though they contained state secrets. Bill hummed tunelessly and practiced a solemn eyebrow raise he called his “gentle villain.” He smelled faintly of salted popcorn and ambition. Both of them could already see the confetti aftermath: sequined toppers, half-eaten cupcakes, the slow-motion betrayal of glitter. It was a job that only made sense on paper, and paper loved ridiculous plans. They exchanged glances, knowingly. Molly: "We go in, grab the presents, leave." Bill: "That's it?" Molly: "That's the gist. Add flair." Bill: "Flair how? Capes? Tiny top hats?" Molly: "Top hats. Also confetti cannons, but tasteful." Bill: "Tasteful confetti? Like the quiet kind?" Molly: "The kind that apologizes afterward." Bill: "We steal gifts and leave cupcakes?" Molly: "Cupcakes are negotiable. We'll take the sparkle, leave crumbs." Bill: "What about the birthday cake?" Molly: "Cake is an emotional hazard. We admire from a respectful distance." Bill: "And the piñata?" Molly: "Piñata's for the kids. We will donate our best analogies." Bill: "Geoffrey?" Molly: "Only on Tuesdays. He brings cake metaphors." Bill: "And the kids?" Molly: "Kids will think we're part of the show. We will be professional confusers." Bill: "Professional confusers? There's a union?" Molly: "No, but there should be. We get membership cards." Bill: "What about a getaway?" Molly: "We ride off on tiny tricycles." Bill: "Tiny tricycles? With streamers?" 
Molly: "Streamers, bells, maybe a kazoo." Bill: "A kazoo? Really?" Molly: "Imagine the confusion. It will be legendary." Bill: "Legendary confusion. I like the sound of that." Molly: "Also, optional glitter. Cautionary." Bill: "Glitter? You know that's permanent in our hearts?" Molly: "Hearts and eyebrows." Bill: "Final check. Commitment?" Molly: "Commitment. Costume?" Bill: "Clown-mime suit?" Molly: "Clown-mime suit. With sequins." Bill: "Sequins win." Molly: "Remember, we take only what's amusing." Bill: "Amusing how? By twinkle?" Molly: "By twinkle and narrative necessity." Bill: "And cupcakes? final call?" Molly: "Cupcakes are a scene stealer. Take one, leave."
created_at: "2026-02-16T13:32:14.63024+00:00"
model: "openai/gpt-5-mini-2025-08-07"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 2006
moderation_latency: (null)
generation_time: 59773
tokens_prompt: 77
tokens_completion: 1635
native_tokens_prompt: 66
native_tokens_completion: 5791
native_tokens_completion_images: 0
native_tokens_reasoning: 5056
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.0115985
router: (null)
provider_responses: 0
id: "resp_0fd95edd90625bd20169931c22648c8195b00f58bb08934cd4"
status: 200
is_byok: false
latency: 112
endpoint_id: "c4f66d01-20b0-4c27-a225-438ea22fda43"
provider_name: "OpenAI"
model_permaslug: "openai/gpt-5-mini-2025-08-07"
api_type: "completions"
id: "gen-1771248674-vkBZW5dG2YqvLYI9jpn9"
upstream_id: "resp_0fd95edd90625bd20169931c22648c8195b00f58bb08934cd4"
total_cost: 0.0115985
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
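The token and cost fields above can be cross-checked with a little arithmetic. The following is a minimal sketch, not the benchmark's own accounting: it assumes `generation_time` is in milliseconds (which agrees with the 59.9s duration shown in Run Details), and the per-million-token rates are an assumption that does not appear on this page. They are included only because they happen to reproduce the reported `usage` value.

```python
# Values copied from the run metadata above.
native_tokens_prompt = 66
native_tokens_completion = 5791
native_tokens_reasoning = 5056
generation_time_ms = 59773  # assumed to be milliseconds

# Completion tokens that were not spent on hidden reasoning.
visible_tokens = native_tokens_completion - native_tokens_reasoning

# Fraction of the completion budget spent on reasoning.
reasoning_share = native_tokens_reasoning / native_tokens_completion

# Overall generation throughput, reasoning tokens included.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

# ASSUMED rates in USD per million tokens; not stated anywhere in this report.
ASSUMED_INPUT_RATE = 0.25
ASSUMED_OUTPUT_RATE = 2.00
estimated_cost = (
    native_tokens_prompt * ASSUMED_INPUT_RATE
    + native_tokens_completion * ASSUMED_OUTPUT_RATE
) / 1_000_000

print(f"visible tokens:  {visible_tokens}")
print(f"reasoning share: {reasoning_share:.1%}")
print(f"tokens/second:   {tokens_per_second:.1f}")
print(f"estimated cost:  ${estimated_cost:.7f}")
```

Under these assumptions, roughly 87% of the billed completion tokens went to reasoning rather than to the scene itself, and the estimated cost agrees with the `usage` and `total_cost` fields.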
Evaluation details
Result: 0.0000%
Evaluator: Matches word count
Details: n/a
Meta Data: neededClean = false, words = 411

Result: 0.0000%
Evaluator: Dialogue to Total Word Ratio
Details: Ratio: 52.55%, Deviation: 17.45%
Meta Data: neededClean = false, wordsTotal = 411, wordsDialogue = 216
0.0000%
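The reported ratio and deviation follow directly from the two word counts: 216 / 411 is about 52.55%, which is 17.45 percentage points below the 70% target. The sketch below shows one way such a check might be computed. It is an illustration only; the evaluator's actual counting rules are not shown on this page, and the function names, the quote-matching pattern, and the whitespace-based word splitting are all assumptions.

```python
import re

# Assumption: dialogue is any text between straight or curly double quotes.
QUOTED = re.compile(r'"([^"]*)"|“([^”]*)”')


def count_words(text: str) -> int:
    """Count whitespace-separated tokens."""
    return len(text.split())


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue words, total words, dialogue / total)."""
    total = count_words(text)
    spoken = sum(count_words(a or b) for a, b in QUOTED.findall(text))
    ratio = spoken / total if total else 0.0
    return spoken, total, ratio


def deviation_from_target(ratio: float, target: float = 0.70) -> float:
    """Absolute distance between the measured ratio and the target ratio."""
    return abs(ratio - target)


# Reproduce the figures reported above from the published counts.
reported_ratio = 216 / 411
print(f"Ratio: {reported_ratio:.2%}")
print(f"Deviation: {deviation_from_target(reported_ratio):.2%}")

# Small usage example on two lines taken from the scene.
sample = 'Molly: "We go in, grab the presents, leave." Bill: "That\'s it?"'
print(dialogue_ratio(sample))
```

Note that in this sketch the speaker labels ("Molly:", "Bill:") count toward the total but not toward dialogue, so a script-style layout lowers the measured ratio. Whether the real evaluator treats them the same way is not known from this page.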