NC Bench
Run: 2026-02-16-01KHKB0RAYDDFYYJBG9NG2XY5J
Run Details
Model:
GPT-5
Test:
Dialogue tags
Scenario:
Write 200 words with 50% dialogue
Duration:
36.2s
Cost:
$0.025392
Provider:
OpenAI
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Behind the bouncy castle, confetti crunching underfoot, Molly and Bill crouched beneath a sagging banner that declared HAPPY SIXTH, JASPER! Sugar hung in the air like static; a piñata unicorn swayed, judgmental and sticky with fingerprints. Bill unfolded a crayon-drawn map he’d “liberated” from the craft table, complete with glitter arrows and a bossy sun. Somewhere inside, a magician rehearsed; doves cooed with the menace of tiny landlords. Party horns wheezed like guilty accordions. They watched the present mountain, a shimmering volcano of dinosaurs, puzzles, and parental optimism, guarded only by a sleepy clown and a punch bowl with secrets. "Molly: Recon complete; the clown blinks every twelve seconds exactly." "Bill: Perfect. We'll glide between blinks like considerate, gift-focused ghosts." "Molly: Disguise options? I can balloon animal myself into credibility." "Bill: I pilfered a volunteer lanyard and an authoritative kazoo." "Molly: Entry route uses cake drift, then present summit basecamp." "Bill: I'll run distraction, threatening avant-garde face-paint interpretations of grandparents." "Molly: Remember, we honor birthday law: no card left unread." "Bill: Absolutely. We steal responsibly; chaos, yes—sentiment, properly archived afterward." "Molly: Contingency plan if magician turns us into instructive rabbits?" "Bill: We escape in the unicorn, plead equine diplomatic immunity."
created_at
"2026-02-16T13:43:18.626848+00:00"
model
"openai/gpt-5-2025-08-07"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
6891
moderation_latency
(null)
generation_time
35760
tokens_prompt
77
tokens_completion
987
native_tokens_prompt
66
native_tokens_completion
2531
native_tokens_completion_images
0
native_tokens_reasoning
2176
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.0253925
router
(null)
provider_responses
0
id
"resp_0fbeba33e7fb9a580169931ed158148193868d1d494428c225"
status
200
is_byok
false
latency
167
endpoint_id
"7c2f859a-7890-4e8e-b1de-1cd1c0a800b4"
provider_name
"OpenAI"
model_permaslug
"openai/gpt-5-2025-08-07"
api_type
"completions"
id
"gen-1771249361-AHOq5jHXeK97F5kkj5cg"
upstream_id
"resp_0fbeba33e7fb9a580169931ed158148193868d1d494428c225"
total_cost
0.0253925
cache_discount
(null)
upstream_inference_cost
0
provider_name
"OpenAI"
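The reported usage figure can be cross-checked against the native token counts. The sketch below is only an illustration: the per-million-token rates are assumptions (they are not stated anywhere on this page), and it also assumes that reasoning tokens are counted inside native_tokens_completion. The helper names are hypothetical.

```python
from decimal import Decimal

# ASSUMED prices in USD per one million tokens; not taken from this page.
ASSUMED_INPUT_RATE_PER_M = Decimal("1.25")
ASSUMED_OUTPUT_RATE_PER_M = Decimal("10")


def estimate_cost(prompt_tokens, completion_tokens,
                  input_rate_per_m=ASSUMED_INPUT_RATE_PER_M,
                  output_rate_per_m=ASSUMED_OUTPUT_RATE_PER_M):
    """Return an estimated cost in USD from native token counts."""
    million = Decimal(1_000_000)
    total = (Decimal(prompt_tokens) * input_rate_per_m
             + Decimal(completion_tokens) * output_rate_per_m)
    return total / million


def reasoning_share(reasoning_tokens, completion_tokens):
    """Fraction of completion tokens that were spent on reasoning."""
    return reasoning_tokens / completion_tokens


# Values copied from the metadata above.
native_prompt = 66
native_completion = 2531
native_reasoning = 2176

cost = estimate_cost(native_prompt, native_completion)
visible_tokens = native_completion - native_reasoning

print(f"estimated cost: ${cost}")
print(f"visible (non-reasoning) completion tokens: {visible_tokens}")
print(f"reasoning share: {reasoning_share(native_reasoning, native_completion):.1%}")
```

Under these assumed rates the estimate equals the reported usage of 0.0253925, which suggests that billing follows the native token counts rather than the normalized tokens_prompt and tokens_completion values.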
Evaluation details
Result | Evaluator | Details | Meta Data
99.3769% | Matches word count | n/a | neededClean: false; words: 205
99.9779% | Dialogue to Total Word Ratio | Ratio: 51.22%, Deviation: 1.22% | neededClean: false; wordsTotal: 205; wordsDialogue: 105
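The ratio and deviation reported for the dialogue evaluator follow directly from the two word counts. The sketch below reproduces that arithmetic and adds a simple quote-based word counter. The counter is an assumption for illustration: the page does not document how the evaluator tokenizes words or detects dialogue, so a plain whitespace split may not reproduce the reported counts. All function names are hypothetical.

```python
import re


def count_words(text):
    """Count whitespace-separated words (assumed tokenization)."""
    return len(text.split())


def dialogue_words(text):
    """Count words inside straight double quotes (assumed dialogue rule)."""
    spans = re.findall(r'"([^"]*)"', text)
    return sum(len(span.split()) for span in spans)


def ratio_and_deviation(words_dialogue, words_total, target_pct=50.0):
    """Return (dialogue ratio in percent, absolute deviation from target)."""
    ratio_pct = 100.0 * words_dialogue / words_total
    return ratio_pct, abs(ratio_pct - target_pct)


# Counts taken from the evaluation metadata above.
ratio, deviation = ratio_and_deviation(105, 205)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")

# Small made-up sample to show the counters in isolation.
sample = 'He nodded. "Go now," she said.'
print(dialogue_words(sample), count_words(sample))
```

With 105 dialogue words out of 205, the ratio is 51.22% and the deviation from the 50% target is 1.22 percentage points, matching the Details column. How that deviation maps to the 99.9779% score is not stated on the page.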
99.6774%