NC Bench
Run: 2026-02-16-01KHK9ZP5CVXEX6P4RJN065Q8P
Run Details
Model:
GPT-5.2
Test:
Dialogue tags
Scenario:
Write 200 words with 90% dialogue
Duration:
58.9s
Cost:
$0.044846
Provider:
OpenAI
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Under the rental clown wig, Molly checked the time. Bill adjusted his cardboard crown. "Bill, seven-year-old party, noon. Noise cover: squeals." "Why are we early?" "So the parents trust us before the sugar riot." "I’m dressed as a princess. I feel criminally moisturized." "Good. Pirates get punched; princesses get fed." "Objective?" "All presents. We sweep the gift pile into this sack labeled 'donations.'" "They’ll notice." "Not if you shout, 'Charity for orphans!' and look damp-eyed." "What about the birthday kid?" "We compliment his handwriting. Kids collapse into pride." "Security?" "One golden retriever named Captain Sprinkles." "I hate dogs." "You hate consequences. Give him a cupcake; he’ll sell us the code." "There’s a code?" "Three knocks on the bounce house zipper." "And entry?" "Through the bounce house, out by the juice, past the glitter station." "Exit route?" "During 'Happy Birthday,' we join the conga line, drift to the door, and vanish." "With twelve boxes?" "Stack them on your skirt. It’s basically a wheeled cart." "Physics disagrees." "Party physics. Cake warps reality." "If a kid opens a box and finds my leg?" "Smile. Say it’s a limited-edition doll accessory, then curtsy and sprint." "Remember, Bill: glitter wipes fingerprints, alibis, and dignity too."
created_at
"2026-02-16T13:25:36.900893+00:00"
2/16/2026, 1:25:36 PM
model
"openai/gpt-5.2-20251211"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
5744
moderation_latency
(null)
generation_time
57999
tokens_prompt
77
tokens_completion
966
native_tokens_prompt
66
native_tokens_completion
3195
native_tokens_completion_images
0
native_tokens_reasoning
2883
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.0448455
router
(null)
provider_responses
0
id
"resp_05e4bc52db9e6e1d0169931a95c334819783faf6a30aa167f8"
status
200
is_byok
false
latency
178
endpoint_id
"f00142c2-6a93-49ce-9e36-5593b904ce3b"
provider_name
"OpenAI"
model_permaslug
"openai/gpt-5.2-20251211"
api_type
"completions"
id
"gen-1771248277-QW0wcGMIqxtqqYpA4yLv"
upstream_id
"resp_05e4bc52db9e6e1d0169931a95c334819783faf6a30aa167f8"
total_cost
0.0448455
cache_discount
(null)
upstream_inference_cost
0
provider_name
"OpenAI"
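The token and timing fields above can be combined into a few derived figures. The following is a minimal sketch, not part of the benchmark itself: it assumes that `native_tokens_completion` includes the reasoning tokens and that `generation_time` is in milliseconds (consistent with the 58.9s duration reported for the run). The function name `run_metrics` is hypothetical.

```python
def run_metrics(meta: dict) -> dict:
    """Derive simple figures from the run metadata.

    Assumptions (not confirmed by the source):
    - native_tokens_completion already includes native_tokens_reasoning
    - generation_time is expressed in milliseconds
    """
    completion = meta["native_tokens_completion"]
    reasoning = meta["native_tokens_reasoning"]
    seconds = meta["generation_time"] / 1000
    return {
        # Tokens left over for the visible answer
        "visible_tokens": completion - reasoning,
        # Fraction of completion tokens spent on reasoning
        "reasoning_share": reasoning / completion,
        # Overall generation throughput
        "tokens_per_second": completion / seconds,
    }


# Values copied from the metadata listed above
meta = {
    "native_tokens_completion": 3195,
    "native_tokens_reasoning": 2883,
    "generation_time": 57999,
}
metrics = run_metrics(meta)
print(metrics["visible_tokens"])  # 312
```

Under these assumptions, roughly nine tenths of the completion tokens went to reasoning rather than to the 200-word scene itself.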
Evaluation details
Result | Evaluator | Details | Meta Data
99.7443% | Matches word count | n/a | neededClean: false; words: 204
99.0360% | Dialogue to Total Word Ratio | Ratio: 93.14%, Deviation: 3.14% | neededClean: false; wordsTotal: 204; wordsDialogue: 190
99.3901%
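The ratio and deviation reported by the Dialogue to Total Word Ratio evaluator can be reproduced from its word counts. The sketch below is an illustration under stated assumptions, not the evaluator's actual code: it treats any text between straight double quotes as dialogue, splits words on whitespace, and uses the 90% target from the scenario. The function name `dialogue_stats` is hypothetical, and the mapping from deviation to the percentage score is not shown because the source does not describe it.

```python
import re


def dialogue_stats(text: str, target_ratio: float = 0.90) -> dict:
    """Count whitespace-separated words inside straight double quotes.

    Assumption: dialogue is delimited only by straight double quotes (").
    """
    words_total = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    words_dialogue = sum(len(span.split()) for span in quoted_spans)
    ratio = words_dialogue / words_total if words_total else 0.0
    return {
        "wordsTotal": words_total,
        "wordsDialogue": words_dialogue,
        "ratio": ratio,
        "deviation": abs(ratio - target_ratio),
    }


# Small illustrative sample, not the full model response
sample = 'Molly checked the time. "Why are we early?" "So the parents trust us."'
stats = dialogue_stats(sample)

# Reproducing the reported figures from the evaluator's own counts
reported_ratio = 190 / 204
print(f"{reported_ratio:.2%}")         # 93.14%
print(f"{reported_ratio - 0.90:.2%}")  # 3.14%
```

A different tokenization (for example, treating hyphenated words or curly quotes differently) would give slightly different counts, so the 190 and 204 figures are taken as reported rather than recomputed.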