NC Bench
Run: 2026-02-16-01KHK9MGVA40Z429X3165103BJ
Run Details
Model: GPT-5.2
Test: Dialogue tags
Scenario: Write 200 words with 50% dialogue
Duration: 45.8s
Cost: $0.030635
Provider: OpenAI
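The reported cost can be sanity-checked against the native token counts recorded in the response metadata further down (66 prompt tokens, 2180 completion tokens, of which 1896 are reasoning tokens). The sketch below is an illustration only: the per-million-token rates are an assumption that does not appear anywhere on this page, and the function name `estimate_cost` is hypothetical. Under those assumed rates, and assuming reasoning tokens are billed as ordinary completion tokens, the arithmetic reproduces the `usage` figure of 0.0306355.

```python
from decimal import Decimal

# Values copied from the response metadata of this run.
NATIVE_TOKENS_PROMPT = 66
NATIVE_TOKENS_COMPLETION = 2180
NATIVE_TOKENS_REASONING = 1896

# ASSUMPTION: USD rates per million tokens. These are not stated on the page;
# they are chosen here because they reproduce the reported cost.
INPUT_USD_PER_MTOK = Decimal("1.75")
OUTPUT_USD_PER_MTOK = Decimal("14")


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> Decimal:
    """Estimate the run cost in USD from native token counts (hypothetical helper)."""
    million = Decimal(1_000_000)
    return (
        prompt_tokens * INPUT_USD_PER_MTOK
        + completion_tokens * OUTPUT_USD_PER_MTOK
    ) / million


estimated = estimate_cost(NATIVE_TOKENS_PROMPT, NATIVE_TOKENS_COMPLETION)

# Completion tokens that are not reasoning, i.e. the visible answer.
visible_tokens = NATIVE_TOKENS_COMPLETION - NATIVE_TOKENS_REASONING

print(f"estimated cost: ${estimated}")
print(f"visible completion tokens: {visible_tokens}")
```

Most of the completion budget in this run went to reasoning rather than to the roughly 200-word scene itself, which is why the cost is dominated by the completion side of the sum.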
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
AI
The van idled beside a row of pastel balloons tied to Mrs. Peabody’s fence. Through the windshield, Molly watched tiny pirates waddle in plastic swords, while Bill adjusted his black gloves like they were formalwear. Their bag lay open, hungry as a duffel can be, beside a clipboard labeled “FUN.” The air smelled of frosting and impending felony. A karaoke machine squealed inside; every off-key note felt like cover. Bill slid the panda mascot head onto his lap, its dead stitched eyes judging them both. Molly clicked a timer, because punctuality was her only virtue. She opened the sliding door. “Bill, this party’s a gold mine of wrapped rectangles,” Molly said. “Children’s rectangles,” Bill whispered. “Still rectangles.” “We slip in as entertainers. I’ll juggle, you’ll mime.” “I can’t mime; my face leaks guilt.” “Wear the panda head. No one suspects pandas.” “And the presents?” “We wait for ‘Happy Birthday.’ Everyone sings, nobody watches hands.” “So I grab the pile?” “Only the good ones: the heavy boxes and anything that hums.” “What if a kid tackles me?” “Give them a sticker and run.” “And if parents chase?” “We vanish into the bouncy castle. It’s basically a foam bank vault, at once.”
created_at
"2026-02-16T13:19:18.340243+00:00"
model
"openai/gpt-5.2-20251211"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
33598
moderation_latency
(null)
generation_time
45645
tokens_prompt
77
tokens_completion
790
native_tokens_prompt
66
native_tokens_completion
2180
native_tokens_completion_images
0
native_tokens_reasoning
1896
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.0306355
router
(null)
provider_responses
0
id
"resp_0a1a78eb1c22e93d0169931927db088193959f0d0146729645"
status
200
is_byok
false
latency
164
endpoint_id
"f00142c2-6a93-49ce-9e36-5593b904ce3b"
provider_name
"OpenAI"
model_permaslug
"openai/gpt-5.2-20251211"
api_type
"completions"
id
"gen-1771247911-L6ElEpHZ0b3x7Rt3iWST"
upstream_id
"resp_0a1a78eb1c22e93d0169931927db088193959f0d0146729645"
total_cost
0.0306355
cache_discount
(null)
upstream_inference_cost
0
provider_name
"OpenAI"
Evaluation details
Result | Evaluator | Details | Meta Data
99.9990% | Matches word count | n/a | neededClean: false; words: 201
99.9081% | Dialogue to Total Word Ratio | Ratio: 48.26%, Deviation: 1.74% | neededClean: false; wordsTotal: 201; wordsDialogue: 97
99.9536%
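The ratio and deviation figures above follow directly from the two word counts: 97 dialogue words out of 201 total, measured against the 50% target in the scenario. The sketch below shows one way such a check could be computed. It is not the benchmark's actual evaluator: the function names are hypothetical, and it assumes dialogue is whatever sits between double quotation marks and that words are whitespace-separated tokens. How the percentage scores in the Result column are derived from the deviation is not shown on this page, so the sketch does not attempt it.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for a prose passage.

    ASSUMPTION: dialogue is any span enclosed in curly or straight double
    quotes, and a word is any whitespace-separated token.
    """
    spans = re.findall(r'[“"]([^”"]*)[”"]', text)
    dialogue_words = sum(len(span.split()) for span in spans)
    total_words = len(text.split())
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


def deviation_from_target(ratio: float, target: float = 0.5) -> float:
    """Absolute distance between the measured ratio and the target ratio."""
    return abs(ratio - target)


# Small illustrative passage: 4 quoted words out of 8 total.
sample = "“Wear the panda head,” Molly said. Bill sighed."
sample_counts = dialogue_ratio(sample)

# Reproduce the reported figures from the counts in the table.
reported_ratio = 97 / 201
ratio_pct = round(reported_ratio * 100, 2)
deviation_pct = round(deviation_from_target(reported_ratio) * 100, 2)

print(sample_counts)
print(f"Ratio: {ratio_pct}%, Deviation: {deviation_pct}%")
```

A quote-based count like this treats speaker tags such as "Molly said" as narration, which is one reason a scene can look dialogue-heavy and still land slightly under the 50% target.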