NC Bench
Run: 2026-02-16-01KHKAZAY7QVFYGECZ19S5Q729
Run Details
Model: GPT-5.2
Test: Dialogue tags
Scenario: Write 200 words with 10% dialogue
Duration: 27.3s
Cost: $0.021857
Provider: OpenAI
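The cost figure can be cross-checked against the native token counts recorded in this run's metadata (native_tokens_prompt 66, native_tokens_completion 1553, total_cost 0.0218575). The sketch below is only an illustration: the per-million-token rates are assumptions, not values stated on this page, and `estimate_cost` is a hypothetical helper.

```python
# Values copied from this run's metadata fields.
NATIVE_TOKENS_PROMPT = 66
NATIVE_TOKENS_COMPLETION = 1553
REPORTED_TOTAL_COST = 0.0218575

# ASSUMPTION: USD rates per million tokens. These are not stated in the run
# record; they are chosen because they reproduce the reported total.
ASSUMED_INPUT_RATE_PER_M = 1.75
ASSUMED_OUTPUT_RATE_PER_M = 14.00


def estimate_cost(prompt_tokens, completion_tokens, input_rate_per_m, output_rate_per_m):
    """Return the estimated request cost in USD (hypothetical helper)."""
    per_million = prompt_tokens * input_rate_per_m + completion_tokens * output_rate_per_m
    return per_million / 1_000_000


estimated = estimate_cost(
    NATIVE_TOKENS_PROMPT,
    NATIVE_TOKENS_COMPLETION,
    ASSUMED_INPUT_RATE_PER_M,
    ASSUMED_OUTPUT_RATE_PER_M,
)
print(f"estimated: ${estimated:.7f}  reported: ${REPORTED_TOTAL_COST:.7f}")
```

Under these assumed rates the estimate agrees with the reported total only when all 1553 native completion tokens are charged at the output rate, which would mean the 1277 reasoning tokens are billed as completion tokens.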
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 10% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly crouched behind the hedges of Number Seven, where a unicorn piñata guarded the porch like a drunk bouncer. Through the window, shrieks of joy ricocheted off paper streamers, and the smell of vanilla cake drifted out with the menace of nostalgia. Bill checked his toolkit: lockpicks, duct tape, and an emergency party hat for blending in. He looked offended by the glitter stuck to his sleeve, as if fashion had mugged him. They studied the backyard map Molly had sketched on a napkin from a diner. X marks meant “presents,” circles meant “parents,” and the skull-and-crossbones meant “clown.” A hired magician practiced in the driveway, failing to make a balloon dog vanish. Good. Distraction was already on payroll. Molly timed the chant of “Happy Birthday” like it was a bank alarm cycle. Bill rehearsed his innocent face, which resembled a guilty shovel. The plan was simple: slip in during the song, lift the gift mountain, and slide out before anyone noticed the missing mountain. It would have worked in any grown-up crime. “In out and we leave the pony.” “No pony. Just gifts. Also avoid balloons; they squeak like cops all night.” They tightened their shoelaces and rehearsed giggles.
created_at: "2026-02-16T13:42:22.647309+00:00" (2/16/2026, 1:42:22 PM)
model: "openai/gpt-5.2-20251211"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 10703
moderation_latency: (null)
generation_time: 25985
tokens_prompt: 77
tokens_completion: 933
native_tokens_prompt: 66
native_tokens_completion: 1553
native_tokens_completion_images: 0
native_tokens_reasoning: 1277
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.0218575
router: (null)
provider_responses
0
id: "resp_06f6765192d861560169931ea2debc8196a65ce862c3abb246"
status: 200
is_byok: false
latency: 196
endpoint_id: "f00142c2-6a93-49ce-9e36-5593b904ce3b"
provider_name: "OpenAI"
model_permaslug: "openai/gpt-5.2-20251211"
api_type: "completions"
id: "gen-1771249314-hiXLpdtulS0xPF7Cob3N"
upstream_id: "resp_06f6765192d861560169931ea2debc8196a65ce862c3abb246"
total_cost: 0.0218575
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
Evaluation details
Result | Evaluator | Details | Meta Data
99.9190% | Matches word count | n/a | neededClean: false; words: 203
99.7131% | Dialogue to Total Word Ratio | Ratio: 12.32%, Deviation: 2.32% | neededClean: false; wordsTotal: 203; wordsDialogue: 25
99.8160%
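The ratio and deviation reported by the "Dialogue to Total Word Ratio" evaluator follow from the wordsDialogue and wordsTotal counts. The sketch below reproduces that arithmetic. The function names, the whitespace-based word splitting, and the curly-quote matching are assumptions made for illustration; the evaluator's actual implementation and its scoring formula are not shown on this page.

```python
import re


def count_words(text):
    """Count whitespace-separated words (assumed tokenisation)."""
    return len(text.split())


def count_quoted_words(text):
    """Count words inside curly double quotes (assumed dialogue heuristic)."""
    quoted_spans = re.findall(r"“([^”]*)”", text)
    return sum(len(span.split()) for span in quoted_spans)


def dialogue_ratio(words_dialogue, words_total, target=0.10):
    """Return (ratio, absolute deviation from the target share)."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


# Counts taken from the evaluator's meta data for this run.
ratio, deviation = dialogue_ratio(words_dialogue=25, words_total=203)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 12.32%, Deviation: 2.32%"

# Small usage example of the counting helpers on a made-up sentence.
sample = "Bill sighed. “No pony. Just gifts.” Molly nodded."
print(count_quoted_words(sample), count_words(sample))
```

Counting every quoted span in the response gives 20 words for the two spoken lines plus 5 for the quoted labels (“presents,” “parents,” “clown.” and “Happy Birthday”), which matches the reported wordsDialogue of 25. That suggests, though the page does not confirm it, that the evaluator counts all quoted text rather than spoken dialogue only.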