NC Bench
Run: 2026-02-16-01KHKB272D0JEC2FSA0Z07PE1F
Run Details
Model:
GPT-5.2
Test:
Dialogue tags
Scenario:
Write 200 words with 10% dialogue
Duration:
24.0s
Cost:
$0.020401
Provider:
OpenAI
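The reported cost can be cross-checked against the native token counts recorded in this run's metadata (66 prompt tokens, 1449 completion tokens, of which 1173 are reasoning tokens). The sketch below is only an illustration. The per-million-token rates are assumptions, not values stated on this page, and the idea that reasoning tokens are counted inside the completion total is likewise an assumption. The helper names are hypothetical.

```python
# Token counts copied from this run's metadata.
NATIVE_PROMPT_TOKENS = 66
NATIVE_COMPLETION_TOKENS = 1449
NATIVE_REASONING_TOKENS = 1173

# ASSUMED rates in USD per million tokens; the page does not state them.
ASSUMED_INPUT_USD_PER_M = 1.75
ASSUMED_OUTPUT_USD_PER_M = 14.00


def estimate_cost(prompt_tokens: int, completion_tokens: int,
                  input_rate: float, output_rate: float) -> float:
    """Estimate cost in USD from token counts and per-million-token rates."""
    micro_usd = prompt_tokens * input_rate + completion_tokens * output_rate
    return micro_usd / 1_000_000


def visible_completion_tokens(completion_tokens: int, reasoning_tokens: int) -> int:
    """Tokens left for the visible answer, ASSUMING reasoning is part of the total."""
    return completion_tokens - reasoning_tokens


cost = estimate_cost(NATIVE_PROMPT_TOKENS, NATIVE_COMPLETION_TOKENS,
                     ASSUMED_INPUT_USD_PER_M, ASSUMED_OUTPUT_USD_PER_M)
visible = visible_completion_tokens(NATIVE_COMPLETION_TOKENS, NATIVE_REASONING_TOKENS)
print(f"estimated cost: ${cost:.7f}")
print(f"visible completion tokens: {visible}")
```

Under these assumed rates the estimate agrees with the `usage` value of 0.0204015 in the metadata, which suggests, but does not prove, that the whole completion total including reasoning is billed at the output rate.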
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 10% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly crouched behind the inflatable unicorn, watching the backyard fill with shrieking eight‑year‑olds and responsible-looking parents clutching paper plates. Bill adjusted his black beanie, which in daylight made him resemble a worried barista. The birthday banner sagged like a tired alibi, and a mountain of wrapped boxes waited on a picnic table, glittering with stickers and small, hopeful bows. "Molly, this is low." Bill’s voice was nearly swallowed by a piñata being beaten like a confession. Molly checked her clipboard—because no one questions a clipboard—and nodded toward the side gate. "Kids never lock their loot." They slid along the hedge, passing a cooler of juice boxes. Bill’s stomach growled treasonously, so Molly fed it a stolen grape, then produced a roll of gift tags and a marker. If they walked out carrying presents labeled “From Grandma,” nobody would dare stop them; grandparents were society’s magic invisibility cloak. "We borrow gifts, then vanish." She scribbled briskly. Bill eyed the cake with strategic longing. "And eat cake?" "Only for courage." The doorbell rang inside. That was their cue: slip in as entertainers, exit as legends, and leave the unicorn to take the blame. Bill made animals; Molly practiced smiling like a teacher.
created_at
"2026-02-16T13:43:53.658101+00:00"
2/16/2026, 1:43:53 PM
model
"openai/gpt-5.2-20251211"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
21592
moderation_latency
(null)
generation_time
23904
tokens_prompt
77
tokens_completion
338
native_tokens_prompt
66
native_tokens_completion
1449
native_tokens_completion_images
0
native_tokens_reasoning
1173
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.0204015
router
(null)
provider_responses
0
id
"resp_0e91351e311755400169931f0133e08194baf610d2bc233464"
status
200
is_byok
false
latency
169
endpoint_id
"f00142c2-6a93-49ce-9e36-5593b904ce3b"
provider_name
"OpenAI"
model_permaslug
"openai/gpt-5.2-20251211"
api_type
"completions"
id
"gen-1771249409-cZiJGrkgwpXOrjFUZh8H"
upstream_id
"resp_0e91351e311755400169931f0133e08194baf610d2bc233464"
total_cost
0.0204015
cache_discount
(null)
upstream_inference_cost
0
provider_name
"OpenAI"
Evaluation details
Result | Evaluator | Details | Meta Data
99.9190% | Matches word count | n/a | neededClean: false; words: 203
99.9951% | Dialogue to Total Word Ratio | Ratio: 10.84%, Deviation: 0.84% | neededClean: false; wordsTotal: 203; wordsDialogue: 22
99.9571%
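The ratio and deviation reported for the dialogue evaluator follow from the two word counts: 22 dialogue words out of 203 total, measured against the 10% target from the scenario. The sketch below reproduces that arithmetic and shows one plausible way to count quoted words. The counting rules (whitespace-separated words, text inside straight or curly double quotes treated as dialogue) are assumptions. The benchmark's actual evaluator is not shown on this page, and the function names are hypothetical.

```python
import re

# ASSUMPTION: dialogue is any text between straight or curly double quotes.
QUOTED = re.compile(r'"([^"]*)"|\u201c([^\u201d]*)\u201d')


def count_words(text: str) -> int:
    """Count whitespace-separated words (an assumed definition of 'word')."""
    return len(text.split())


def count_dialogue_words(text: str) -> int:
    """Count words that appear inside quotation marks."""
    total = 0
    for match in QUOTED.finditer(text):
        span = match.group(1) if match.group(1) is not None else match.group(2)
        total += count_words(span)
    return total


def ratio_and_deviation(words_dialogue: int, words_total: int, target: float):
    """Return (dialogue ratio, absolute deviation from the target ratio)."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


# Counts taken from the evaluation details; 0.10 is the scenario's 10% target.
ratio, deviation = ratio_and_deviation(22, 203, 0.10)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# A small made-up sample to exercise the counting helpers.
sample = 'Bill sighed. "This is low." Molly nodded.'
print(count_words(sample), count_dialogue_words(sample))
```

How the evaluator turns a deviation into the percentage score is not stated here, so the sketch stops at the ratio and deviation.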