NC Bench
Run: 2026-02-16-01KHKA0TPHKG8GTAE6Y8KHP2WJ
Run Details
Model:
GPT-5.2
Test:
Dialogue tags
Scenario:
Write 200 words with 10% dialogue
Duration:
26.7s
Cost:
$0.020122
Provider:
OpenAI
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 10% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly crouched behind the hydrangeas of Mrs. Donnelly’s backyard, studying the battlefield of pastel balloons and sticky little footprints. A banner screamed HAPPY SEVEN, as if the yard itself were yelling. Bill adjusted his black beanie, which made him look less like a criminal mastermind and more like a nervous chess pawn. Their gear was modest: a canvas sack, two juice boxes for cover, and a roll of wrapping paper to disguise the sack as “extra decorations.” The smell of hot dogs drifted over, weaponizing nostalgia. Somewhere a kid tested a kazoo like an air raid siren. Molly drew a map in dirt with a plastic spoon. Presents sat in a glittering pile beside the picnic table, guarded by a sleeping golden retriever wearing a party hat. Bill swallowed, eyeing the dog’s teeth and the nearby sprinkler control. Bill’s watch beeped; he muted it with panic, pretended it was a hiccup. “We nab the presents, Bill, and leave the clown untouched.” “Fine, but if the piñata swings, I’m running first anyway.” On the count of the birthday song, they’d slip through the side gate, swap the sack for the pile, and vanish before anyone noticed the sudden absence of joy.
created_at
"2026-02-16T13:25:42.692046+00:00"
2/16/2026, 1:25:42 PM
model
"openai/gpt-5.2-20251211"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
5261
moderation_latency
(null)
generation_time
26481
tokens_prompt
77
tokens_completion
1017
native_tokens_prompt
66
native_tokens_completion
1429
native_tokens_completion_images
0
native_tokens_reasoning
1164
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.0201215
router
(null)
provider_responses
0
id
"resp_03c92c318a668eed0169931abb32b88195a0f9a039754f5708"
status
200
is_byok
false
latency
420
endpoint_id
"f00142c2-6a93-49ce-9e36-5593b904ce3b"
provider_name
"OpenAI"
model_permaslug
"openai/gpt-5.2-20251211"
api_type
"completions"
id
"gen-1771248315-hYhOjWhDT6HGwM52KkOD"
upstream_id
"resp_03c92c318a668eed0169931abb32b88195a0f9a039754f5708"
total_cost
0.0201215
cache_discount
(null)
upstream_inference_cost
0
provider_name
"OpenAI"
Evaluation details
Result | Evaluator | Details | Meta Data
100.0000% | Matches word count | n/a | neededClean: false; words: 200
99.9900% | Dialogue to Total Word Ratio | Ratio: 11.00%, Deviation: 1.00% | neededClean: false; wordsTotal: 200; wordsDialogue: 22
99.9950%
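The dialogue ratio reported above (22 dialogue words out of 200, or 11.00%, against a 10% target) can be reproduced with a short sketch. This is not the NC Bench evaluator. It is a minimal illustration that assumes words are whitespace-separated tokens and that dialogue is any text enclosed in straight or curly double quotes. The names `dialogue_stats` and `target_ratio` are hypothetical. How the deviation maps to the percentage score is not stated on this page, so the sketch stops at the deviation.

```python
import re

# Assumption: dialogue is any span wrapped in straight (") or curly (“ ”) double quotes.
QUOTED = re.compile(r'["“]([^"“”]*)["”]')


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (an assumed definition of a word)."""
    return len(text.split())


def dialogue_stats(scene: str, target_ratio: float = 0.10) -> dict:
    """Return total words, quoted words, their ratio, and deviation from the target."""
    words_total = count_words(scene)
    words_dialogue = sum(count_words(span) for span in QUOTED.findall(scene))
    ratio = words_dialogue / words_total if words_total else 0.0
    return {
        "wordsTotal": words_total,
        "wordsDialogue": words_dialogue,
        "ratio": ratio,
        "deviation": abs(ratio - target_ratio),
    }


if __name__ == "__main__":
    sample = "She nodded “Go” and then they ran very far away"
    print(dialogue_stats(sample))
```

Under this definition every quoted span counts as dialogue, including quoted phrases inside narration. That reading is consistent with the reported count: the two spoken lines in the response contain 10 words each, and the quoted phrase “extra decorations.” supplies the remaining 2 of the 22.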