NC Bench
Run: 2026-02-16-01KHK9ARB2SA3YNZEV1K2QPW1F
Run Details
Model: GPT-5.2
Test: Dialogue tags
Scenario: Write 200 words with 10% dialogue
Duration: 25.6s
Cost: $0.020793
Provider: OpenAI
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 10% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
AI
Molly crouched behind the bouncy castle, wearing a visor that said FUN POLICE. Bill lay beside her, disguised as a deflated piñata, trying not to crinkle. Ahead, a suburban lawn exploded with streamers and shrieks; somewhere, a dad operated the bubble machine like it was artillery. Between the games and the frosting, the presents sat in a glittery mountain, wrapped in dinosaurs and princesses, practically begging for a second childhood. Molly consulted a clipboard she’d stolen from an actual party planner. She’d circled “Gift Table” three times and added a skull with party hats. Bill sniffed. The smell of pizza made his stomach whistle like a cheap kazoo. "In and out, Bill, cake distraction, then we bag the loot." Bill’s paper-mâché head tilted toward the magician, who was sawing a teddy bear in half. "And if a clown notices, we blend in as balloons." They watched the birthday boy sprint past, crowned and sticky, followed by a gaggle of sugar-fueled guards. Molly adjusted her sticker badge and palmed a roll of tape. Bill loosened his sack. The heist would be adorable. At the whistle for musical chairs, they slipped forward, grinning, and the gift mountain began to shrink quietly tonight.
created_at: "2026-02-16T13:13:37.924555+00:00" (2/16/2026, 1:13:37 PM)
model: "openai/gpt-5.2-20251211"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 23021
moderation_latency: (null)
generation_time: 25456
tokens_prompt: 77
tokens_completion: 315
native_tokens_prompt: 66
native_tokens_completion: 1477
native_tokens_completion_images: 0
native_tokens_reasoning: 1202
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.0207935
router: (null)
provider_responses:
  0:
    id: "resp_001e4e0926edaa3701699317e7e3f481949ca196581b029f55"
    status: 200
    is_byok: false
    latency: 1221
    endpoint_id: "f00142c2-6a93-49ce-9e36-5593b904ce3b"
    provider_name: "OpenAI"
    model_permaslug: "openai/gpt-5.2-20251211"
api_type: "completions"
id: "gen-1771247591-VEtOvRJz21qoHtP7WkUe"
upstream_id: "resp_001e4e0926edaa3701699317e7e3f481949ca196581b029f55"
total_cost: 0.0207935
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
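The token and cost fields in this metadata can be cross-checked against each other. The sketch below is only illustrative and rests on assumptions that this page does not state: that native_tokens_reasoning is included in native_tokens_completion, that generation_time is in milliseconds, and that the model is billed at 1.75 USD per million input tokens and 14.00 USD per million output tokens. The function name estimate_cost is hypothetical.

```python
# Values copied from the run metadata on this page.
native_tokens_prompt = 66
native_tokens_completion = 1477
native_tokens_reasoning = 1202
generation_time_ms = 25456  # ASSUMPTION: the unit is milliseconds
total_cost = 0.0207935

# ASSUMPTION: per-million-token rates in USD; they are not shown on this page.
INPUT_USD_PER_MTOK = 1.75
OUTPUT_USD_PER_MTOK = 14.00


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the cost in USD from native token counts and the assumed rates."""
    return (prompt_tokens * INPUT_USD_PER_MTOK
            + completion_tokens * OUTPUT_USD_PER_MTOK) / 1_000_000


# ASSUMPTION: reasoning tokens are a subset of the native completion tokens,
# so the remainder approximates the tokens of the visible answer.
visible_tokens = native_tokens_completion - native_tokens_reasoning

estimated = estimate_cost(native_tokens_prompt, native_tokens_completion)
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(f"visible completion tokens: {visible_tokens}")
print(f"estimated cost: {estimated:.7f} USD (reported: {total_cost})")
print(f"throughput: {tokens_per_second:.1f} native tokens/s")
```

Under these assumptions the estimate agrees with the reported total_cost, which suggests that the reasoning tokens account for most of the cost of this run.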
Evaluation details
Result | Evaluator | Details | Meta Data
99.9840% | Matches word count | n/a | neededClean: false; words: 202
99.9631% | Dialogue to Total Word Ratio | Ratio: 11.39%, Deviation: 1.39% | neededClean: false; wordsTotal: 202; wordsDialogue: 23
99.9735%
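The dialogue figures above follow from simple arithmetic: 23 dialogue words out of 202 total words gives 11.39%, which is 1.39 percentage points above the 10% target in the scenario. The sketch below shows one way such a ratio could be measured. It is not the benchmark's evaluator: the names count_words and dialogue_ratio are hypothetical, and the tokenization (hyphens split words, apostrophes do not) and the treatment of any straight or curly double-quoted span as dialogue are assumptions. How the percentage results are derived from the deviation is not shown on this page, so the sketch does not attempt it.

```python
import re

# ASSUMPTION: anything inside straight or curly double quotes counts as dialogue.
QUOTED = re.compile(r'"([^"]*)"|“([^”]*)”')
# ASSUMPTION: a word is a run of letters, optionally joined by apostrophes;
# hyphenated compounds therefore count as two words.
WORD = re.compile(r"[^\W\d_]+(?:['’][^\W\d_]+)*")


def count_words(text: str) -> int:
    """Count words under the assumed tokenization."""
    return len(WORD.findall(text))


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue words, total words, dialogue share of the total)."""
    total = count_words(text)
    spoken = sum(count_words(a or b) for a, b in QUOTED.findall(text))
    return spoken, total, (spoken / total if total else 0.0)


# A short made-up sample, not the model response from this run.
sample = 'Bill sniffed. "In and out, Bill." She’d circled “Gift Table” twice.'
print(dialogue_ratio(sample))

# Re-deriving the reported figures from the metadata on this page.
words_dialogue, words_total, target_pct = 23, 202, 10.0
ratio_pct = 100 * words_dialogue / words_total
deviation_pct = ratio_pct - target_pct
print(f"Ratio: {ratio_pct:.2f}%, Deviation: {deviation_pct:.2f}%")

# The unlabeled final figure is close to the mean of the two evaluator results.
mean_result = (99.9840 + 99.9631) / 2
print(f"mean of evaluator results: {mean_result:.5f}%")
```

The closeness of the final 99.9735% to that mean suggests, but does not confirm, that it is an overall score averaged across the two evaluators.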