NC Bench
Run: 2026-02-16-01KHKA94K7GHD01RTT2ZEAJC6B
Run Details
Model: GPT-5.2
Test: Dialogue tags
Scenario: Write 200 words with 50% dialogue
Duration: 1m 10s
Cost: $0.046918
Provider: OpenAI
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly unfolded a hand-drawn map on the hood of their dented van. It was mostly crayon, stolen from a diner, and labeled "PRESENT ZONE" with angry stars. Bill’s flashlight flickered, illuminating a yard full of pastel lanterns and a banner that read HAPPY SIXTH, EMMA! Beyond the fence, parents arranged cupcakes like explosive devices, while a hired princess waved and practiced innocence. Molly checked her watch, then the sack marked "TOTALLY NOT CRIME." Bill tightened his shoelaces and winced. Somewhere, a piñata swung, unaware it was their getaway vehicle. A distant kazoo solo signaled the moment to become terrible heroes. "Molly: Okay, Bill, tonight we hit the unicorn party quietly." "Bill: Kids scream, bark, and my knees pop like maracas." "Molly: Focus, grandpa; the loot is wrapped in paper everywhere." "Bill: We wear noses, blend in, then bag gifts fast." "Molly: No, we wear black; clowns draw attention and lawsuits." "Bill: Fine, but I’m bringing balloons for deniability, okay boss." "Molly: Entry through the castle; security is distracted by sugar." "Bill: Exit through the gift table; I’ll fake a trick." "Molly: Remember, we take every present, even the socks, Bill." "Bill: And if caught, we say we’re the entertainers, right?"
created_at: "2026-02-16T13:30:58.123618+00:00" (2/16/2026, 1:30:58 PM)
model: "openai/gpt-5.2-20251211"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 38201
moderation_latency: (null)
generation_time: 69917
tokens_prompt: 77
tokens_completion: 1406
native_tokens_prompt: 66
native_tokens_completion: 3343
native_tokens_completion_images: 0
native_tokens_reasoning: 3039
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.0469175
router: (null)
provider_responses:
  0:
    id: "resp_0652da6b7ca4f1440169931bcb74408196a77ed0a309a33a87"
    status: 200
    is_byok: false
    latency: 224
    endpoint_id: "f00142c2-6a93-49ce-9e36-5593b904ce3b"
    provider_name: "OpenAI"
    model_permaslug: "openai/gpt-5.2-20251211"
api_type: "completions"
id: "gen-1771248587-wRmNnBxXh9SYn9enOozf"
upstream_id: "resp_0652da6b7ca4f1440169931bcb74408196a77ed0a309a33a87"
total_cost: 0.0469175
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
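The token and timing fields above can be related to each other with a little arithmetic. The following is a minimal sketch, not part of the benchmark itself. It assumes that native_tokens_completion includes the reasoning tokens and that latency and generation_time are in milliseconds. The page states neither, though 69917 ms is consistent with the reported duration of 1m 10s.

```python
# Values copied from the run metadata on this page.
native_tokens_prompt = 66
native_tokens_completion = 3343
native_tokens_reasoning = 3039
total_cost = 0.0469175

# Assumption: both timing fields are expressed in milliseconds.
latency_ms = 38201
generation_time_ms = 69917

# Assumption: completion tokens include reasoning tokens, so the
# difference is the number of tokens spent on the visible answer.
visible_tokens = native_tokens_completion - native_tokens_reasoning

# Fraction of the completion budget spent on hidden reasoning.
reasoning_share = native_tokens_reasoning / native_tokens_completion

# Wall-clock generation time in seconds and overall throughput.
generation_time_s = generation_time_ms / 1000
tokens_per_second = native_tokens_completion / generation_time_s

print(f"visible tokens:     {visible_tokens}")
print(f"reasoning share:    {reasoning_share:.1%}")
print(f"generation time:    {generation_time_s:.1f} s")
print(f"tokens per second:  {tokens_per_second:.1f}")
print(f"cost per 1k tokens: ${1000 * total_cost / native_tokens_completion:.4f}")
```

Under these assumptions, 304 of the 3343 native completion tokens went to the visible scene and the rest to reasoning.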
Evaluation details
Result | Evaluator | Details | Meta Data
99.9990% | Matches word count | n/a | neededClean: false; words: 201
99.7491% | Dialogue to Total Word Ratio | Ratio: 52.24%, Deviation: 2.24% | neededClean: false; wordsTotal: 201; wordsDialogue: 105
99.8740%
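The ratio and deviation reported by the evaluator can be reproduced from the word counts. The sketch below is an illustration only, since the evaluator's implementation is not shown on this page. It assumes that words are whitespace-separated tokens and that dialogue is any text inside straight double quotes. That reading appears consistent with the reported 105 dialogue words, provided the quoted labels "PRESENT ZONE" and "TOTALLY NOT CRIME." are counted as well. The helper names count_words, dialogue_words, and dialogue_ratio are hypothetical.

```python
import re


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (assumed word definition)."""
    return len(text.split())


def dialogue_words(text: str) -> int:
    """Count words inside straight double quotes (assumed dialogue rule)."""
    quoted_spans = re.findall(r'"([^"]*)"', text)
    return sum(count_words(span) for span in quoted_spans)


def dialogue_ratio(text: str) -> float:
    """Share of all words that fall inside quoted spans."""
    total = count_words(text)
    return dialogue_words(text) / total if total else 0.0


# Short sample built from sentences in the response above.
sample = (
    'Molly checked her watch. '
    '"Bill: We wear noses, blend in, then bag gifts fast."'
)
print(dialogue_words(sample), count_words(sample))

# Reproduce the reported figures from the evaluator's own counts.
words_total = 201
words_dialogue = 105
target_percent = 50.0  # from the scenario: 50% dialogue

reported_ratio = words_dialogue / words_total
ratio_percent = round(reported_ratio * 100, 2)
deviation_percent = round(abs(reported_ratio * 100 - target_percent), 2)
print(f"Ratio: {ratio_percent}%, Deviation: {deviation_percent}%")

# The final figure on the page is consistent with the mean of the two
# evaluator results; that it is computed this way is an assumption.
mean_result = (99.9990 + 99.7491) / 2
```

How the deviation of 2.24% is converted into the 99.7491% result is not stated on the page, so the sketch does not attempt to reproduce that mapping.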