NC Bench
Run: 2026-02-16-01KHKA31C42WRHV6W0K6JM0YW0
Run Details
Model: GPT-5.2
Test: Dialogue tags
Scenario: Write 200 words with 10% dialogue
Duration: 24.8s
Cost: $0.019800
Provider: OpenAI
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 10% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly crouched behind the bouncy castle, studying the suburban battlefield: streamers, sugar fumes, and a hired magician unloading doves like contraband. Bill adjusted his ski mask, which had a smiling shark on it because the discount aisle had opinions. The backyard fence was draped with “HAPPY SIXTH, LEO!” in letters tall enough to be read by police helicopters. A table of presents glittered under a canopy, guarded only by a sleeping golden retriever wearing a party hat. Somewhere, a piñata hung like an overconfident witness. "In and out, Bill: grab ribbons, skip cake." "I prefer cake, but fine—presents first, frosting later." "No clowns; they notice." They’d timed it for the musical chairs finale, when every adult became a referee and every child became a lawsuit. Molly produced a map drawn on a napkin from the pizza place, complete with grease stains marking “high security.” Bill packed a duffel and a tiny pair of scissors for bows, insisting it was “professional.” The plan was simple: blend in as entertainers, vanish with the loot, and leave behind only an empty gift table and a suspiciously happy dog. Bill rehearsed balloon animals with gloved hands, creating a sad giraffe that looked guilty already.
created_at: "2026-02-16T13:26:52.635967+00:00" (2/16/2026, 1:26:52 PM)
model: "openai/gpt-5.2-20251211"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 15390
moderation_latency: (null)
generation_time: 24244
tokens_prompt: 77
tokens_completion: 709
native_tokens_prompt: 66
native_tokens_completion: 1406
native_tokens_completion_images: 0
native_tokens_reasoning: 1132
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.0197995
router: (null)
provider_responses
0
id: "resp_0792496b37a3aea60169931b0387a881939434eecc17ed7c5f"
status: 200
is_byok: false
latency: 134
endpoint_id: "f00142c2-6a93-49ce-9e36-5593b904ce3b"
provider_name: "OpenAI"
model_permaslug: "openai/gpt-5.2-20251211"
api_type: "completions"
id: "gen-1771248387-QjAw2Bk3otdhOqwjXISL"
upstream_id: "resp_0792496b37a3aea60169931b0387a881939434eecc17ed7c5f"
total_cost: 0.0197995
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
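The logged usage and total_cost of 0.0197995 can be cross-checked against the native token counts above. The sketch below is only a consistency check under stated assumptions: the per-million-token rates (1.75 USD for prompt tokens, 14.00 USD for completion tokens) do not appear on this page, the function name is hypothetical, and the 1132 reasoning tokens are assumed to be already included in the 1406 native completion tokens.

```python
def estimate_cost_usd(prompt_tokens, completion_tokens,
                      prompt_rate_per_m=1.75, completion_rate_per_m=14.00):
    """Estimate request cost in USD from token counts.

    The default rates are ASSUMED values in USD per million tokens;
    they are not taken from this run's metadata.
    """
    prompt_cost = prompt_tokens * prompt_rate_per_m
    completion_cost = completion_tokens * completion_rate_per_m
    return (prompt_cost + completion_cost) / 1_000_000


# native_tokens_prompt = 66, native_tokens_completion = 1406 (from the metadata)
cost = estimate_cost_usd(66, 1406)
print(f"{cost:.7f}")  # 0.0197995
```

Under these assumed rates the estimate agrees with the logged total_cost, which suggests that billing follows the native token counts (66 and 1406) rather than the normalized ones (77 and 709).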
Evaluation details
Result | Evaluator | Details | Meta Data
99.9990% | Matches word count | n/a | neededClean: false; words: 201
98.6209% | Dialogue to Total Word Ratio | Ratio: 13.43%, Deviation: 3.43% | neededClean: false; wordsTotal: 201; wordsDialogue: 27
99.3099%
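The Ratio and Deviation figures reported above follow directly from the wordsDialogue and wordsTotal counts. A minimal sketch of that arithmetic, assuming the deviation is the absolute difference from the 10% target named in the scenario (the function name is hypothetical, and the mapping from deviation to the percentage score is not stated on this page, so it is not reproduced here):

```python
def dialogue_ratio_report(words_dialogue, words_total, target_pct=10.0):
    """Return (ratio, deviation) in percent, rounded to two decimals.

    Assumes deviation is the absolute distance from the target percentage.
    """
    ratio_pct = 100.0 * words_dialogue / words_total
    deviation_pct = abs(ratio_pct - target_pct)
    return round(ratio_pct, 2), round(deviation_pct, 2)


# wordsDialogue = 27, wordsTotal = 201 (from the evaluator metadata)
print(dialogue_ratio_report(27, 201))  # (13.43, 3.43)
```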