NC Bench
Run: 2026-02-16-01KHK9F2KPAHG4YEPCYP2D89MZ
Run Details
Model: GPT-5.2
Test: Dialogue tags
Scenario: Write 200 words with 90% dialogue
Duration: 25.1s
Cost: $0.019505
Provider: OpenAI
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
"Molly, you sure this is a heist?" Bill asked. "Absolutely. Tiny guests, huge loot. Wrapping paper is just camouflage," Molly said. "It's a children's party." "Exactly. Nobody suspects the adults with a duffel bag." "What’s in the bag?" "Snacks. We blend in. Criminals never eat carrot sticks." "I thought we were stealing presents, not hummus." "We steal presents after we earn trust. First: pin the tail on the donkey." "I refuse to pin anything on livestock." "It’s cardboard, Bill. Listen: you distract the clown." "By threatening him?" "By complimenting his balloon animals. He'll cry; clowns love validation." "And you?" "I hit the gift table. Fast hands, slow smile." "Won’t the parents notice gifts disappearing?" "We replace them with identical boxes full of brochures for my cousin’s accounting firm." "That’s evil." "Yes, but fiscally responsible." "What about the birthday kid?" "I offer him a magic trick. I make his new bike vanish." "He’ll scream." "Not if I tell him it’s an invisible bike. Kids adore imaginary things." "And escape?" "Out the bouncy castle’s emergency flap." "That exists?" "I’m a professional, Bill." "Fine. If I die in a ball pit, tell my parole officer I tried." "Also, steal the candles; they're pricey."
created_at
"2026-02-16T13:15:58.999989+00:00"
2/16/2026, 1:15:58 PM
model
"openai/gpt-5.2-20251211"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
22346
moderation_latency
(null)
generation_time
24982
tokens_prompt
77
tokens_completion
318
native_tokens_prompt
66
native_tokens_completion
1385
native_tokens_completion_images
0
native_tokens_reasoning
1075
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.0195055
router
(null)
provider_responses
0
id
"resp_0476740eb005905c01699318757a58819482b4195498ac4812"
status
200
is_byok
false
latency
191
endpoint_id
"f00142c2-6a93-49ce-9e36-5593b904ce3b"
provider_name
"OpenAI"
model_permaslug
"openai/gpt-5.2-20251211"
api_type
"completions"
id
"gen-1771247733-rNPRiGmE3dlxrEhpK5w3"
upstream_id
"resp_0476740eb005905c01699318757a58819482b4195498ac4812"
total_cost
0.0195055
cache_discount
(null)
upstream_inference_cost
0
provider_name
"OpenAI"
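The token and cost fields above can be cross-checked with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the per-million-token rates are an assumption that does not appear anywhere on this page. It also assumes that `native_tokens_reasoning` is counted inside `native_tokens_completion`, which the page does not state.

```python
# Values copied from the run metadata above.
NATIVE_TOKENS_PROMPT = 66
NATIVE_TOKENS_COMPLETION = 1385
NATIVE_TOKENS_REASONING = 1075
REPORTED_TOTAL_COST = 0.0195055

# ASSUMPTION: USD per million tokens. These rates are not on this page;
# they are placeholders chosen for illustration.
ASSUMED_INPUT_RATE = 1.75
ASSUMED_OUTPUT_RATE = 14.00


def visible_completion_tokens(completion: int, reasoning: int) -> int:
    """Completion tokens left after subtracting reasoning tokens.

    Assumes reasoning tokens are included in the completion count.
    """
    return completion - reasoning


def estimate_cost(prompt: int, completion: int,
                  input_rate: float, output_rate: float) -> float:
    """Estimated cost in USD, given per-million-token rates."""
    return (prompt * input_rate + completion * output_rate) / 1_000_000


if __name__ == "__main__":
    visible = visible_completion_tokens(
        NATIVE_TOKENS_COMPLETION, NATIVE_TOKENS_REASONING)
    estimate = estimate_cost(
        NATIVE_TOKENS_PROMPT, NATIVE_TOKENS_COMPLETION,
        ASSUMED_INPUT_RATE, ASSUMED_OUTPUT_RATE)
    print(f"visible completion tokens: {visible}")
    print(f"estimated cost: {estimate:.7f} (reported: {REPORTED_TOTAL_COST})")
```

Under these assumed rates the estimate lands on the reported `total_cost`, and most of the billed completion tokens would be reasoning rather than visible text. If the real rates differ, only the two constants need to change.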
Evaluation details
Result | Evaluator | Details | Meta Data
100.0000% | Matches word count | n/a | neededClean: false, words: 200
66.3916% | Dialogue to Total Word Ratio | Ratio: 98.00%, Deviation: 8.00% | neededClean: false, wordsTotal: 200, wordsDialogue: 196
83.1958%
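The ratio and deviation reported by the "Dialogue to Total Word Ratio" evaluator follow from its word counts: 196 dialogue words out of 200 is 98%, which is 8 percentage points from the 90% target in the scenario. Below is a minimal sketch of how such a measurement could be made. It assumes dialogue means text inside double quotes and that words are split on whitespace. The evaluator's actual tokenization and its scoring curve (the 66.3916% result) are not shown on this page, and the function names here are hypothetical.

```python
import re

# Matches a span enclosed in straight or curly double quotes.
# ASSUMPTION: dialogue is exactly the text between double quotes.
QUOTED = re.compile(r'["\u201c]([^"\u201c\u201d]*)["\u201d]')


def count_words(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())


def dialogue_ratio(scene: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for a scene."""
    total = count_words(scene)
    dialogue = sum(count_words(span) for span in QUOTED.findall(scene))
    ratio = dialogue / total if total else 0.0
    return dialogue, total, ratio


def deviation(ratio: float, target: float) -> float:
    """Absolute distance between the measured ratio and the target."""
    return abs(ratio - target)


if __name__ == "__main__":
    # Counts taken from the evaluator's metadata on this page.
    words_dialogue, words_total = 196, 200
    ratio = words_dialogue / words_total
    print(f"ratio: {ratio:.2%}, deviation: {deviation(ratio, 0.90):.2%}")
```

Calling `dialogue_ratio` on a scene gives the counts directly. Quote handling is the fragile part: nested or unbalanced quotes would need a more careful parser than this single regular expression.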