NC Bench
Run: 2026-02-16-01KHKAP7APM5DJ8EJH0V1BXDQ4
Run Details
Model:
GPT-5
Test:
Dialogue tags
Scenario:
Write 200 words with 10% dialogue
Duration:
46.2s
Cost:
$0.029933
Provider:
OpenAI
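The Cost figure can be cross-checked against the native token counts in this run's metadata (native_tokens_prompt 66, native_tokens_completion 2985, native_tokens_reasoning 2688). The sketch below is a rough check, not the billing logic of the provider or router. It assumes per-token rates of $1.25 per million input tokens and $10.00 per million output tokens, which do not appear anywhere on this page, and it assumes that reasoning tokens are counted inside native_tokens_completion. The function and constant names are hypothetical.

```python
# ASSUMPTION: list prices in USD per million tokens. These rates are not
# stated on this page; they are chosen for illustration only.
ASSUMED_INPUT_USD_PER_MTOK = 1.25
ASSUMED_OUTPUT_USD_PER_MTOK = 10.00

# Values copied from the run metadata.
native_tokens_prompt = 66
native_tokens_completion = 2985
native_tokens_reasoning = 2688


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the request cost in USD under the assumed rates above."""
    input_cost = prompt_tokens * ASSUMED_INPUT_USD_PER_MTOK
    output_cost = completion_tokens * ASSUMED_OUTPUT_USD_PER_MTOK
    return (input_cost + output_cost) / 1_000_000


estimated_cost = estimate_cost(native_tokens_prompt, native_tokens_completion)

# ASSUMPTION: reasoning tokens are a subset of the completion tokens, so the
# remainder approximates the tokens spent on the visible scene.
visible_tokens = native_tokens_completion - native_tokens_reasoning
reasoning_share = native_tokens_reasoning / native_tokens_completion

print(f"estimated cost: ${estimated_cost:.7f}")
print(f"visible tokens: {visible_tokens}, reasoning share: {reasoning_share:.1%}")
```

Under these assumptions the estimate agrees with the usage and total_cost values reported in the metadata (0.0299325), and most of the billed completion tokens are reasoning rather than visible text.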
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 10% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
AI
They huddled behind a bouncy castle catalog in a laundromat that smelled like citrus and second chances. Molly tapped a pen against schematics of a suburban backyard, where sprinklers arced like lasers with bad posture. Bill weighed a sack of ribbons as though testing currency. The target was a riot of frosting, helium, and gullible generosity, timed between cake and tantrum. Invitations promised a magician, which in their world meant a smoke screen on retainer. "Masks on, hearts off. Remember: dash, grab, vanish." Bill nodded, counting cupcakes in the flyer the way generals count bridges. Each present was color coded in his notes: dinosaurs for speed, dolls for silence, books for ballast, socks for morale. He had drawn a tiny grappling hook labeled Etiquette. Molly rehearsed smiles in the dryer door, practicing the universal expression for Responsible Aunt. "If they cry, we pivot to clown mode and juggle alibis." He produced two kazoos, the last resort of scoundrels courting respectability. The getaway wagon was a borrowed stroller with squeaky wheels and a fake nap. The map’s X sat beneath a piñata, which felt like professional courtesy. Molly closed the catalog. "Deal." Outside, sprinklers rehearsed. Night bent low, eavesdropping on mischief.
created_at
"2026-02-16T13:37:43.033422+00:00"
model
"openai/gpt-5-2025-08-07"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
8821
moderation_latency
(null)
generation_time
45969
tokens_prompt
77
tokens_completion
1144
native_tokens_prompt
66
native_tokens_completion
2985
native_tokens_completion_images
0
native_tokens_reasoning
2688
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.0299325
router
(null)
provider_responses
0
id
"resp_0bad18a023ea4d530169931d783cf881938dc6f61130501e75"
status
200
is_byok
false
latency
203
endpoint_id
"7c2f859a-7890-4e8e-b1de-1cd1c0a800b4"
provider_name
"OpenAI"
model_permaslug
"openai/gpt-5-2025-08-07"
api_type
"completions"
id
"gen-1771249016-NLI3EhzOKrA9asdz3SRO"
upstream_id
"resp_0bad18a023ea4d530169931d783cf881938dc6f61130501e75"
total_cost
0.0299325
cache_discount
(null)
upstream_inference_cost
0
provider_name
"OpenAI"
Evaluation details
Result | Evaluator | Details | Meta Data
100.0000% | Matches word count | n/a | neededClean: false; words: 200
100.0000% | Dialogue to Total Word Ratio | Ratio: 10.00%, Deviation: 0.00% | neededClean: false; wordsTotal: 200; wordsDialogue: 20
100.0000%
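The two evaluators report a total word count and a dialogue-to-total ratio. The page does not show how they count words, so the following is only a minimal sketch of one plausible approach: split on whitespace for the total, and treat everything between straight double quotes as dialogue. The function name `dialogue_ratio`, the 10% target constant, and the counting rules are assumptions, not the benchmark's actual implementation.

```python
import re

# ASSUMPTION: the target ratio comes from the scenario name
# ("Write 200 words with 10% dialogue").
TARGET_RATIO = 0.10


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for straight-quoted dialogue."""
    # Total words: whitespace-separated tokens, punctuation left attached.
    total_words = len(text.split())
    # Dialogue words: tokens inside each pair of straight double quotes.
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


# A short excerpt from the model response, used here as sample input.
sample = 'Molly closed the catalog. "Deal." Outside, sprinklers rehearsed.'
dialogue_words, total_words, ratio = dialogue_ratio(sample)
print(dialogue_words, total_words, f"{ratio:.2%}")

# The figures reported for the full response: 20 dialogue words out of 200.
reported_ratio = 20 / 200
deviation = abs(reported_ratio - TARGET_RATIO)
print(f"reported ratio: {reported_ratio:.2%}, deviation: {deviation:.2%}")
```

A counter like this would miss dialogue set in curly quotes or single quotes, so a real evaluator may normalize the text first; the neededClean field suggests some cleaning step exists, but its behavior is not described here.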