NC Bench
Run: 2026-02-16-01KHKAGBAG0SNR5W8KBR3FG11A
Run Details
Model:
GPT-5
Test:
Dialogue tags
Scenario:
Write 200 words with 10% dialogue
Duration:
1m 33s
Cost:
$0.042783
Provider:
OpenAI
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 10% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly and Bill crouched behind a pyramid of party favors in the community center corridor, smelling like sugar and ambition. Their blueprint was a napkin scrawled with frosting, arrows wobbling where Molly had taste tested the legend key. A pile of glittery paper crowns served as their compass. Bill, in a blazer speckled with confetti, tapped the ticking piñata watch on his wrist and tried to look professional about it. From inside, a chorus of tiny voices rehearsed the birthday chant with the fury of a storm. They rehearsed their getaway faces in a compact mirror: polite, apologetic, then ravishingly invisible. Molly allocated decoy snacks, appointing cupcakes to the risky posts and cookies to distract from the presents. Bill practiced lifting a present with pinkie etiquette, as if courting a duchess who wore glitter glue. "Masks or party hats?" said Molly. "Hats. Less suspicious, more festive." said Bill. "And shoes?" said Molly. "Squeaky. We blame balloons." said Bill. "Contingency?" said Molly. "Cake camouflage. Always." said Bill. "Right." said Molly. A balloon dog drifted by like a patrol fish, nose bobbing judgment. Bill saluted it. From the doorway, the cake released a buttercream gleam, the kind that topples kingdoms and resolutions.
created_at
"2026-02-16T13:35:17.620649+00:00"
2/16/2026, 1:35:17 PM
model
"openai/gpt-5-2025-08-07"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
14952
moderation_latency
(null)
generation_time
93250
tokens_prompt
77
tokens_completion
1986
native_tokens_prompt
66
native_tokens_completion
4270
native_tokens_completion_images
0
native_tokens_reasoning
3968
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.0427825
router
(null)
provider_responses
0
id
"resp_0ddc41cbc5f861950169931cb7b5e48196b7d28f0aa0d861ed"
status
200
is_byok
false
latency
194
endpoint_id
"7c2f859a-7890-4e8e-b1de-1cd1c0a800b4"
provider_name
"OpenAI"
model_permaslug
"openai/gpt-5-2025-08-07"
api_type
"completions"
id
"gen-1771248823-6JMeUl7R1nhZvPnwKfTf"
upstream_id
"resp_0ddc41cbc5f861950169931cb7b5e48196b7d28f0aa0d861ed"
total_cost
0.0427825
cache_discount
(null)
upstream_inference_cost
0
provider_name
"OpenAI"
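The reported total_cost can be sanity-checked against the native token counts in the metadata. The sketch below is only illustrative: the per-million-token rates are assumptions (they are not stated anywhere on this page), and `estimate_cost` is a hypothetical helper name. It also assumes that native_tokens_completion (4270) already includes the 3968 reasoning tokens, which would leave roughly 302 tokens for the visible response.

```python
from decimal import Decimal

# ASSUMED rates in USD per one million tokens; not taken from this page.
ASSUMED_INPUT_RATE = Decimal("1.25")
ASSUMED_OUTPUT_RATE = Decimal("10.00")


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> Decimal:
    """Estimate the request cost in USD from native token counts."""
    million = Decimal(1_000_000)
    return (
        prompt_tokens * ASSUMED_INPUT_RATE
        + completion_tokens * ASSUMED_OUTPUT_RATE
    ) / million


# native_tokens_prompt and native_tokens_completion from the metadata above.
cost = estimate_cost(66, 4270)
print(cost)
```

Under these assumed rates the estimate agrees with the total_cost value listed above, which suggests billing is based on the native token counts rather than the normalized tokens_prompt and tokens_completion figures.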
Evaluation details
Result | Evaluator | Details | Meta Data
100.0000% | Matches word count | n/a | neededClean: false; words: 200
100.0000% | Dialogue to Total Word Ratio | Ratio: 10.00%, Deviation: 0.00% | neededClean: false; wordsTotal: 200; wordsDialogue: 20
100.0000%
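The dialogue ratio reported above can be approximated with a few lines of code. This is a minimal sketch, not the benchmark's actual evaluator: it assumes dialogue is whatever sits between straight double quotes and that words are whitespace-separated tokens. The function name `dialogue_ratio` and the sample text are hypothetical.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for straight-quoted dialogue."""
    total_words = len(text.split())
    # Everything between a pair of straight double quotes counts as dialogue.
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


sample = '"Masks or party hats?" said Molly. "Hats." said Bill.'
dialogue_words, total_words, ratio = dialogue_ratio(sample)
# Deviation from the 10% target named in the scenario.
deviation = abs(ratio - 0.10)
print(dialogue_words, total_words, f"{ratio:.2%}", f"{deviation:.2%}")
```

For the run above, 20 dialogue words out of 200 total gives a ratio of 20 / 200 = 10%, so the deviation from the target is zero. Curly quotes or single-quoted speech would need a broader pattern than the one assumed here.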