NC Bench
Run: 2026-03-18-01KM08AEDZ95258VKYSB9B1KND
Run Details
Model: GPT-5.4 Nano (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 48.2s
Cost: $0.009373
Provider: OpenAI
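The duration and cost above can be related to the token counts recorded in this run's metadata (native_tokens_completion, native_tokens_reasoning, generation_time). The following is a minimal sketch of that arithmetic. It assumes that reasoning tokens are counted inside native_tokens_completion and that generation_time is in milliseconds; neither is stated on this page, and the variable names visible_tokens, reasoning_share and tokens_per_second are illustrative only.

```python
# Values copied from this run's metadata.
native_tokens_completion = 7488
native_tokens_reasoning = 6772
generation_time_ms = 47839  # assumption: the unit is milliseconds

# Tokens left for the visible answer once reasoning tokens are subtracted
# (assumption: reasoning tokens are a subset of completion tokens).
visible_tokens = native_tokens_completion - native_tokens_reasoning

# Fraction of the completion spent on reasoning.
reasoning_share = native_tokens_reasoning / native_tokens_completion

# Average generation throughput over the whole completion.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(f"visible tokens:    {visible_tokens}")
print(f"reasoning share:   {reasoning_share:.1%}")
print(f"tokens per second: {tokens_per_second:.1f}")
```

Under those assumptions, 716 of the 7488 completion tokens belong to the visible scene and roughly 90% of the completion is reasoning, which would explain why a 500-word answer took about 48 seconds.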
Evaluation summary
Score      Evaluator
100.00%    Matches word count
43.58%     Dialogue to Total Word Ratio
71.7910%   Overall
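The overall figure is consistent with an unweighted mean of the two evaluator scores. The sketch below checks that arithmetic. The averaging rule is an assumption inferred from the numbers on this page, not documented NC Bench behavior, and overall_score is a hypothetical helper.

```python
def overall_score(scores):
    """Unweighted mean of evaluator scores in percent (assumed aggregation rule)."""
    return sum(scores) / len(scores)


# Displayed scores for this run: word count match, dialogue ratio.
scores = [100.00, 43.58]
overall = overall_score(scores)

print(f"{overall:.2f}%")
```

The mean of the displayed scores agrees with the reported 71.7910% to two decimal places; the remaining digits presumably come from the unrounded dialogue score.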
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Inside the dark living room of the rental house, streamers drooped like exhausted vines. From outside came muffled squeals, the kind that announce sugar and small shoes. Molly checked her gloves, then slid down the hallway toward the front door. Bill followed, clutching a duffel bag painted with “GIFTS” in cheerful bubble letters. He tried to look harmless; unfortunately his mustache was too good. “No one will notice,” he whispered, “because kids only notice snacks.” Molly glanced at him. “Kids notice everything. They just call it noticing when it’s cute.” Bill pointed at the window. “We’re here to steal all the presents while they sing, right?” Molly nodded. “While everyone’s busy being patriotic about birthdays.” She tapped her smartwatch. “I timed the moment the candles get lit and the parents start clapping like trained seals.” Bill raised an eyebrow. “You can’t call them seals. They’ll feel targeted.” “Fine,” Molly said. “Trained applauders.” Bill opened the duffel. Inside were plain brown paper rolls, twist ties, and one tiny roll of tape shaped like a swan. “Wrapping materials,” he said. “For our exit plan or their humiliation?” Molly crouched near the door lock. “For blending in. We’ll look like we belong.” Bill leaned closer. “How?” Molly smiled. “You carry the tape swan. I carry the smiley-face badges.” Bill held up a lanyard reading VOLUNTEER. “This badge says volunteer.” Molly corrected him. “It says volunteer, present inspector, and maybe minor sorcery.” Bill swallowed. “Okay. Then what’s the actual theft part?” Molly opened her tool pouch and removed a small suction-cup hook. “We take the gifts that are already wrapped. No dismantling, no noise, no battery theft disasters.” Bill grinned. “So, just the boxes.” “Exactly,” Molly said. “And we leave the smaller ones. We’re not monsters, we’re burglars with ethics.” Bill stared at the street, where the music pulsed. “If security shows up?” Molly shrugged. 
“Security at a children’s party is usually a dad with a grill lighter.” Bill made a solemn face. “Then we offer him a balloon?” Molly whispered, “We distract him with the kazoo.” Bill winced. “You brought a kazoo?” Molly lifted it from her coat. “It’s decorated. It looks like a party prop.” Bill nodded. “Good. Our cover story: we’re ‘helping’.” Molly pressed her ear to the door. “Listen. They’re about to start the song.” Bill’s voice dropped to a whisper. “When they say ‘make a wish,’ we move.” Molly’s eyes glittered. “One, two, and then—” Bill finished, “—steal, smile, and run before anyone asks where the missing toys went.” She wiped a smear of frosting from the doorframe, like a professional wiping away fingerprints from destiny. Bill adjusted his hoodie so the seams wouldn’t snag the streamer knots. Somewhere inside, a child shrieked that it was time to clap, and a parent’s laugh echoed, warm as a mug of cocoa. Molly slid her hook into the latch gap, feeling the metal yield with a satisfying click right before the song ended. Molly breathed in. “On three.” Bill whispered, “On three.”
created_at: "2026-03-18T10:37:04.331Z" (3/18/2026, 10:37:04 AM)
model: "openai/gpt-5.4-nano-20260317"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 44592
moderation_latency: (null)
generation_time: 47839
tokens_prompt: 77
tokens_completion: 2775
native_tokens_prompt: 66
native_tokens_completion: 7488
native_tokens_completion_images: 0
native_tokens_reasoning: 6772
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.0093732
router: (null)
provider_responses:
  0:
    endpoint_id: "0c835f2e-c18d-4e8c-b245-e1e3bd08b97f"
    id: "resp_04963d27599523c70169ba805068848197bf9acc4f9a8aecad"
    is_byok: false
    latency: 227
    model_permaslug: "openai/gpt-5.4-nano-20260317"
    provider_name: "OpenAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1773830224-zpAdlQlpiI0RRHfFzaGo"
upstream_id: "resp_04963d27599523c70169ba805068848197bf9acc4f9a8aecad"
total_cost: 0.0093732
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
Evaluation details
Result     Evaluator                      Details                           Meta Data
100.00%    Matches word count             n/a                               neededClean: false, words: 501
43.58%     Dialogue to Total Word Ratio   Ratio: 45.92%, Deviation: 4.08%   neededClean: false, wordsTotal: 503, wordsDialogue: 231
71.7910%   Overall
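The ratio and deviation reported for the dialogue evaluator follow from its word counts: 231 dialogue words out of 503 total, measured against the 50% target in the scenario. The sketch below reproduces that arithmetic and shows one way dialogue words could be counted, by treating text inside curly double quotes as dialogue. The counting method, the dialogue_stats helper and the whitespace tokenization are assumptions for illustration; the page does not describe how the evaluator counts words, nor how a 4.08% deviation maps to the 43.58% score, so that mapping is not attempted here.

```python
import re


def dialogue_stats(text, target=0.5):
    """Return (total words, dialogue words, ratio, deviation from target).

    Assumption: dialogue is whatever sits between curly double quotes,
    and words are whitespace-separated tokens.
    """
    words_total = len(text.split())
    quoted = re.findall(r"“([^”]*)”", text)
    words_dialogue = sum(len(q.split()) for q in quoted)
    ratio = words_dialogue / words_total
    return words_total, words_dialogue, ratio, abs(target - ratio)


# Small check on the last two sentences of the model response.
sample = "Molly breathed in. “On three.” Bill whispered, “On three.”"
sample_total, sample_dialogue, sample_ratio, _ = dialogue_stats(sample)

# Reproduce the reported figures from the evaluator's own counts.
reported_ratio = 231 / 503
reported_deviation = abs(0.5 - reported_ratio)

print(sample_total, sample_dialogue)
print(f"Ratio: {reported_ratio:.2%}, Deviation: {reported_deviation:.2%}")
```

The second print reproduces the Details column for this evaluator. Note that the two evaluators report slightly different totals (501 and 503 words), so a quote-based counter like this one should not be expected to match either exactly.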