NC Bench
Run: 2026-03-18-01KM07THCVRMS37RGZSQ7AZ8H1
Run Details
Model: GPT-5.4 Nano (Reasoning)
Test: Dialogue tags
Scenario: Write 200 words with 10% dialogue
Duration: 19.5s
Cost: $0.003687
Provider: OpenAI
Evaluation summary
Score      Evaluator
99.98%     Matches word count
100.00%    Dialogue to Total Word Ratio
99.9920%   Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 10% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
AI
Moonlight slid over the porch as Molly crouched beside a box of frosting-scented tools. Bill, wearing a paper birthday hat, checked the porch window like it owed him money. “Shh.” Molly hissed, pressing her ear to the door. Inside, kids shrieked and the party playlist thumped, cheerful as a mugger. Bill pointed at the wall cam. “Cameras see balloons, not crooks.” Molly nodded, lifting a roll of decorative streamers. “Then we carry balloons.” The plan: slip past the squeaky sensor, waltz through the living room, and vacuum every present before the candles. A squeaker toy burbled somewhere near the tree. Bill grinned and adjusted the duffel. “Steal presents, leave frosting.” “Three steps. Go.” Molly whispered. They moved—one, two—soft shoes, soft breaths, hard intentions. Bill paused at a stack of wrapped boxes, then muttered, “And run silently.” The door unlatched with a tiny sigh, as if even it wanted extra gifts. Giggling from the hallway echoed while Molly slid inside, scanning for the prize table. Bill followed, brushing aside a giant foam dinosaur. Their gloved hands grabbed gift bags, ribboned boxes, and one suspicious book titled Dragons of Etiquette. Somewhere, a clownified uncle laughed, unaware the party was being professionally robbed.
created_at: "2026-03-18T10:28:23.082Z" (3/18/2026, 10:28:23 AM)
model: "openai/gpt-5.4-nano-20260317"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 18983
moderation_latency: (null)
generation_time: 19369
tokens_prompt: 77
tokens_completion: 746
native_tokens_prompt: 66
native_tokens_completion: 2939
native_tokens_completion_images: 0
native_tokens_reasoning: 2645
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.00368695
router: (null)
provider_responses:
  0:
    endpoint_id: "0c835f2e-c18d-4e8c-b245-e1e3bd08b97f"
    id: "resp_02e91209b0f8b2f20169ba7e4722a48197960baf9b1b27dcbc"
    is_byok: false
    latency: 176
    model_permaslug: "openai/gpt-5.4-nano-20260317"
    provider_name: "OpenAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1773829703-Iy4lnOVgwdcSwaggEuKs"
upstream_id: "resp_02e91209b0f8b2f20169ba7e4722a48197960baf9b1b27dcbc"
total_cost: 0.00368695
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
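The token counts in the metadata can be broken down with a little arithmetic. The sketch below is illustrative only: it assumes that native_tokens_reasoning is counted inside native_tokens_completion, which the page itself does not state, and the variable names visible_tokens and reasoning_share are introduced here for the example.

```python
# Values copied from the run metadata above.
native_tokens_completion = 2939
native_tokens_reasoning = 2645

# Assumption: reasoning tokens are included in the native completion total,
# so the remainder approximates the tokens spent on the visible answer.
visible_tokens = native_tokens_completion - native_tokens_reasoning
reasoning_share = native_tokens_reasoning / native_tokens_completion

print(f"visible output tokens: {visible_tokens}")    # 294
print(f"reasoning share: {reasoning_share:.1%}")     # 90.0%
```

Under that assumption, roughly nine in ten billed completion tokens went to reasoning rather than to the 200-word scene.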
Evaluation details
Evaluator: Matches word count
Result: 99.98%
Details: n/a
Meta Data: neededClean: false, words: 202

Evaluator: Dialogue to Total Word Ratio
Result: 100.00%
Details: Ratio: 9.85%, Deviation: 0.15%
Meta Data: neededClean: false, wordsTotal: 203, wordsDialogue: 20

Overall: 99.9920%
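The reported ratio and deviation follow directly from the wordsDialogue and wordsTotal figures. The sketch below reproduces that arithmetic and adds a hypothetical helper, dialogue_ratio, that counts words inside curly quotation marks. The helper is an assumption made for illustration; the evaluator's actual tokenization and quote handling are not shown on this page.

```python
import re

def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio) for curly-quoted dialogue.

    Hypothetical helper: words are whitespace-separated tokens, and dialogue
    is any text between an opening and a closing curly double quote.
    """
    total = len(text.split())
    spoken = re.findall(r"“([^”]*)”", text)
    dialogue = sum(len(s.split()) for s in spoken)
    return dialogue, total, (dialogue / total if total else 0.0)

# Figures reported in the evaluation details for this run.
words_dialogue, words_total = 20, 203
target = 0.10  # the scenario asks for 10% dialogue

ratio = words_dialogue / words_total
deviation = abs(ratio - target)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 9.85%, Deviation: 0.15%"
```

For example, dialogue_ratio('“Shh.” Molly hissed.') returns one dialogue word out of three total words.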