NC Bench
Run: 2026-03-18-01KM0A88G9Q1DP3FD8M9QTX9J5
Run Details
Model: GPT-5.4 Nano
Test: Dialogue tags
Scenario: Write 200 words with 10% dialogue
Duration: 7.2s
Cost: $0.000362
Provider: OpenAI
Evaluation summary
| Score | Evaluator |
| --- | --- |
| 51.93% | Matches word count |
| 66.77% | Dialogue to Total Word Ratio |
| 59.3461% | Overall |
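The Overall figure appears consistent with an unweighted mean of the two evaluator scores. The page does not state how it is aggregated, so the averaging rule below is an assumption, and the variable names are illustrative. A minimal sketch:

```python
from statistics import mean

# Evaluator scores as displayed on the page (rounded to two decimals).
scores = {
    "Matches word count": 51.93,
    "Dialogue to Total Word Ratio": 66.77,
}

# ASSUMPTION: Overall is the plain arithmetic mean of the evaluator scores.
overall = mean(scores.values())

reported_overall = 59.3461  # value shown on the page
print(f"mean of displayed scores: {overall:.2f}%")
print(f"reported overall:         {reported_overall:.4f}%")
```

The small gap between the two values is what one would expect if the displayed evaluator scores are rounded while the Overall value is computed from unrounded scores.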
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 10% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly crouched beside the tool cart, whispering to her lock picks like they were shy animals. Bill hovered near the frosting table, pretending to examine the balloons while he scanned the room through a crack in the doorway. From inside, the party thumped with cheerful music and the sharp laughter of children who still believed in fairies and cake. “We are in and out,” Molly mouthed, then checked her timer twice, as if time might sense her intentions. Bill adjusted his cap and offered a grin that looked practiced. “You picked the slowest birthday theme on purpose,” he said. “Dinosaurs are never slow,” Molly replied. She slid a slim card into the side of the doorframe and listened for the tiny click that meant permission, or at least the illusion of it. A security camera pivoted overhead, its red light blinking like a bored eye. Bill raised a finger. “When it blinks, we move. Simple.” “Simple is a lie,” Molly said, but her hand already drifted to her pouch. The camera blinked. The door sighed open. They slipped into the hall where presents stacked like candy-colored treasure chests. Bill reached for a tag, then paused. “What if the kids named them?” he asked. Molly plucked a ribbon free and muttered, “Then we steal the names too.”
created_at: "2026-03-18T11:10:49.879Z" (3/18/2026, 11:10:49 AM)
model: "openai/gpt-5.4-nano-20260317"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1152
moderation_latency: (null)
generation_time: 7155
tokens_prompt: 77
tokens_completion: 323
native_tokens_prompt: 66
native_tokens_completion: 279
native_tokens_completion_images: 0
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.00036195
router: (null)
provider_responses:
  0:
    endpoint_id: "0c835f2e-c18d-4e8c-b245-e1e3bd08b97f"
    id: "resp_0991ce2529a88acc0169ba883a0a348195b5f9ede72039afe8"
    is_byok: false
    latency: 751
    model_permaslug: "openai/gpt-5.4-nano-20260317"
    provider_name: "OpenAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1773832249-uklAa4bMrGL7iBTbBHa1"
upstream_id: "resp_0991ce2529a88acc0169ba883a0a348195b5f9ede72039afe8"
total_cost: 0.00036195
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
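The timing and token fields above can be combined into a rough throughput figure. The sketch below assumes that `latency` and `generation_time` are in milliseconds (the page does not label the units, but 7155 matches the 7.2s Duration shown in Run Details) and that `latency` approximates time to first token. Variable names are illustrative.

```python
# Values copied from the run metadata.
latency_ms = 1152               # ASSUMPTION: milliseconds, roughly time to first token
generation_time_ms = 7155       # ASSUMPTION: milliseconds; matches the 7.2s Duration
native_tokens_completion = 279

generation_time_s = generation_time_ms / 1000
tokens_per_second = native_tokens_completion / generation_time_s

print(f"generation time: {generation_time_s:.1f}s")
print(f"throughput:      {tokens_per_second:.2f} completion tokens/s")
```

This uses the native (provider-side) completion token count; using `tokens_completion` (323) instead would give a different figure.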
Evaluation details
| Result | Evaluator | Details | Meta Data |
| --- | --- | --- | --- |
| 51.93% | Matches word count | n/a | neededClean: false, words: 216 |
| 66.77% | Dialogue to Total Word Ratio | Ratio: 17.97%, Deviation: 7.97% | neededClean: false, wordsTotal: 217, wordsDialogue: 39 |

Overall: 59.3461%
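The Ratio and Deviation values in the details can be reproduced from the reported word counts: 39 dialogue words out of 217 total, against the 10% target from the scenario. The sketch below also includes a hypothetical counter that treats text inside curly double quotes as dialogue. The evaluator's real tokenization is not shown on the page (the two evaluators even report 216 and 217 total words), so `count_dialogue_words` is an assumption for illustration, not the benchmark's method. How the deviation maps to the 66.77% score is also not documented here and is not modeled.

```python
import re

TARGET_RATIO = 0.10  # from the scenario: "10% dialogue"

def count_dialogue_words(text: str) -> int:
    """HYPOTHETICAL counter: whitespace-separated words inside curly double quotes."""
    spans = re.findall(r"“([^”]*)”", text)
    return sum(len(span.split()) for span in spans)

def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Share of all words that are dialogue."""
    return words_dialogue / words_total

# Counts reported in the evaluator's Meta Data column.
ratio = dialogue_ratio(39, 217)
deviation = abs(ratio - TARGET_RATIO)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 17.97%, Deviation: 7.97%"

# Toy check of the counter on a short excerpt from the response.
sample = "Bill raised a finger. “When it blinks, we move. Simple.” “Simple is a lie,” Molly said."
sample_dialogue = count_dialogue_words(sample)
sample_total = len(sample.split())
```

On the excerpt, the counter finds 10 dialogue words out of 16 total. Running it on the full response may not yield exactly 39 and 217, since the benchmark's counting rules are unknown.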