NC Bench
Run: 2026-03-18-01KM090BA5D3Z776ZD3WBX8JQK
Run Details
Model: GPT-5.4 Mini (Reasoning, Low)
Test: Dialogue tags
Scenario: Write 200 words with 10% dialogue
Duration: 4.0s
Cost: $0.001481
Provider: OpenAI
Evaluation summary
Score: 98.71%
Evaluator: Matches word count

Score: 100.00%
Evaluator: Dialogue to Total Word Ratio

Score: 99.3561%
Evaluator: Overall
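The overall figure is consistent with an unweighted mean of the two evaluator scores. The sketch below checks that arithmetic. It assumes equal weighting, which the page does not state, and it works from the rounded 98.71% shown here, so the result can differ from the reported 99.3561% in the last digits.

```python
# Evaluator scores as displayed in the summary (percent, rounded by the page).
scores = {
    "Matches word count": 98.71,
    "Dialogue to Total Word Ratio": 100.00,
}

# Assumption: the overall score is a plain average of the evaluator scores.
overall = sum(scores.values()) / len(scores)

print(f"Overall (unweighted mean): {overall:.3f}%")
```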
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 10% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly flattened a counterfeit mustache over her lip and stared at the balloon-covered townhouse across the street. The children’s party was already in full squeal, a bright kettle of sugar and chaos waiting to be lifted. On the pavement beside her, Bill adjusted a Santa sack sewn from a supermarket curtain and looked like a man about to smuggle an entire holiday. “You sure about this?” he whispered. Molly grinned. “We are professionals.” Inside the house, a clown in a glitter vest was twisting a dog into a balloon poodle while parents applauded with the exhausted gratitude of hostages. The gift table stood in the hallway, guarded only by a vase of suspicious daisies and a stack of napkins shaped like crowns. Molly checked her watch, then the weather, then the dignity of the neighborhood, which was already beyond saving. Bill peered through the gate. “How many presents?” “All of them,” she said. “That’s the beauty of ambition.” A gust of music leaked through the open window: a kazoo version of “Happy Birthday.” Molly slipped on white gloves. Bill nodded, solemn as a priest at a very small funeral. Together they strolled up the path, smiling like relatives with excellent intentions and no morals at all.
created_at
"2026-03-18T10:49:02.034Z"
3/18/2026, 10:49:02 AM
model
"openai/gpt-5.4-mini-20260317"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
3480
moderation_latency
(null)
generation_time
3774
tokens_prompt
77
tokens_completion
441
native_tokens_prompt
66
native_tokens_completion
318
native_tokens_completion_images
0
native_tokens_reasoning
56
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.0014805
router
(null)
provider_responses
0
endpoint_id
"9ee065b2-3d1c-43bc-bdd7-28af3b148282"
id
"resp_009e840d10f4ad910169ba831e17108193bd94e3f970bf3f6f"
is_byok
false
latency
180
model_permaslug
"openai/gpt-5.4-mini-20260317"
provider_name
"OpenAI"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer
(null)
api_type
"completions"
id
"gen-1773830942-Mkqq20tNwULoSuGZnevg"
upstream_id
"resp_009e840d10f4ad910169ba831e17108193bd94e3f970bf3f6f"
total_cost
0.0014805
cache_discount
(null)
upstream_inference_cost
0
provider_name
"OpenAI"
Evaluation details
Result: 98.71%
Evaluator: Matches word count
Details: n/a
Meta Data: neededClean = false, words = 206

Result: 100.00%
Evaluator: Dialogue to Total Word Ratio
Details: Ratio: 9.66%, Deviation: 0.34%
Meta Data: neededClean = false, wordsTotal = 207, wordsDialogue = 20

Overall: 99.3561%
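The ratio and deviation reported for the dialogue evaluator can be reproduced from the wordsDialogue and wordsTotal counts. This is a minimal sketch, not the benchmark's own code. The function names are hypothetical, the 10% target is taken from the scenario, and counting dialogue as whitespace-separated words inside curly double quotes is an assumption about how the evaluator works.

```python
import re


def dialogue_ratio(words_dialogue, words_total, target=0.10):
    """Return (ratio, absolute deviation from target) as fractions."""
    ratio = words_dialogue / words_total
    deviation = abs(ratio - target)
    return ratio, deviation


def count_dialogue_words(text):
    """Hypothetical counter: words inside curly double quotes.

    Assumption: the evaluator's real tokenization is not documented here.
    """
    spans = re.findall(r"“([^”]*)”", text)
    return sum(len(span.split()) for span in spans)


# Counts taken from the evaluator metadata in this run.
ratio, deviation = dialogue_ratio(words_dialogue=20, words_total=207)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
```

With the counts from this run, the printed line matches the Details entry for the dialogue evaluator. How the deviation maps to the 100.00% score is not shown on the page, so the sketch leaves that step out.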