NC Bench
Run: 2026-03-18-01KM09JXYJ2PQRKH9T6DQA369C
Run Details
Model: GPT-5.4 Mini (Reasoning, Low)
Test: Dialogue tags
Scenario: Write 200 words with 10% dialogue
Duration: 2.9s
Cost: $0.001791
Provider: OpenAI
Evaluation summary
Matches word count: 68.10%
Dialogue to Total Word Ratio: 0.00%
Overall: 34.0511%
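The Overall figure is numerically consistent with an unweighted mean of the two evaluator scores, although the page does not say how the scores are aggregated. A minimal sketch under that assumption (the function name `overall_score` is hypothetical, and equal weighting is a guess):

```python
from statistics import mean

def overall_score(evaluator_scores):
    """Average per-evaluator scores given in percent.

    Assumes equal weighting, which the page does not confirm.
    """
    return mean(evaluator_scores)

# Scores as displayed in the summary (already rounded to two decimals).
scores = [68.10, 0.00]
print(f"{overall_score(scores):.2f}%")
```

With the displayed, rounded inputs this agrees with the reported 34.0511% to two decimal places; the extra digits presumably come from the unrounded word-count score.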
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 10% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly crouched behind the bakery dumpster, smoothing her black gloves as rain ticked on the lid like impatient applause. Across the alley, Bill unfolded a crumpled invitation he’d stolen from a recycling bin. “The party starts at four,” he whispered. “Cake at five. Looting window: between laughter bursts.” Molly peered over his shoulder. “You call presents loot now?” “I call everything loot if it’s not nailed down.” The children’s party glowed at the end of the street, balloons bobbing above the porch like cheerful surveillance drones. A clown-shaped cardboard sign promised GAMES, TREATS, and ABSOLUTELY NO RUNNING. Molly smiled. “Perfect. Chaos is camouflage.” Bill checked his watch. “We go in as entertainers.” “I’m not wearing a pirate hat again.” “Then be a magician.” “I vanish stress, not rabbits.” He held up two clip-on mustaches. “Detective disguise?” “Awful.” “Effective?” “Deeply.” From inside the house came shrieks of delight, then the pop of a balloon. Molly straightened, businesslike. “We take the gifts, leave behind the empty wrapping, and exit before anyone notices Uncle Greg singing.” Bill nodded, grinning. “And if a child asks questions?” Molly patted the sack at her hip. “We offer them a balloon animal and confidence.” They shared a solemn glance, then hurried toward the front gate, ready to rob the room politely.
created_at: "2026-03-18T10:59:10.945Z" (3/18/2026, 10:59:10 AM)
model: "openai/gpt-5.4-mini-20260317"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 2419
moderation_latency: (null)
generation_time: 2710
tokens_prompt: 77
tokens_completion: 507
native_tokens_prompt: 66
native_tokens_completion: 387
native_tokens_completion_images: 0
native_tokens_reasoning: 77
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.001791
router: (null)
provider_responses:
  0:
    endpoint_id: "9ee065b2-3d1c-43bc-bdd7-28af3b148282"
    id: "resp_01a50bd5fd8a7c7e0169ba857f01308193a1d3ed037e824aa3"
    is_byok: false
    latency: 215
    model_permaslug: "openai/gpt-5.4-mini-20260317"
    provider_name: "OpenAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1773831550-tOCVS3UR4CnFXlCfhOHb"
upstream_id: "resp_01a50bd5fd8a7c7e0169ba857f01308193a1d3ed037e824aa3"
total_cost: 0.001791
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
Evaluation details
Matches word count
  Result: 68.10%
  Details: n/a
  Meta data: neededClean: false; words: 214
Dialogue to Total Word Ratio
  Result: 0.00%
  Details: Ratio: 40.74%, Deviation: 30.74%
  Meta data: neededClean: false; wordsTotal: 216; wordsDialogue: 88
Overall: 34.0511%
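The Details figures for the dialogue evaluator follow from the word counts in its meta data: 88 / 216 is about 40.74%, which is 30.74 percentage points away from the 10% target in the scenario. The sketch below reproduces that arithmetic and shows one possible way to count dialogue words, namely words inside curly double quotes. The function names and the quote-based counting rule are assumptions for illustration, not the benchmark's actual implementation.

```python
import re

def dialogue_ratio(words_dialogue, words_total, target=0.10):
    """Return (ratio, absolute deviation from target) as fractions."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)

def count_dialogue_words(text):
    """Count whitespace-separated words inside curly double quotes.

    Assumed rule; the evaluator's real tokenisation is not shown on the page.
    """
    quoted = re.findall(r"“([^”]*)”", text)
    return sum(len(span.split()) for span in quoted)

# Word counts taken from the evaluator's meta data.
ratio, deviation = dialogue_ratio(88, 216)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 40.74%, Deviation: 30.74%"
```

How the deviation maps to the 0.00% result is not stated on the page, so the sketch stops at the ratio and deviation.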