NC Bench
Run: 2026-03-18-01KM09W57QDKCCEZB5PRTXBCYY
Run Details
Model: GPT-5.4 Mini (Reasoning)
Test: Dialogue tags
Scenario: Write 200 words with 50% dialogue
Duration: 14.9s
Cost: $0.012956
Provider: OpenAI
Evaluation summary
Score | Evaluator | Details Link
100.00% | Matches word count | (details)
0.50% | Dialogue to Total Word Ratio | (details)
50.2492% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Behind the community hall, Molly peered through the rain-streaked window at the children’s party inside. Tiny astronauts and princesses chased balloons while a magician produced endless paper rabbits. Bill adjusted his tie, which was there to make him look respectable from a distance. “You see the gift table?” Molly whispered. “Stacked like a sugary castle.” Bill squinted. “I see it. Also a clown making balloon swords. Why are we stealing from a party that has confetti cannons?” “Because the rich moral lessons of childhood include sticky fingers and unattended loot,” Molly said. “We go in as entertainers.” Bill frowned. “I cannot juggle.” “You can barely breathe in rhythm, Bill. You’ll be the sad uncle.” “That’s cruel. I can do sad.” Molly smiled and slid a miniature ladder from her coat. “In, grab the presents, out before the cake.” Bill nodded. “And if the birthday child asks?” “We’ll tell him the presents went to live on a farm,” Molly said. “Children love farms.” “That’s awful.” “Good. Now smile and carry the teddy bear box. It’s too heavy for gifts, which means it’s probably expensive.” Bill hefted the ladder. “If this works, I’m retiring into snacks.” “Dream bigger, Bill. Tonight we’re professionals.”
created_at
"2026-03-18T11:04:13.317Z"
3/18/2026, 11:04:13 AM
model
"openai/gpt-5.4-mini-20260317"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
14451
moderation_latency
(null)
generation_time
14669
tokens_prompt
77
tokens_completion
688
native_tokens_prompt
66
native_tokens_completion
2868
native_tokens_completion_images
0
native_tokens_reasoning
2588
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.0129555
router
(null)
provider_responses
0
endpoint_id
"9ee065b2-3d1c-43bc-bdd7-28af3b148282"
id
"resp_038d5640cb9200080169ba86ad64b08190bba2490d4ff195bc"
is_byok
false
latency
220
model_permaslug
"openai/gpt-5.4-mini-20260317"
provider_name
"OpenAI"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer
(null)
api_type
"completions"
id
"gen-1773831853-urMYar0rIQwya8s39JMJ"
upstream_id
"resp_038d5640cb9200080169ba86ad64b08190bba2490d4ff195bc"
total_cost
0.0129555
cache_discount
(null)
upstream_inference_cost
0
provider_name
"OpenAI"
Evaluation details
Result | Evaluator | Details | Meta Data
100.00% | Matches word count | n/a | neededClean: false; words: 200
0.50% | Dialogue to Total Word Ratio | Ratio: 65.17%, Deviation: 15.17% | neededClean: false; wordsTotal: 201; wordsDialogue: 131
50.2492% | Overall
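The ratio and deviation reported above can be reproduced from the word counts in the Meta Data column. The following is a minimal sketch, not the benchmark's own evaluator: the function names, the quote-matching pattern, and the whitespace word splitting are all assumptions made for illustration, and the report does not show how NC Bench actually tokenizes or scores the text. The last line additionally assumes that the overall score is the unweighted mean of the two evaluator scores, which is consistent with the displayed figures but not confirmed by the report.

```python
import re

# Hypothetical matcher: text inside curly or straight double quotes counts as
# dialogue. The benchmark's actual counting rules are not shown in this report.
QUOTED = re.compile(r'“([^”]*)”|"([^"]*)"')


def count_dialogue(text: str) -> tuple[int, int]:
    """Return (dialogue_words, total_words) using whitespace word splitting."""
    total = len(text.split())
    dialogue = sum(
        len((m.group(1) or m.group(2) or "").split())
        for m in QUOTED.finditer(text)
    )
    return dialogue, total


def ratio_and_deviation(words_dialogue: int, words_total: int, target: float = 0.5):
    """Return (dialogue ratio, absolute deviation from the target ratio)."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


# Small illustration of the counting step on one line of the response.
sample = "Bill frowned. “I cannot juggle.” Molly smiled."
sample_counts = count_dialogue(sample)  # (3, 7)

# Counts taken from the report: wordsDialogue = 131, wordsTotal = 201.
ratio, deviation = ratio_and_deviation(131, 201)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# Ratio: 65.17%, Deviation: 15.17%

# Assumption: Overall is the unweighted mean of the two evaluator scores.
# Solving (100 + s) / 2 = 50.2492 gives the implied dialogue score s.
implied_dialogue_score = 2 * 50.2492 - 100.0  # about 0.4984, shown as 0.50%
```

Under these assumptions the response overshoots the 50% dialogue target by about 15 percentage points, which matches the near-zero score for that evaluator even though the word count evaluator scores 100%.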