NC Bench
Run: 2026-03-18-01KM09STCGM6M2585MB6DTRZT0
Run Details
Model: GPT-5.4 Mini
Test: Dialogue tags
Scenario: Write 200 words with 10% dialogue
Duration: 2.1s
Cost: $0.001503
Provider: OpenAI
Evaluation summary
Score | Evaluator
0.01% | Matches word count
0.00% | Dialogue to Total Word Ratio
0.0049% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 10% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly crouched behind the bakery van, peering at the rainbow balloon arch wavering over the garden gate. Inside, the children’s party buzzed with shrieks, sugar, and the kind of careless joy that made burglary feel almost rude. Almost. Bill unfolded a crinkled map of the house, though it had been drawn on the back of a takeaway receipt and mostly featured questionable arrows. “Easy,” Bill whispered. “We blend in.” “With what?” Molly asked. “Your face?” He adjusted his fake mustache, which had already surrendered at the corners. “I can be charming.” “You can be presentable if the moon is in the right quadrant.” From the marquee came the sound of a game host shouting, “Pin the tail on the dinosaur!” followed by a crash and delighted applause. Molly smiled despite herself. The presents were stacked in the conservatory, glittering beneath curling ribbons and a banner reading HAPPY BIRTHDAY, MAX. “That’s the mother lode,” she said. “One clean sweep, no tears.” Bill nodded solemnly. “What about the cake?” Molly stared at him. “What? I’m not a monster,” he said. “Just a thief.” She checked the kitchen window. “Fine. We take the gifts, leave the cake, and for once in your miserable career, don’t talk to anyone.” Bill straightened his tie. “Molly, darling, I was born for parties.” “Then try not to be the reason this one ends with police and pony rides.”
created_at: "2026-03-18T11:02:56.67Z" (3/18/2026, 11:02:56 AM)
model: "openai/gpt-5.4-mini-20260317"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 401
moderation_latency: (null)
generation_time: 1972
tokens_prompt: 77
tokens_completion: 388
native_tokens_prompt: 66
native_tokens_completion: 323
native_tokens_completion_images: 0
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.001503
router: (null)
provider_responses: 0
endpoint_id: "9ee065b2-3d1c-43bc-bdd7-28af3b148282"
id: "resp_0185f3be955daf220169ba8660ba2c8193b6a9d72080bc355a"
is_byok: false
latency: 177
model_permaslug: "openai/gpt-5.4-mini-20260317"
provider_name: "OpenAI"
status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1773831776-nrAgDotTxXBZkg3xou5w"
upstream_id: "resp_0185f3be955daf220169ba8660ba2c8193b6a9d72080bc355a"
total_cost: 0.001503
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
Evaluation details
Result | Evaluator | Details | Meta Data
0.01% | Matches word count | n/a | neededClean: false, words: 231
0.00% | Dialogue to Total Word Ratio | Ratio: 39.83%, Deviation: 29.83% | neededClean: false, wordsTotal: 231, wordsDialogue: 92
0.0049% | Overall | |
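The ratio and deviation reported for the Dialogue to Total Word Ratio evaluator can be reproduced from the wordsDialogue and wordsTotal values. The following is a minimal sketch, not NC Bench's actual evaluator: the function name `ratio_stats` is hypothetical, the 10% target is taken from the scenario description, and the deviation is assumed to be the absolute difference between the measured ratio and that target. How the evaluator counts dialogue words or converts the deviation into a score is not shown on this page and is not modelled here.

```python
def ratio_stats(words_dialogue: int, words_total: int, target: float = 0.10) -> dict:
    """Return the dialogue share and its absolute distance from the target share.

    Assumption: deviation = |ratio - target|, both expressed as fractions.
    """
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    ratio = words_dialogue / words_total
    return {"ratio": ratio, "deviation": abs(ratio - target)}


# Values reported in the evaluation details for this run.
stats = ratio_stats(words_dialogue=92, words_total=231)
print(f"Ratio: {stats['ratio']:.2%}, Deviation: {stats['deviation']:.2%}")
# prints "Ratio: 39.83%, Deviation: 29.83%"
```

Under that assumption the arithmetic matches the reported figures: 92 of 231 words is about 39.83% dialogue, roughly 29.83 percentage points above the requested 10%.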