NC Bench
Run: 2026-03-18-01KM08Q2V62SMHCZVPB7YW1MQ9
Run Details
Model: GPT-5.4 Mini (Reasoning, Low)
Test: Dialogue tags
Scenario: Write 200 words with 10% dialogue
Duration: 9.1s
Cost: $0.004824
Provider: OpenAI
Evaluation summary
Score | Evaluator | Details Link
100.00% | Matches word count | (details)
20.94% | Dialogue to Total Word Ratio | (details)
60.4693% | Overall |
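The Overall figure is consistent with an unweighted mean of the two evaluator scores. The page does not state this, so the aggregation rule in the sketch below is an assumption inferred from the numbers shown, and the function name `overall` is hypothetical.

```python
# Assumption: Overall is the unweighted mean of the evaluator scores.
# The page reports the scores but does not document the aggregation rule.
scores = {
    "Matches word count": 100.00,
    "Dialogue to Total Word Ratio": 20.94,  # rounded value as displayed
}


def overall(scores: dict[str, float]) -> float:
    """Return the arithmetic mean of the evaluator scores, in percent."""
    return sum(scores.values()) / len(scores)


print(f"{overall(scores):.2f}%")
```

Using the rounded 20.94% gives 60.47%, which agrees with the reported 60.4693% to two decimal places; the small gap is what rounding of the displayed dialogue score would produce.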
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 10% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Under the flickering streetlamp, Molly spread a crumpled party flyer across the hood of their getaway van, where a papier-mache clown from the previous job still stared accusingly. Bill adjusted his glove and squinted at the invitation. “Children’s parties are always chaos,” he said. Molly grinned. “Chaos is just commerce with balloons.” Through the banquet hall window, they could see streamers, cupcakes, and a mountain of presents stacked beside a cardboard castle. Bill licked his pencil-thin mustache, then winced at the sight of a singing pirate entertainer tap-dancing on a table. “Who guards gifts with pirates?” Molly asked. “Parents,” Bill replied, “with optimism.” A tiny voice from inside shouted, “Cake attack!” and the room erupted in frosting, applause, and panic. Molly cracked her knuckles. “Perfect,” she said. “On my signal, we mingle.” Bill nodded. “And if anyone asks, we’re the clowns?” “Absolutely not,” Molly said. “Clowns are dishonest by profession.” They slipped masks over their faces and marched toward the side door, carrying a folding dolly for the presents. Inside, the birthday child was laughing so hard that no one noticed two professional thieves politely tidying the gift table into a very large, very suspicious pile. “After you,” whispered Bill.
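One way a dialogue-ratio evaluator could count dialogue words in a response like this is sketched below. It assumes that dialogue is any text enclosed in curly double quotes and that words are whitespace-delimited tokens; the benchmark's actual tokenization is not shown on this page, and the helper names are hypothetical.

```python
import re

# Assumption: dialogue is the text between curly double quotes (U+201C, U+201D).
DIALOGUE_RE = re.compile("\u201c([^\u201d]*)\u201d")


def count_words(text: str) -> int:
    """Count whitespace-delimited tokens; a hyphenated word counts once."""
    return len(text.split())


def dialogue_word_count(text: str) -> int:
    """Sum the word counts of every quoted span in the text."""
    return sum(count_words(span) for span in DIALOGUE_RE.findall(text))


# A short excerpt from the response above.
sample = "Molly grinned. \u201cChaos is just commerce with balloons.\u201d"
print(dialogue_word_count(sample), count_words(sample))  # 6 8
```

Different tokenization choices, such as whether a hyphenated word counts as one word or several, change the totals and therefore the ratio.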
created_at: "2026-03-18T10:43:58.449Z" (3/18/2026, 10:43:58 AM)
model: "openai/gpt-5.4-mini-20260317"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 8518
moderation_latency: (null)
generation_time: 8862
tokens_prompt: 77
tokens_completion: 697
native_tokens_prompt: 66
native_tokens_completion: 1061
native_tokens_completion_images: 0
native_tokens_reasoning: 777
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.004824
router: (null)
provider_responses
0
endpoint_id: "9ee065b2-3d1c-43bc-bdd7-28af3b148282"
id: "resp_04d8460088c60a140169ba81ee855081968c8340f0ff15f6a4"
is_byok: false
latency: 237
model_permaslug: "openai/gpt-5.4-mini-20260317"
provider_name: "OpenAI"
status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1773830638-NGTkr1QxH268P0X0F2z2"
upstream_id: "resp_04d8460088c60a140169ba81ee855081968c8340f0ff15f6a4"
total_cost: 0.004824
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
Evaluation details
Result | Evaluator | Details | Meta Data
100.00% | Matches word count | n/a | neededClean: false, words: 200
20.94% | Dialogue to Total Word Ratio | Ratio: 21.18%, Deviation: 11.18% | neededClean: false, wordsTotal: 203, wordsDialogue: 43
60.4693% | Overall | |
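The ratio and deviation reported for the dialogue evaluator can be reproduced from the metadata counts. The sketch below assumes the deviation is the absolute difference between the measured ratio and the 10% target from the scenario; that definition matches the figures shown but is not stated on the page. How the 20.94% score is then derived from the deviation is not shown, so the sketch stops at the deviation.

```python
TARGET_RATIO = 0.10  # from the scenario: "Write 200 words with 10% dialogue"


def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Fraction of the total words that are dialogue."""
    return words_dialogue / words_total


# Counts taken from the evaluator metadata: wordsDialogue and wordsTotal.
ratio = dialogue_ratio(43, 203)

# Assumption: deviation is the absolute distance from the target ratio.
deviation = abs(ratio - TARGET_RATIO)

print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# Ratio: 21.18%, Deviation: 11.18%
```

The two evaluators report different totals for the same response (200 words versus 203), which suggests they tokenize the text differently.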