NC Bench
Run: 2026-03-18-01KM09FPQH47YY1GWHJS95BH21
Run Details
Model: GPT-5.4 Mini
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 4.4s
Cost: $0.003393
Provider: OpenAI
Evaluation summary
Score      Evaluator                      Details Link
0.00%      Matches word count             (details)
99.62%     Dialogue to Total Word Ratio   (details)
49.8096%   Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
“Tell me again why we’re robbing a children’s party,” Molly whispered, peering through the hedge at the sugar-frosted house. “Because,” Bill said, tugging at his black gloves, “every rich parent in the suburbs thinks wrapping paper excuses negligence. Presents everywhere. No security. No one suspects a clown with a clipboard.” Molly adjusted the fake balloon animal around her wrist. “I still think the clown suit was too much.” “It’s not a clown suit. It’s ‘festive entertainment contractor.’” “It has red shoes.” “Festive.” The backyard glowed with paper lanterns, helium balloons, and the kind of cheerful music that made burglary feel morally confusing. Children squealed near a table stacked with cupcakes, while adults clustered around a folding table of punch and panic. On the lawn sat a mountain of wrapped gifts, labeled in glitter pen and arranged like a treasure hoard with better manners. Molly’s eyes narrowed. “That one under the silver bow looks expensive.” Bill followed her gaze. “Probably a robot dog. Or an educational chemistry set. Either way, black market gold.” “The whole plan still sounds daft.” Bill gave her a look. “You’re the one who said you wanted a smaller footprint.” “I meant in the getaway vehicle, not in the target selection.” “Too late now. We’ve committed to the bit.” A toddler waddled past, carrying a plastic sword nearly longer than his body. Molly crouched lower behind the hedge as the child paused, frowned at the bushes, then shouted, “Mum! The leaves are suspicious!” Bill hissed, “You see? Even the toddlers are onto us.” “The toddlers are always onto us.” “Not professionally.” Inside, someone began singing an off-key rendition of Happy Birthday. The adults joined in with the strained enthusiasm of hostages at a team-building retreat. Bill checked his watch. “The cake is the signal.” “For what?” “For us to move.” “Why the cake?” “Because when people are eating cake, they’re not watching the gifts.” Molly stared at him. 
“Did you make that up just now?” “Yes.” “It’s not terrible.” “I know. That’s why I’m in charge.” She snorted despite herself. “And your brilliant method for getting in?” “We arrive through the side gate as entertainment. You’re a balloon artist. I’m the magician.” “You can’t do magic.” “I can disappear awkwardly behind a curtain.” “That’s not magic, Bill. That’s cowardice with stage lighting.” “Semantics.” The singing ended in a brief storm of applause. Parents reached for plates. Children launched into an unpredictable migration pattern across the lawn. Bill straightened. “Now.” Molly grabbed his sleeve. “Wait. What if they make us do balloon animals?” “We are criminals, not monsters.” She let go. “Fine. But if I have to make a poodle, I’m taking it personally.” They slipped through the side gate, heads down, carrying a crate of “party supplies” that was mostly empty space and one very convincing fake invoice. Bill flashed a grin at the woman by the buffet. “Evening,” he said brightly. “We’re from the fun.” The woman looked him up and down. “You certainly are.” Molly muttered, “That was flirtation or suspicion.” “Same thing in this line of work,” Bill said, and with that they merged into the crowd of satin sashes, sticky fingers, and distracted parents, inching toward the gift pile like extremely underqualified magpies.
created_at: "2026-03-18T10:57:25.248Z" (3/18/2026, 10:57:25 AM)
model: "openai/gpt-5.4-mini-20260317"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 303
moderation_latency: (null)
generation_time: 4112
tokens_prompt: 77
tokens_completion: 982
native_tokens_prompt: 66
native_tokens_completion: 743
native_tokens_completion_images: 0
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.003393
router: (null)
provider_responses:
  0:
    endpoint_id: "9ee065b2-3d1c-43bc-bdd7-28af3b148282"
    id: "resp_05d8fe742de959410169ba85154ef48195983e0856c3e89033"
    is_byok: false
    latency: 191
    model_permaslug: "openai/gpt-5.4-mini-20260317"
    provider_name: "OpenAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1773831445-vdPc2OCzhvvrbXj4FT3Y"
upstream_id: "resp_05d8fe742de959410169ba85154ef48195983e0856c3e89033"
total_cost: 0.003393
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
Evaluation details
Result     Evaluator                      Details                           Meta Data
0.00%      Matches word count             n/a                               neededClean: false, words: 540
99.62%     Dialogue to Total Word Ratio   Ratio: 49.72%, Deviation: 0.28%   neededClean: false, wordsTotal: 543, wordsDialogue: 270
49.8096%   Overall
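The reported ratio and deviation can be reproduced from the metadata with a short calculation. The sketch below is illustrative only and is not the benchmark's evaluator: the function names are hypothetical, the quote-matching rule (text between curly double quotes counts as dialogue) and whitespace word splitting are assumptions, and the mapping from deviation to the 99.62% score is not stated on this page, so it is not modeled. Treating the overall score as the plain mean of the two evaluator scores is likewise an assumption, although it agrees with the reported 49.8096% to rounding.

```python
import re

# Assumption: dialogue is whatever sits between curly double quotes.
DIALOGUE_RE = re.compile("\u201c([^\u201d]*)\u201d")


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (an assumed word definition)."""
    return len(text.split())


def count_dialogue_words(text: str) -> int:
    """Sum the words found inside each quoted span."""
    return sum(count_words(span) for span in DIALOGUE_RE.findall(text))


def ratio_and_deviation(words_dialogue: int, words_total: int,
                        target: float = 0.5) -> tuple[float, float]:
    """Return (dialogue ratio, absolute deviation from the target ratio)."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


def mean_score(scores: list[float]) -> float:
    """Assumed aggregation: unweighted mean of the evaluator scores."""
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Values taken from the evaluation details above.
    ratio, deviation = ratio_and_deviation(270, 543)
    print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
    print(f"Mean of evaluator scores: {mean_score([0.0, 99.62]):.2f}%")
```

Note that the two evaluators report different totals (540 words versus 543), which suggests they tokenize the response slightly differently; a single whitespace split as used here would not necessarily match either count.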