NC Bench
Run: 2026-03-06-01KK1HHTJ3PH0WJF825HJWKKDA
Run Details
Model: GPT-5.4
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 30.7s
Cost: $0.014910
Provider: OpenAI
Evaluation summary
Score     Evaluator
0.00%     Matches word count
1.43%     Dialogue to Total Word Ratio
0.7139%   Overall
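The page does not state how the overall score is aggregated. The reported figure is consistent, within display rounding, with an unweighted mean of the two evaluator scores. The sketch below checks that arithmetic; the aggregation rule and the names used are assumptions for illustration, not NC Bench's documented method.

```python
from statistics import mean

# Scores as displayed in the evaluation summary (percent, already rounded).
evaluator_scores = {
    "Matches word count": 0.00,
    "Dialogue to Total Word Ratio": 1.43,
}
reported_overall = 0.7139

# Assumption: overall = unweighted mean of the evaluator scores.
estimated_overall = mean(list(evaluator_scores.values()))

# The displayed 1.43% is rounded to two decimals, so the mean of two
# scores can be off by up to half of that rounding error.
rounding_slack = 0.0025
is_consistent = abs(estimated_overall - reported_overall) <= rounding_slack

print(f"estimated {estimated_overall:.4f}% vs reported {reported_overall}%")
print("consistent with an unweighted mean:", is_consistent)
```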
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
“Molly, tell me again why we’re robbing a six-year-old,” Bill whispered, flattening himself behind a hedge shaped like a dolphin. “We are not robbing a six-year-old,” Molly said. “We are intercepting a concentrated bundle of consumer goods temporarily gathered by poor parental judgment.” “That sounds like robbing a six-year-old in a thesaurus.” “It sounds classier, which is important. Crime without diction is just panic.” Bill peered over the hedge. “There are balloons. I mistrust balloons. They always look like they know who did it.” “Focus. Garden gate at three o’clock, trampoline at twelve, cake table near the conservatory. Presents are stacked under that banner.” “The one that says HAPPY BIRTHDAY, OLLIE! in letters the size of guilt?” “The very one.” Bill adjusted his black gloves. “I still think black is a little dramatic for daylight theft.” “You said you wanted to feel professional.” “I wanted to feel employed. This feels more like being hunted by clowns.” From the garden came a shriek of children and the thump of tiny feet. Molly checked her watch. “Entertainer’s due in two minutes. Once the magician starts, all eyes front.” “A magician at a children’s party,” Bill muttered. “That’s like bringing a violinist to a knife fight.” “It’s ideal. Smoke, applause, confusion. We slip in, take the presents, slip out.” “In pillowcases.” “In pillowcases,” Molly confirmed. “Classic, unthreatening, washable.” Bill looked offended. “I had a duffel bag.” “You had a duffel bag with SKULL DIVISION printed on it.” “It was discounted.” A car horn beeped outside. Molly smiled. “Showtime.” A man in a purple tailcoat stumbled through the gate carrying a case of props. Children swarmed him. Bill blinked. “He’s tiny.” “He’s efficient.” “He’s being kicked by a pirate.” “Use the distraction.” They ducked through the side gate and crouched behind a table of paper cups. A bowl of fluorescent punch trembled beside Bill’s elbow. “Why is the drink blue?” he whispered. 
“Because childhood is a dare.” A little girl raced past wearing fairy wings. Bill froze. “Did she see me?” “She’s six and covered in glitter. She can’t even see herself.” The magician boomed, “Who wants to see a rabbit?” Twenty children screamed as if offered cash. “Now,” Molly hissed. They scuttled to the present pile: glossy boxes, giant bags, parcels wrapped in dinosaurs and rockets. Bill stared. “This is more loot than I’ve seen at Christmas.” “Parents compete. It’s ugly and useful. Open the sack.” He held up a pillowcase. “This one has ducks on it.” “So?” “I feel like I’m burgling on a sleepover.” “Bill.” “Right. Sorry. Duck sack ready.” Molly scooped presents with brisk efficiency. “Board game. Teddy. Suspiciously expensive plastic truck. Why would any child need a truck with headlights?” “So they can commit tiny crimes at night?” “Keep packing.” From the lawn the magician shouted, “For my next trick, I need a brave volunteer!” Bill winced. “If he picks me, I’m dead.” “He won’t pick you. You radiate substitute teacher.” A boy’s voice rang out nearby. “Mum says never trust a man with a cape!” “Smart child,” Bill said. “Less commentary, more theft.” They stuffed the final parcel away just as a shadow fell across them. Both turned. A woman in an apron stared, holding a tray of sausage rolls. Bill smiled weakly. “Catering?” Molly stepped forward. “Gift relocation.” The woman frowned. “What?” Thinking fast, Bill said, “It’s a party game. Teaches resource management.” A beat passed. Then, from the lawn, the magician yelled, “Madam, is this your Volvo? It appears to have produced a dove!” The woman gasped and ran. Bill exhaled. “I love that tiny man.” “Move.” They hurried for the gate, pillowcases bulging. Behind them children cheered, a rabbit escaped across the grass, and someone began crying in the pure, devastating way only the very young can cry. Bill glanced back. “Do you think Ollie will notice?” Molly shifted the sacks. 
“Bill, at that age they cry if a sandwich is triangular.” “That is true.” “At least this way,” she said, pushing open the gate, “he learns an important lesson.” “What lesson?” She grinned. “Never leave valuables under a banner.”
created_at: "2026-03-06T12:21:55.914Z" (3/6/2026, 12:21:55 PM)
model: "openai/gpt-5.4-20260305"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 366
moderation_latency: (null)
generation_time: 30697
tokens_prompt: 77
tokens_completion: 1267
native_tokens_prompt: 66
native_tokens_completion: 983
native_tokens_completion_images: 0
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.01491
router: (null)
provider_responses: 0
endpoint_id: "9ff5625c-403f-4d7f-b895-58ac7295062c"
id: "resp_0e1387d6d4d06e3d0169aac6e3f83481939e856357fc2d6a80"
is_byok: false
latency: 153
model_permaslug: "openai/gpt-5.4-20260305"
provider_name: "OpenAI"
status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1772799715-EWHXyankv9kzgUO3cLxH"
upstream_id: "resp_0e1387d6d4d06e3d0169aac6e3f83481939e856357fc2d6a80"
total_cost: 0.01491
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
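The metadata fields can be cross-checked against the run details: `generation_time` of 30697 lines up with the 30.7s duration if it is read as milliseconds. The sketch below derives the duration, an approximate completion throughput, and a blended cost per 1,000 native tokens. Reading `generation_time` as milliseconds is an assumption, and the variable names are illustrative only.

```python
# Values copied from the run metadata.
generation_time_ms = 30697        # assumed to be milliseconds
native_tokens_prompt = 66
native_tokens_completion = 983
total_cost_usd = 0.01491

# Duration in seconds, for comparison with the "Duration" field.
duration_s = generation_time_ms / 1000

# Completion throughput over the whole generation window.
tokens_per_second = native_tokens_completion / duration_s

# Blended cost across prompt and completion tokens; the page does not
# break the cost down by token type, so this is only an average.
native_tokens_total = native_tokens_prompt + native_tokens_completion
cost_per_1k_tokens = total_cost_usd / native_tokens_total * 1000

print(f"duration: {duration_s:.1f}s")
print(f"throughput: {tokens_per_second:.2f} completion tokens/s")
print(f"blended cost: ${cost_per_1k_tokens:.5f} per 1k native tokens")
```

The throughput figure includes any time spent before the first token, so it likely understates the steady-state streaming rate.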
Evaluation details
Result    Evaluator                      Details                           Meta Data
0.00%     Matches word count             n/a                               neededClean: false; words: 685
1.43%     Dialogue to Total Word Ratio   Ratio: 60.78%, Deviation: 9.22%   neededClean: false; wordsTotal: 691; wordsDialogue: 420
0.7139%   Overall
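The reported ratio and deviation follow directly from the word counts: 420 dialogue words out of 691 total, measured against the 70% target in the prompt. The sketch below reproduces that arithmetic and shows one plausible way to count dialogue words. The evaluator's actual tokenizer and quote handling are not documented on this page, so `count_words` and `dialogue_words` are hypothetical helpers that assume whitespace-delimited words and dialogue enclosed in curly double quotes.

```python
import re


def count_words(text: str) -> int:
    # Assumption: a word is any whitespace-delimited token.
    return len(text.split())


def dialogue_words(text: str) -> int:
    # Assumption: dialogue is the text between curly double quotes.
    spans = re.findall(r"“([^”]*)”", text)
    return sum(count_words(span) for span in spans)


def ratio_and_deviation(words_dialogue: int, words_total: int,
                        target_pct: float = 70.0) -> tuple[float, float]:
    # Dialogue share of all words, and its absolute distance from the target.
    ratio_pct = 100.0 * words_dialogue / words_total
    return ratio_pct, abs(target_pct - ratio_pct)


# Counts reported in the evaluation details.
ratio, deviation = ratio_and_deviation(words_dialogue=420, words_total=691)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
# prints "Ratio: 60.78%, Deviation: 9.22%"

# Word-count check: 685 words reported against the 500 requested.
words_over = 685 - 500
print(f"over target by {words_over} words ({100 * words_over / 500:.0f}%)")
# prints "over target by 185 words (37%)"
```

The two evaluators report slightly different totals (685 and 691), which suggests they tokenize the response differently, so a simple whitespace split may not reproduce either count exactly.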