NC Bench
Run: 2026-03-09-01KK9PV567BEZGZ4G51VM9PHWR
Run Details
Model: GPT-5.4 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 33.3s
Cost: $0.018975
Provider: OpenAI
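The reported cost can be sanity-checked against the native token counts listed in the run metadata (66 prompt tokens, 1254 completion tokens). The sketch below is only an illustration: the per-million-token rates are assumptions that do not appear anywhere on this page, chosen because they happen to reproduce the reported total, and `estimate_cost` is a hypothetical helper, not part of NC Bench or any provider API.

```python
# Hypothetical rates in USD per million tokens. These are NOT stated in the
# source; they are assumed values that reproduce the reported cost.
ASSUMED_INPUT_USD_PER_M = 2.50
ASSUMED_OUTPUT_USD_PER_M = 15.00


def estimate_cost(prompt_tokens: int, completion_tokens: int,
                  input_rate: float, output_rate: float) -> float:
    """Estimate request cost in USD from token counts and per-million rates."""
    return (prompt_tokens * input_rate + completion_tokens * output_rate) / 1_000_000


# Token counts taken from native_tokens_prompt and native_tokens_completion.
estimated = estimate_cost(66, 1254, ASSUMED_INPUT_USD_PER_M, ASSUMED_OUTPUT_USD_PER_M)
reported = 0.018975  # the Cost / total_cost value shown for this run

print(f"estimated: ${estimated:.6f}, reported: ${reported:.6f}")
```

If the assumed rates are wrong, the estimate will not match. The agreement here shows only that the reported figure is consistent with billing on native token counts at those rates.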
Evaluation summary
Score | Evaluator
0.00% | Matches word count
42.10% | Dialogue to Total Word Ratio
21.0486% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
“Bill, if you say ‘Operation Birthday Bonanza’ one more time, I’m leaving.” “It’s a working title, Molly. Every serious enterprise needs a name.” “It’s stealing balloons and gift bags from a six-year-old named Oliver.” “Allegedly named Oliver. Could be an Oscar. Wealthy names often begin with O.” Molly peered through the hedge at the garden beyond. “That’s not how wealth works.” Bill adjusted the tiny paper party hat he’d insisted was a disguise. “You don’t know these circles.” “What circles? The swing-set aristocracy?” “Mock me now, but when we’re rich in board games and novelty socks, you’ll apologize.” “I’m not fencing a beginner’s chemistry set, Bill.” “You always say that before a job.” “No, I specifically say that before jobs involving children, clowns, or face paint. This one has all three.” From the garden came a shriek of delight, followed by the thud of a badly struck piñata. Bill rubbed his hands. “Chaos. Nature’s curtain.” “Listen carefully,” Molly said. “We are not villains. We are opportunists in a morally confusing cardigan.” “I wore the cardigan for authority.” “You wore the cardigan because you thought it made you look like a magician.” “It does.” “It makes you look like a substitute geography teacher who lies about scuba diving.” Bill considered this. “That’s still a man with access.” Molly sighed and unfolded a hand-drawn map. “Fine. The presents are under the gazebo. Cake table here. Parents clustered near the prosecco and those little sausage rolls. Entertainment: one clown of uncertain emotional stability.” “He looked at me once and I felt judged.” “Everyone judges you, Bill. Focus.” “So we go in as what? Caterers? Uncles? Divorced godparents?” “We go in as people who belong there. Smile, carry something, nod sadly at the clown.” “I can nod sadly. I’ve had practice.” Molly produced two gift bags. “Props.” Bill blinked. “We’re bringing presents to steal presents? 
That feels circular.” “It’s camouflage.” “What’s in them?” “Bricks.” Bill stared at her. “Elegant.” “Thank you.” A little girl in a plastic tiara raced past the hedge, then raced back because she had forgotten to roar. Bill lowered his voice. “Children move like drunk pigeons.” “That’s why timing matters. When the cake comes out, everybody sings. During the second verse, we lift the pile, swap in the brick bags, and stroll off.” “There’s a second verse?” “There is when grandparents attend. I checked the guest list on the invitation board.” “You frighten me when you’re competent.” “You frighten me when you’re breathing through your mouth.” Bill straightened. “One concern.” “Only one?” “What if the presents are rubbish? Educational puzzles. Knitwear. A flute.” “Then we take only the premium items.” “How do you identify premium under wrapping paper?” “You shake them.” “In public?” “With confidence.” He nodded slowly. “Confidence has carried me through many things I didn’t understand.” “Such as buttons.” “Needlessly complicated little discs.” A burst of applause rose from the garden. Someone yelled for candles. Molly folded the map. “This is our moment.” Bill squared his shoulders. “For glory.” “For resale.” “For moderate resale.” They slipped through the side gate. A parent glanced at them. “Smile,” Molly hissed. Bill smiled so hard he looked venomous. The parent frowned. “Are you with the magician?” Bill replied at once, “Emotionally, no.” Molly stepped in. “Late presents for Oliver.” “Oscar,” said the parent. Bill whispered, “Told you. O.” They reached the gazebo. Mountains of bright paper and ribbons waited there like treasure in a sugar-fueled kingdom. “Beautiful,” Bill breathed. “Take the blue one, the dinosaur one, and anything heavy.” “Heavy could be books.” “Heavy could be electronics.” “Heavy could also be rocks from a weird aunt.” “Every family has one.” They worked fast, swapping parcels with brick-filled bags. 
Then a small voice said, “Why are you stealing my presents?” Molly and Bill froze. A boy in a paper crown stood behind them, holding a juice box like a sheriff’s badge. Bill recovered first. “Counterquestion: how do you feel about profit-sharing?” Molly pinched the bridge of her nose. “Bill.” The boy squinted. “Are you pirates?” Bill grinned. “Land pirates.” “Cool,” said the boy. “Take Chloe’s too. She bit me.” Molly blinked. “That’s… unexpectedly helpful.” From the lawn came the first notes of Happy Birthday. Bill hoisted an armful of gifts. “Second verse, Molly.” She grabbed the rest and smirked. “Nature’s curtain.”
created_at: "2026-03-09T16:28:20.05Z" (3/9/2026, 4:28:20 PM)
model: "openai/gpt-5.4-20260305"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 4734
moderation_latency: (null)
generation_time: 33177
tokens_prompt: 77
tokens_completion: 1397
native_tokens_prompt: 66
native_tokens_completion: 1254
native_tokens_completion_images: 0
native_tokens_reasoning: 227
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.018975
router: (null)
provider_responses:
  0:
    endpoint_id: "9ff5625c-403f-4d7f-b895-58ac7295062c"
    id: "resp_00417bebefd3d82a0169aef52423cc8193ae4cbb8aa21493a4"
    is_byok: false
    latency: 178
    model_permaslug: "openai/gpt-5.4-20260305"
    provider_name: "OpenAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1773073700-XWPZFWZzM68PyQ2Mq84y"
upstream_id: "resp_00417bebefd3d82a0169aef52423cc8193ae4cbb8aa21493a4"
total_cost: 0.018975
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 719
42.10% | Dialogue to Total Word Ratio | Ratio: 65.84%, Deviation: 4.16% | neededClean: false, wordsTotal: 726, wordsDialogue: 478
Overall: 21.0486%
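The ratio and deviation figures follow directly from the reported word counts, and the overall score is consistent with a plain average of the two evaluator scores. The sketch below reworks that arithmetic. Treating the overall score as an unweighted mean is an assumption that matches the displayed numbers to within rounding, and the page does not say how the 42.10% score is derived from the deviation, so that step is not reproduced. The function name is hypothetical.

```python
TARGET_RATIO = 0.70  # the scenario asks for 70% dialogue


def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Fraction of all words that fall inside dialogue."""
    return words_dialogue / words_total


# Word counts reported by the Dialogue to Total Word Ratio evaluator.
ratio = dialogue_ratio(478, 726)
deviation = abs(TARGET_RATIO - ratio)

# Evaluator scores as displayed on the page, in percent.
scores = [0.00, 42.10]
# Assumption: the overall score is the unweighted mean of the evaluator scores.
overall = sum(scores) / len(scores)

print(f"ratio: {ratio:.2%}")
print(f"deviation: {deviation:.2%}")
print(f"overall (mean of displayed scores): {overall:.2f}%")
```

The two evaluators report different totals (719 words versus 726), so they presumably tokenize the response differently. The response is well above the requested 500 words on either count, which is consistent with the 0.00% word-count score.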