NC Bench
Run: 2026-04-24-01KQ0NAM0VKTEV4QGXQYA6DGPZ
Run Details
Model: GPT-5.5 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 35.6s
Cost: $0.043140
Provider: OpenAI
Evaluation summary
Score      Evaluator
0.00%      Matches word count
39.67%     Dialogue to Total Word Ratio
19.8354%   Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread a napkin map across the café table, weighting the corners with two teaspoons, a sugar bowl, and Bill’s elbow. “Move your elbow,” she said. “It’s load-bearing,” Bill said. “It’s in the jellybean quadrant.” “There’s a jellybean quadrant?” “There is if you’d stop defending it with your sleeve. Now, the children’s party begins at two. Presents on the left. Cake on the right. Chaos everywhere.” Bill peered down. “Why is there a dragon?” “That’s the magician.” “You’ve drawn him breathing fire.” “He overcharges.” Bill nodded gravely. “Then he deserves justice.” “We’re not there for justice. We’re there for presents.” “Right. Tiny bicycles. Stuffed giraffes. Puzzle boxes with all the pieces still inside.” “Exactly.” Bill tapped the napkin. “And how do we enter?” Molly smiled. “Disguises.” “No clowns.” “I wasn’t going to say clowns.” “You were thinking clowns.” “I was thinking party entertainers.” “That is clown-adjacent.” “You’re afraid of clowns.” “I respect their unpredictability.” Molly sighed. “Fine. We go as responsible adults.” Bill stared at her. “That’s your least believable disguise yet.” “I own a cardigan.” “You stole a cardigan.” “It suits me.” “It has ducks on it.” “Responsible ducks.” A waiter passed. Molly folded the napkin with exaggerated innocence until he was gone. Bill whispered, “What about the children?” “What about them?” “They’re loud.” “That’s why they’re useful. Nobody notices anything during musical chairs.” “I notice musical chairs. I hate musical chairs. It’s a game of scarcity dressed up with a jaunty tune.” “You can wait outside.” “And miss the presents? Molly, I have professional pride.” “You once got trapped in a revolving door.” “It revolved aggressively.” Molly leaned closer. “We arrive during the singing. Everyone looks at the cake. We look at the presents.” “What if someone asks who we are?” “We say we’re cousins.” “Whose cousins?” “Children have thousands of cousins. No one checks.” Bill frowned. “Children check everything. 
Last week a six-year-old told me my moustache looked nervous.” “Your moustache is nervous.” “It’s under pressure.” From the street came the muffled pop of a balloon. Bill flinched and knocked over the sugar bowl. Molly gave him a long look. “You’ll need to be braver than that.” “I’m plenty brave. I once ate gas station sushi.” “That was not bravery. That was poor time management.” He picked sugar crystals from the map. “What if the presents are rubbish? Socks. Educational toys. A wooden duck that teaches fractions.” “Then we improvise.” “With what?” “Cake.” “We’re stealing cake now?” “We’re thieves, Bill. We contain multitudes.” He looked through the café window. Across the road, a banner fluttered outside the community hall: HAPPY BIRTHDAY, MAISIE! Beneath it, a line of parents carried wrapped parcels inside. Bill’s expression softened. “That one has dinosaurs on the paper.” “So?” “So maybe there’s a dinosaur inside.” “It’s probably pajamas.” “Dinosaur pajamas,” he said, wounded by the beauty of it. Molly followed his gaze. A little girl in a yellow dress skipped up the steps, clutching a handmade card almost as big as her torso. She had glitter in her hair and absolute faith in the world. Bill cleared his throat. “Molly.” “No.” “I haven’t said anything.” “You’re about to become sentimental.” “I’m not sentimental. I’m ethically inconvenienced.” Molly looked back at the napkin map. The dragon suddenly seemed ridiculous. The jellybean quadrant seemed worse. “She’ll cry,” Bill said. “Children cry for everything.” “She’ll cry because two grown criminals nicked her dinosaur pajamas.” “You don’t know they’re dinosaur pajamas.” “I know in my heart.” Molly rubbed her temples. “We need rent.” “We also need souls. Mine’s secondhand, but it’s still technically operational.” There was a pause. Molly snatched up the napkin and crumpled it. “Fine.” “Fine?” “We don’t steal the presents.” Bill brightened. “Excellent. Moral victory. 
Shall we steal the magician’s wallet?” “No.” “His hat?” “No.” “His rabbit?” “We are not stealing a rabbit.” “Borrowing?” “Bill.” He held up both hands. “Right. New plan. We attend politely, eat one slice of cake each, and leave.” “We weren’t invited.” Bill grinned. “Then we don’t attend. We stand outside and look mysterious.” Molly rose, tugging her duck cardigan straight. “No. We buy a present.” “With what money?” She reached into her pocket and produced a small silver watch. Bill gasped. “Molly.” “What?” “Is that mine?” “It was yours.” He sighed. “What are we buying?” Molly watched the little girl vanish into the hall. “Something with dinosaurs.” “And pajamas?” “If the universe is kind.” Bill put on his hat. “Molly, today we are not thieves.” “Don’t get used to it.” “Of course not.” She smiled despite herself. He opened the café door with a flourish. “Tomorrow, perhaps, we rob someone awful.” “Tomorrow,” Molly said, “we discuss your definition of awful.” “Magicians,” Bill said. “Bill.” “Fine. Mimes.” “Better.”
created_at: "2026-04-24T21:11:27.016Z" (4/24/2026, 9:11:27 PM)
model: "openai/gpt-5.5-20260423"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 8688
moderation_latency: (null)
generation_time: 35525
tokens_prompt: 77
tokens_completion: 1761
native_tokens_prompt: 66
native_tokens_completion: 1427
native_tokens_completion_images: 0
native_tokens_reasoning: 266
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.04314
router: (null)
provider_responses:
  0:
    endpoint_id: "58e5b336-423e-430b-a2ab-8bc353f0c51b"
    id: "resp_04a9814af182de720169ebdc7f1e8881959416b3b2447fa8e3"
    is_byok: false
    latency: 458
    model_permaslug: "openai/gpt-5.5-20260423"
    provider_name: "OpenAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777065087-Yh9mBiZhDaZlGA5DjOTw"
session_id: (null)
api_type: "completions"
id: "gen-1777065087-jfwhSjhJll3K8T4JSvak"
upstream_id: "resp_04a9814af182de720169ebdc7f1e8881959416b3b2447fa8e3"
total_cost: 0.04314
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
response_cache_source_id: (null)
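As a rough reading aid for the metadata above, the sketch below derives an output throughput figure from two of the reported fields. It assumes that generation_time is expressed in milliseconds (which is consistent with the 35.6s Duration in the run details, but is not stated on the page) and that native_tokens_completion is the appropriate token count to divide by. The variable names are illustrative only.

```python
# Values copied from the run metadata above.
generation_time_ms = 35525        # assumption: unit is milliseconds
native_tokens_completion = 1427   # provider-native completion tokens

# Convert to seconds, then compute completion tokens per second.
generation_time_s = generation_time_ms / 1000
tokens_per_second = native_tokens_completion / generation_time_s

print(f"Generation time: {generation_time_s:.3f}s")
print(f"Throughput: {tokens_per_second:.1f} tokens/s")
```

Under these assumptions the run works out to roughly 40 completion tokens per second.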
Evaluation details
Result: 0.00%
Evaluator: Matches word count
Details: n/a
Meta Data: neededClean: false, words: 789

Result: 39.67%
Evaluator: Dialogue to Total Word Ratio
Details: Ratio: 65.70%, Deviation: 4.30%
Meta Data: neededClean: false, wordsTotal: 793, wordsDialogue: 521

Overall: 19.8354%
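The reported ratio and deviation can be reproduced from the meta data with a few lines of arithmetic. The sketch below assumes the ratio is wordsDialogue divided by wordsTotal and that the deviation is the absolute distance from the 70% target in the scenario; both assumptions match the figures shown. It also assumes the overall score is the plain mean of the two evaluator scores, which agrees with 19.8354% up to rounding of the displayed inputs. How the 39.67% score is derived from the deviation is not shown on the page, so it is taken as given here.

```python
# Values copied from the evaluation details above.
words_dialogue = 521
words_total = 793
target = 0.70  # 70% dialogue, from the scenario description

# Assumption: ratio is dialogue words over total words.
ratio = words_dialogue / words_total
# Assumption: deviation is the absolute distance from the target.
dev = abs(ratio - target)
print(f"Ratio: {ratio:.2%}, Deviation: {dev:.2%}")

# Assumption: the overall score is the unweighted mean of the evaluator scores.
scores = [0.00, 39.67]  # percentages as displayed (already rounded)
overall = sum(scores) / len(scores)
print(f"Overall (from rounded scores): {overall:.3f}%")
```

The small gap between the mean of the displayed scores and the reported 19.8354% would be explained by the page rounding the 39.67% figure before display.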