NC Bench
Run: 2026-04-24-01KQ0N8P98P37F4CZTZRC5T42N
Run Details
Model: GPT-5.5 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 27.8s
Cost: $0.035460
Provider: OpenAI
Evaluation summary
Score      Evaluator
0.00%      Matches word count
63.18%     Dialogue to Total Word Ratio
31.5914%   Overall
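The Overall figure is consistent with an unweighted mean of the two evaluator scores. The page does not state the aggregation rule, so the sketch below treats simple averaging as an assumption, and `overall_score` is a hypothetical helper name used only for illustration.

```python
from statistics import mean


def overall_score(scores: list[float]) -> float:
    """Unweighted mean of evaluator scores in percent (assumed aggregation rule)."""
    return mean(scores)


# Evaluator scores as displayed in the summary, rounded to two decimals.
scores = [0.00, 63.18]
overall = overall_score(scores)
print(f"{overall:.2f}%")
```

With the rounded inputs this agrees with the displayed 31.5914% to two decimal places; the extra digits on the page suggest the dialogue score is stored with more precision than is shown.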
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the crayon-drawn floor plan across the café table and pinned one corner with a sugar shaker. Bill leaned in, squinting at the wobbling rectangle labeled BIG HOUSE!!! in purple.
“Why does the chimney have a smiley face?” Bill asked.
“That’s not a chimney. That’s the magician.”
“The magician is part of the architecture?”
“He’s tall, stationary, and wearing bricks on his vest. Close enough.”
Bill tapped a cluster of stars. “And these are the presents?”
“No, those are crackers. The presents are here, under the table marked TREASURE MOUNTAIN.”
“It says TREASURE MOO-TIN.”
“The informant is six, Bill. His penmanship is developing.”
Their informant, a small boy in a dinosaur jumper, sat two tables away, paid in biscuits and secrecy. He waved. Molly waved back solemnly.
“I still say this is bad business,” Bill said. “Stealing from a bank, that’s traditional. Stealing from a museum, classy. Stealing from children? That’s how you get haunted by sticky fingerprints.”
“We are not stealing from children. We are reallocating unearned assets.”
“They’re birthday presents.”
“Exactly. Unearned.”
Bill frowned. “What’s the objective?”
“All presents removed before cake. Clean exit through the back gate. No witnesses except possibly a rabbit in a waistcoat.”
“That’s the entertainer.”
“Then definitely no witnesses. Entertainers never notice anything. They’re thinking about invoice dates.”
Bill pointed at the garden. “What about the bouncy castle?”
“A distraction.”
“For them or for us?”
“For everyone. Nobody can resist a bouncy castle.”
“You bounced last time and bit your own knee.”
“A temporary operational setback.”
Bill sat back. “Molly, have you considered the presents may contain socks, plastic recorders, and books about sharing?”
“That’s why we steal all of them. Volume defeats disappointment.”
“What’s our disguise?”
Molly opened a tote bag and pulled out two paper hats shaped like cheerful bees.
“No.”
“Yes.”
“I refuse to be a bee burglar.”
“You’re not a bee burglar. You’re Buzzington the Party Helper.”
“Buzzington has priors.”
“Buzzington has charm.”
Bill placed the hat on his head. The antennae drooped over his eyes. “I look like a wasp that’s had bad news.”
“You look approachable.”
“I look arrestable.”
Molly ignored him and checked her watch. “Party begins at two. Arrival window: two-oh-seven, when parents are confused, children are sugared, and everyone assumes any adult holding napkins belongs there.”
“What do we hold?”
“Napkins.”
“That’s your entire infiltration plan?”
“It’s elegant.”
“It’s stationery.”
“Napkins are not stationery.”
“They’re paper.”
“So are banknotes, and you respect those.”
Bill sighed. “And if someone asks who invited us?”
“We say, ‘Deborah.’ There is always a Deborah.”
“What if there isn’t?”
“Then we look wounded and say, ‘She’s going by Debbie now?’”
Bill considered this. “That is disturbingly plausible.”
The café door jingled. A woman entered carrying a towering cake shaped like a pirate ship, complete with frosting cannons and a chocolate plank.
Molly’s eyes widened. “Target upgrade,” she whispered.
“No,” Bill said immediately.
“That cake has rigging.”
“No.”
“It has edible cannons.”
“Molly.”
“It’s practically begging to be liberated.”
Bill leaned close. “Listen to me. Presents are one thing. Cake is sacred ground. You cross cake, there’s no coming back.”
Molly folded the map with the gravity of a general postponing war. “Fine. Presents only.”
“Good.”
“And possibly party bags.”
“No party bags.”
“One party bag.”
“No.”
“For research.”
Bill removed the bee hat and pushed it toward her. “New plan. We go in, we look around, and if one child cries, we abandon.”
“One child will definitely cry. It’s a children’s party.”
“Then we abandon immediately.”
Molly glared. “You’ve gone soft.”
“I’ve gone sensible. There’s a difference.”
Across the café, the dinosaur informant waddled over and placed a biscuit crumb on the map.
“What’s this?” Molly asked.
“Trap,” he said.
Bill blinked. “The children know?”
The boy nodded. “We got water balloons.”
Molly looked at Bill. Bill looked at the bee hats.
“Strategic retreat?” he said.
Molly snatched up the sugar shaker. “Strategic brunch.”
created_at: "2026-04-24T21:10:23.793Z" (4/24/2026, 9:10:23 PM)
model: "openai/gpt-5.5-20260423"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 6102
moderation_latency: (null)
generation_time: 27717
tokens_prompt: 77
tokens_completion: 1320
native_tokens_prompt: 66
native_tokens_completion: 1171
native_tokens_completion_images: 0
native_tokens_reasoning: 217
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.03546
router: (null)
provider_responses:
  0:
    endpoint_id: "58e5b336-423e-430b-a2ab-8bc353f0c51b"
    id: "resp_0b5a5640007f7ae20169ebdc3fe4188197a48aec8c849228fe"
    is_byok: false
    latency: 316
    model_permaslug: "openai/gpt-5.5-20260423"
    provider_name: "OpenAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777065023-iSpeg72BWbXYP7SeHaHj"
session_id: (null)
api_type: "completions"
id: "gen-1777065023-1xWIBolAmpkTfKbluwi8"
upstream_id: "resp_0b5a5640007f7ae20169ebdc3fe4188197a48aec8c849228fe"
total_cost: 0.03546
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
response_cache_source_id: (null)
Evaluation details
Result     Evaluator                      Details                           Meta Data
0.00%      Matches word count             n/a                               neededClean: false, words: 656
63.18%     Dialogue to Total Word Ratio   Ratio: 66.97%, Deviation: 3.03%   neededClean: false, wordsTotal: 660, wordsDialogue: 442
31.5914%   Overall
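The reported ratio and deviation follow directly from the word counts in the metadata: 442 dialogue words out of 660 total, measured against the 70% target in the scenario. The sketch below reproduces that arithmetic and shows one plausible way to count dialogue words. The counting rule (text inside curly double quotes, whitespace-separated words) and the name `dialogue_ratio` are assumptions; the evaluator's actual tokenisation is not shown on the page, and neither is the way a 3.03% deviation is mapped to the 63.18% score, so that step is left out.

```python
import re

TARGET_RATIO = 70.0  # percent, from the scenario "70% dialogue"


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio_percent).

    Assumption: dialogue is any text between curly double quotes, and
    words are whitespace-separated tokens. The real evaluator may differ.
    """
    total = len(text.split())
    spans = re.findall(r"“([^”]*)”", text)
    dialogue = sum(len(span.split()) for span in spans)
    ratio = 100.0 * dialogue / total if total else 0.0
    return dialogue, total, ratio


# Reproduce the reported figures from the metadata counts.
reported_ratio = 100.0 * 442 / 660
deviation = abs(TARGET_RATIO - reported_ratio)
print(f"Ratio: {reported_ratio:.2f}%, Deviation: {deviation:.2f}%")
```

Note that the two evaluators report slightly different totals (656 words versus 660), which suggests they do not share a word-counting routine; a single whitespace split like the one above would not reproduce both.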