NC Bench
Run: 2026-04-24-01KQ0P6HJ3JVH1S4B6KVZ8917M
Run Details
Model: GPT-5.5 (Reasoning, Low)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 33.5s
Cost: $0.036300
Provider: OpenAI
Evaluation summary
Score | Evaluator
0.00% | Matches word count
71.60% | Dialogue to Total Word Ratio
35.8000% | Overall
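The overall figure above is consistent with an unweighted mean of the two evaluator scores. That is an inference from the numbers on this page, not a documented NC Bench rule, so the sketch below is only a plausibility check. The function name `overall_score` is hypothetical.

```python
# Assumption: the overall score is the unweighted mean of the evaluator scores.
# This matches the figures reported for this run but is not confirmed by the page.
evaluator_scores = {
    "Matches word count": 0.00,
    "Dialogue to Total Word Ratio": 71.60,
}


def overall_score(scores: dict[str, float]) -> float:
    """Return the arithmetic mean of the per-evaluator scores (in percent)."""
    return sum(scores.values()) / len(scores)


overall = overall_score(evaluator_scores)
print(f"{overall:.4f}%")
```

Under that assumption, the 0.00% word-count score halves the overall result even though the dialogue ratio evaluator scored 71.60%.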
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly unfolded the crayon-smeared invitation on the café table as if it were a royal decree. Bill leaned over it, squinting past a smear of jam. “Princess Poppy’s seventh birthday,” Molly said. “Saturday. Three o’clock. Community hall. Presents piled on the left, cake on the right.” Bill nodded gravely. “Classic layout. Amateur hour.” “You said that about the butcher’s raffle.” “The butcher’s raffle had a balloon arch, Molly. That changes everything.” Molly tapped the invitation. “We go in as entertainers.” Bill’s face brightened. “Magicians?” “No.” “Clowns?” “Absolutely not. You have the emotional balance of a dropped pudding.” “I’ll have you know my clown work is respected in three counties.” “You made a toddler cry by juggling onions.” “He was moved by the artistry.” Molly sighed and pulled a notebook from her coat. “We need disguises, a distraction, and a getaway.” Bill sat up straighter. “I’ve got it. We arrive as pirates. Kids love pirates. We shout, ‘Avast!’ Everyone cheers. Then, while you lead them in a sea shanty, I liberate the parcels.” “Liberate?” “It sounds kinder than steal.” “We are stealing birthday presents from children, Bill. There is no kind version.” He frowned at the invitation. “Do we know what’s in them?” “Not exactly.” “So it could be socks.” “It could be remote-control cars, dolls, building sets, possibly cash in cards.” “Or socks.” “You’re very focused on socks.” “I’ve been hurt before.” Molly flipped a page. “The birthday girl’s mother is named Denise. She’s hired a face painter, a bouncy castle, and someone called Professor Pickle.” Bill paled. “Professor Pickle?” “You know him?” “Know him? The man once trapped me in a balloon giraffe.” “That was you?” “It was a misunderstanding involving a novelty trumpet and three unpaid invoices.” Molly tapped her pen against her teeth. “Fine. We avoid Professor Pickle. We enter during the chaos, blend in, collect the gift bags, and leave.” “Gift bags too?” “Of course.” Bill looked wounded. 
“Molly, there are lines.” “You once stole a bronze goose from a pensioners’ garden.” “That goose had attitude.” A waitress passed, giving them a suspicious look. Molly smiled sweetly until she moved away. Bill lowered his voice. “What about security?” “It’s a children’s party.” “Exactly. Children are small, fast, and sticky. Terrifying.” “They’ll be occupied.” “With what?” “Cake.” Bill considered this. “Cake is powerful.” “The plan is simple. We bring an empty laundry hamper marked ‘Donations for the Orphan Hedgehogs.’” “Are there orphan hedgehogs?” “There are now.” “Molly, that’s emotional manipulation.” “That’s marketing.” Bill rubbed his chin. “What if someone asks where the hedgehogs live?” “In the hedge.” “Too obvious.” “In a sanctuary.” “Where?” “Bill, no one interrogates hedgehog charity volunteers at a birthday party.” “You’ve clearly never met Denise.” “You don’t know Denise.” “I know her type. Clipboard. Sensible shoes. Eyes like a hawk that’s disappointed in you.” Molly closed the notebook. “Then we need confidence.” Bill sat back and folded his arms. “I’m not sure about this one.” Molly blinked. “You’re getting moral scruples now?” “I’m getting tactical scruples. Children scream. Parents chase. Balloons pop. Someone always has a whistle.” “You’re scared of a whistle.” “I respect the whistle.” Molly stared out the café window. Across the street, a little boy in a paper crown dropped his toy dinosaur. His father picked it up, dusted it off, and crowned the dinosaur too. The boy laughed like a bell. Bill followed her gaze. “That’s the sort of one who’ll have presents.” “Yes.” “Probably socks, though.” Molly folded the invitation smaller and smaller. “Maybe we don’t steal all the presents.” Bill leaned in. “Define ‘all.’” “Maybe none.” “None is a very small percentage of all.” “It’s just...” Molly tucked the invitation away. “It’s a children’s party.” “You said there was no kind version.” “There isn’t.” Bill exhaled, relieved and disappointed at once. 
“So what do we do with the pirate costumes?” Molly’s smile returned slowly. “We go anyway.” “To steal?” “To perform.” Bill froze. “Perform?” “You said your clown work is respected in three counties.” “Pirate clown work is a delicate fusion.” “We’ll make balloon swords, sing sea shanties, keep Professor Pickle distracted, and leave with cake.” Bill drummed his fingers. “Legally acquired cake?” “If offered.” “And if not?” “Bill.” “Right. Legally acquired cake.” Molly stood, tossing coins beside the untouched jam. “Come on. We need eye patches.” Bill followed, brightening. “And a parrot.” “No live animals.” “A rubber parrot.” “No trumpets.” “One tiny trumpet.” “No.” He opened the café door for her with a flourish. “Molly?” “Yes?” “If Denise asks about the orphan hedgehogs?” Molly grinned. “Tell her they got adopted.”
created_at: "2026-04-24T21:26:41.999Z" (4/24/2026, 9:26:41 PM)
model: "openai/gpt-5.5-20260423"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 5755
moderation_latency: (null)
generation_time: 33361
tokens_prompt: 77
tokens_completion: 1687
native_tokens_prompt: 66
native_tokens_completion: 1199
native_tokens_completion_images: 0
native_tokens_reasoning: 66
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.0363
router: (null)
provider_responses:
  0:
    endpoint_id: "58e5b336-423e-430b-a2ab-8bc353f0c51b"
    id: "resp_0f713c206c10c3ad0169ebe01217d4819580a6b4aa8d2cb5ad"
    is_byok: false
    latency: 308
    model_permaslug: "openai/gpt-5.5-20260423"
    provider_name: "OpenAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777066002-wQ6LShMwIXrqL51IovV0"
session_id: (null)
api_type: "completions"
id: "gen-1777066002-aiTWdsFiXSWBtNumOzqZ"
upstream_id: "resp_0f713c206c10c3ad0169ebe01217d4819580a6b4aa8d2cb5ad"
total_cost: 0.0363
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
response_cache_source_id: (null)
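For a rough sense of generation speed, the metadata can be turned into a tokens-per-second figure. This sketch assumes that generation_time is reported in milliseconds (33361 ms is close to the 33.5s duration shown in the run details) and that native_tokens_completion is the right token count to use. Neither assumption is stated on the page.

```python
# Values copied from the run metadata.
generation_time_ms = 33361        # assumed to be milliseconds
native_tokens_completion = 1199   # assumed to be the provider-side completion count

# Derived figure: completion tokens generated per second of generation time.
generation_time_s = generation_time_ms / 1000
tokens_per_second = native_tokens_completion / generation_time_s

print(f"{tokens_per_second:.1f} tokens/s")
```

With these inputs the result is roughly 36 completion tokens per second.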
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false; words: 764
71.60% | Dialogue to Total Word Ratio | Ratio: 72.58%, Deviation: 2.58% | neededClean: false; wordsTotal: 766; wordsDialogue: 556
35.8000% | Overall
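The ratio and deviation reported above can be reproduced from the word counts in the meta data. The sketch below assumes that the deviation is the absolute difference from the 70% target named in the scenario. It does not attempt to reproduce the 71.60% score or the 0.00% word-count score, because the page does not show how a deviation or a word count is mapped to a score.

```python
# Targets taken from the scenario ("Write 500 words with 70% dialogue").
TARGET_WORDS = 500
TARGET_DIALOGUE_PCT = 70.0

# Counts copied from the evaluator meta data.
words_counted = 764    # "words" from the word-count evaluator
words_total = 766      # "wordsTotal" from the dialogue-ratio evaluator
words_dialogue = 556   # "wordsDialogue" from the dialogue-ratio evaluator

# Dialogue share of the text, in percent.
ratio_pct = 100 * words_dialogue / words_total

# Assumption: deviation is the absolute gap to the 70% target.
deviation_pct = abs(ratio_pct - TARGET_DIALOGUE_PCT)

# How far the response ran past the requested length.
overshoot_words = words_counted - TARGET_WORDS

print(f"Ratio: {ratio_pct:.2f}%, Deviation: {deviation_pct:.2f}%")
print(f"Words over target: {overshoot_words}")
```

The two evaluators report slightly different totals (764 and 766 words), which suggests they count words independently. Either way, the response is more than 260 words over the 500-word request.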