NC Bench
Run: 2026-04-24-01KQ0Q2N03K11EZ29RJ58ZNNE3
Run Details
Model: GPT-5.5 (Reasoning, Low)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 26.7s
Cost: $0.031890
Provider: OpenAI
Evaluation summary
Score | Evaluator
0.00% | Matches word count
51.72% | Dialogue to Total Word Ratio
25.8604% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly crouched behind the privet hedge with a pair of opera glasses, watching the bouncy castle rise and fall in the garden like a cheerful lung. Beyond it, a table sagged beneath parcels wrapped in dinosaurs, unicorns, and one suspiciously lumpy package shaped like a trumpet. Bill lay beside her in the grass, wearing a black turtleneck despite the warm afternoon. A paper party hat sat on his head at a doomed angle. “You look ridiculous,” Molly said. “I look festive,” Bill replied. “You look like a burglar who fell into a cracker.” “That’s the disguise. No one suspects a man in a party hat.” “Everyone suspects a man in a black turtleneck at a six-year-old’s birthday.” Bill adjusted the hat. “It’s navy.” “It’s criminal.” They watched as a troop of children shrieked past, pursued by a man in a tiger costume carrying a tray of juice boxes. The tiger tripped over a scooter, spun magnificently, and saved every drink except his dignity. Molly made a note in her little book. “Entertainment is unstable. Good.” Bill squinted. “How many presents?” “Thirty-two visible. Possibly more behind the paddling pool.” “Any guards?” “Three mums, one grandad, a Labrador, and a boy with a recorder.” Bill shuddered. “The recorder is trouble.” “Agreed. We avoid him.” A balloon drifted over the hedge and bobbed between them. It had a smiling dinosaur printed on it. Bill poked it. “Do we really need all the presents?” “That was the client’s request.” “Our client is you.” “I am a demanding client.” “You also paid us with a packet of crisps.” “Luxury crisps.” Bill sighed and rolled onto his back. “Molly, remind me why we’re robbing a children’s party?” She lowered the opera glasses. “Because the invitation said ‘no presents necessary,’ and yet everyone brought presents. The hypocrisy must be punished.” “That sounds less like theft and more like a personal grudge against wrapping paper.” “It’s about principle.” “It’s about you not being invited.” Molly’s ears reddened. 
“I don’t care about that.” “You bought a gift.” “It was tactical.” “It was a stuffed badger wearing a waistcoat.” “It had pockets.” From the garden came a chorus of “Happy Birthday,” sung with the tonal accuracy of falling saucepans. The birthday child, a small girl in a silver crown, stood before a cake shaped like a castle. The candles flickered bravely in the breeze. Bill softened. “She looks happy.” “That’s how they get you,” Molly said, but her voice had lost its edge. The girl leaned forward, puffed out her cheeks, and failed to blow out a single candle. Everyone cheered anyway. The tiger costume clapped. The Labrador barked. The recorder boy attempted a fanfare and produced the sound of a kettle giving up. Bill covered his ears. “We strike after cake?” Molly looked at the present table. Then at the girl, who was now being handed a knife far too large for cake management and looked thrilled by responsibility. “No,” Molly said. Bill blinked. “No?” “No heist.” “But we have a plan.” “We have a hedge and indigestion.” “I practiced sneaking.” “You fell into a birdbath.” “That was reconnaissance.” Molly closed her notebook with a snap. “New objective. We infiltrate, deliver the badger, eat cake, and leave without committing a felony.” Bill stared at her. “That’s not a heist.” “It’s an emotional heist. We steal their suspicion.” “And perhaps a sausage roll?” “One sausage roll each. Two if the tiger remains compromised.” Bill sat up, brushing leaves from his sleeves. “What about the presents?” “They stay.” “All of them?” “All of them.” He considered this, then nodded solemnly. “Children are poor targets anyway. Sticky hands. Unpredictable morals.” Molly stood and straightened her jacket. From her bag she withdrew the stuffed badger, its waistcoat neatly buttoned, its tiny pockets empty and hopeful. Bill eyed it. 
“You really did want to go.” “I wanted to inspect the security weaknesses of modern birthday culture.” “Of course.” “And maybe use the bouncy castle.” “That is also reconnaissance.” They emerged from behind the hedge just as the tiger limped toward the gate. Molly smiled brightly. “Hello. We’re with the badger.” The tiger looked at them, at Bill’s party hat, and at the waistcoated animal in Molly’s hands. “Grand,” said the tiger. “Cake’s by the shed. Mind the recorder.”
created_at: "2026-04-24T21:42:03.019Z" (4/24/2026, 9:42:03 PM)
model: "openai/gpt-5.5-20260423"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 3253
moderation_latency: (null)
generation_time: 26502
tokens_prompt: 77
tokens_completion: 1295
native_tokens_prompt: 66
native_tokens_completion: 1052
native_tokens_completion_images: 0
native_tokens_reasoning: 70
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.03189
router: (null)
provider_responses:
  0:
    endpoint_id: "58e5b336-423e-430b-a2ab-8bc353f0c51b"
    id: "resp_0cd32661c6b784b70169ebe3ab1dd08195a6f193e7925ed6fa"
    is_byok: false
    latency: 242
    model_permaslug: "openai/gpt-5.5-20260423"
    provider_name: "OpenAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777066923-QpeqRTiOGWCua6wVGl8T"
session_id: (null)
api_type: "completions"
id: "gen-1777066923-McMeYLT0OeBv4VPJviiB"
upstream_id: "resp_0cd32661c6b784b70169ebe3ab1dd08195a6f193e7925ed6fa"
total_cost: 0.03189
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
response_cache_source_id: (null)
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 714
51.72% | Dialogue to Total Word Ratio | Ratio: 46.37%, Deviation: 3.63% | neededClean: false, wordsTotal: 716, wordsDialogue: 332
25.8604% | Overall | |
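
The reported ratio and deviation can be checked from the word counts in the evaluation details. The sketch below is a minimal illustration, not the benchmark's own evaluator. The function names are hypothetical, the 50% target is taken from the scenario text, and treating the overall figure as the plain mean of the two evaluator scores is an assumption that happens to agree with the numbers shown. The page does not state how a 3.63% deviation maps to a 51.72% score, so that step is not reproduced here.

```python
def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Return dialogue words as a percentage of all words."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return 100.0 * words_dialogue / words_total


def deviation_from_target(ratio_pct: float, target_pct: float = 50.0) -> float:
    """Return the absolute gap, in percentage points, from the target ratio."""
    return abs(target_pct - ratio_pct)


def mean_score(scores: list[float]) -> float:
    """Assumed aggregation: unweighted mean of the evaluator scores."""
    return sum(scores) / len(scores)


# Counts and scores copied from the evaluation details of this run.
ratio = dialogue_ratio(words_dialogue=332, words_total=716)
deviation = deviation_from_target(ratio)
overall = mean_score([0.00, 51.72])

print(f"ratio={ratio:.2f}% deviation={deviation:.2f}% overall={overall:.2f}%")
```

With the counts from this run, the ratio and deviation round to the 46.37% and 3.63% shown in the details column, and the assumed mean lands at about 25.86%, in line with the reported overall of 25.8604%.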