NC Bench
Run: 2026-04-24-01KQ0N4QQN6PZYHXM868ZJ4333
Run Details
Model: GPT-5.5 (Reasoning, Low)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 31.6s
Cost: $0.036750
Provider: OpenAI
Evaluation summary
Score    Evaluator
0.00%    Matches word count
0.03%    Dialogue to Total Word Ratio
0.0167%  Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the crayon map across the bonnet of the van and pinned one corner with a sausage roll. The map showed a church hall, a bouncy castle, and a large X labelled PRESENT MOUNTAIN in purple glitter pen. Bill squinted at it through his sunglasses, though it was raining and nearly midnight. “That is not a map,” he said. “That is a cry for help from a stationery cupboard.” “It’s reconnaissance,” Molly said. “Drawn from memory.” “You went to the party?” “I delivered the magician.” “You kidnapped a magician?” “Borrowed. He was late for another job anyway.” Bill tapped the glittery X. “And these are the presents?” “All forty-three of them. Possibly forty-four if Auntie Linda brings the enormous penguin bag I saw in her boot.” “We’re stealing from children, Molly.” “We are stealing from parents who over-purchase. The children will receive a valuable lesson about disappointment.” “That’s what my father called socks.” Molly rolled the map tighter, crushing the sausage roll. “Focus. Tomorrow, two o’clock, the party begins. Two-fifteen, sugar frenzy. Two-thirty, musical statues. That’s our window.” “Our window is musical statues?” “Yes. Everyone freezes.” Bill stared at her. “Molly, burglars generally prefer when witnesses are not staring directly forward.” “They’ll be staring at the DJ. He has a smoke machine and the confidence of a man who owns three waistcoats.” Bill opened the van’s side door. Inside were two costumes: a giraffe with collapsible neck rods and a fairy princess dress with tactical pockets. “No,” he said. “You haven’t even heard the plan.” “I can smell the plan. It smells like nylon and prison.” “You’re the giraffe.” “I knew it.” “You mingle near the buffet. I enter as Princess Sparkletoes, Official Present Inspector.” “Official Present Inspector?” “Children accept bureaucracy if it comes with stickers.” Bill lifted the giraffe head. Its eyelashes were longer than his fingers. 
“Why am I the giraffe?” “Because you have height.” “I have anxiety.” “That helps the neck wobble.” Rain ticked on the van roof. From somewhere down the street came the lonely honk of a goose, or possibly a small trumpet being harmed. Bill sighed. “What about security?” “One grandmother. Name: Beryl. Weakness: chair aerobics. Strength: suspicion.” “Beryl will destroy us.” “Not if you distract her with the conga.” “I don’t conga under pressure.” “You conga beautifully under pressure. Remember the bank?” “That was not a conga. That was a panic response.” Molly smiled and produced a clipboard from inside her coat. It was covered in stickers shaped like stars, dinosaurs, and one alarming clown. “Here are the labels,” she said. “We replace the real gift tags with these. Then we move the presents to the van marked as donations for the Children’s Joy Redistribution Initiative.” Bill read a sticker. “To: Storage. Love: Society.” “Very official.” “Society doesn’t sign with a heart.” “It does when I’m secretary.” Bill looked toward the church hall, dark and innocent beneath strings of paper bunting still fluttering from some previous event. A banner in the window read HAPPY BIRTHDAY OLIVER in crooked foam letters. “How old is Oliver?” he asked. “Six.” “Six is a tender age.” “He won’t notice. He’ll be inside the bouncy castle, attempting flight.” Bill rubbed his face. “Molly, perhaps we could steal from somewhere less heartbreaking. Like a yacht club. Or a tax office.” “The tax office has cameras.” “So do parents.” “Parents film vertically. Useless evidence.” Bill chuckled despite himself, then immediately tried to look moral. “What are we even expecting to get? Toy cars? Plastic dinosaurs? A drum kit?” Molly’s eyes shone. “Exactly. The drum kit alone has resale value if we include earplugs.” “And if we get caught?” “We improvise.” “Your improvising once got us engaged to a butcher.” “For six minutes, we had a discount on bacon.” Bill set the giraffe head on the bonnet. 
It stared at him with plush accusation. “I’m not saying yes,” he said. “You put on the hooves.” “I was checking the fit.” “You asked whether the tail was detachable.” “For safety.” Molly folded the map and tucked it inside the princess dress pocket. “Fine. We can call it off. Let Oliver keep his mountain of gifts. Let capitalism win. Let Beryl sit there, undefeated, guarding wrapping paper like a dragon with peppermint creams.” Bill was quiet. Molly waited. “Does the giraffe get a name?” he asked. “Gerald.” “Gerald wouldn’t conga.” “Gerald absolutely would.” Bill picked up the head again. “Gerald wants a better cut.” “Gerald gets twenty percent.” “Thirty, and Gerald keeps any dinosaur stickers.” “Done.” They shook hands solemnly beneath the dripping streetlamp. The sausage roll slid off the bonnet and landed in a puddle with a tragic little plop. Bill glanced at it. “Bad omen.” Molly opened the van door. “Nonsense. Tomorrow, we make history.” “Tomorrow,” Bill said, climbing in with the giraffe, “we are beaten to death by a grandmother named Beryl.” Molly started the engine and grinned. “Only if she catches us dancing.”
created_at: "2026-04-24T21:08:14.207Z" (4/24/2026, 9:08:14 PM)
model: "openai/gpt-5.5-20260423"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 4534
moderation_latency: (null)
generation_time: 31512
tokens_prompt: 77
tokens_completion: 1648
native_tokens_prompt: 66
native_tokens_completion: 1214
native_tokens_completion_images: 0
native_tokens_reasoning: 65
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.03675
router: (null)
provider_responses:
  0:
    endpoint_id: "58e5b336-423e-430b-a2ab-8bc353f0c51b"
    id: "resp_08f4bbcf6b8978140169ebdbbe4bd08195a6850e60b0315624"
    is_byok: false
    latency: 245
    model_permaslug: "openai/gpt-5.5-20260423"
    provider_name: "OpenAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777064894-y1fOBvT5xLzWxvKJX5f5"
session_id: (null)
api_type: "completions"
id: "gen-1777064894-triRNDmJ9mcZmoaEEAws"
upstream_id: "resp_08f4bbcf6b8978140169ebdbbe4bd08195a6850e60b0315624"
total_cost: 0.03675
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
response_cache_source_id: (null)
Evaluation details
Matches word count
  Result: 0.00%
  Details: n/a
  Meta Data: neededClean: false, words: 825

Dialogue to Total Word Ratio
  Result: 0.03%
  Details: Ratio: 62.65%, Deviation: 12.65%
  Meta Data: neededClean: false, wordsTotal: 830, wordsDialogue: 520

Overall: 0.0167%
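
The ratio and deviation reported above can be reproduced from the listed word counts. The following is a minimal sketch of that arithmetic only. The function names, and the assumption that deviation means the absolute distance from the 50% target in the scenario, are illustrative and not taken from NC Bench's code. How the evaluators turn these figures into the 0.00% and 0.03% scores is not shown on this page, so it is not modeled here.

```python
def dialogue_ratio_report(words_dialogue: int, words_total: int, target: float = 0.5) -> dict:
    """Return dialogue ratio and its distance from the target, as percentages.

    Assumption: deviation is |ratio - target|; the page does not define it.
    """
    ratio = words_dialogue / words_total
    deviation = abs(ratio - target)
    return {
        "ratio_pct": round(ratio * 100, 2),
        "deviation_pct": round(deviation * 100, 2),
    }


def word_count_overshoot_pct(words: int, target_words: int) -> float:
    """Return how far the word count exceeds the target, as a percentage of the target."""
    return round((words - target_words) / target_words * 100, 2)


# Counts reported in the evaluation details for this run.
report = dialogue_ratio_report(words_dialogue=520, words_total=830)
overshoot = word_count_overshoot_pct(words=825, target_words=500)

print(report)     # {'ratio_pct': 62.65, 'deviation_pct': 12.65}
print(overshoot)  # 65.0
```

The two evaluators report slightly different totals (825 and 830 words), which suggests they count words differently, so the sketch treats each count as given rather than recomputing it from the response text.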