NC Bench
Run: 2026-04-24-01KQ0N4TTG27F5DB4BXK88YEJA
Run Details
Model: GPT-5.5 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 36.2s
Cost: $0.041550
Provider: OpenAI
Evaluation summary
Matches word count: 0.00%
Dialogue to Total Word Ratio: 3.05%
Overall: 1.5236%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the crayon-drawn map across the ironing board, smoothing it with the seriousness of a general and the sticky palms of someone who had recently eaten three jelly babies. “Behold,” she said, tapping a purple square. “The Grand Ballroom of Sugarplum Manor.” Bill squinted. “That’s a bouncy castle.” “It’s a tactical inflatable.” “It has a smiling giraffe on it.” “All fortresses have morale issues.” Outside, rain ticked against the window of their tiny flat. Inside, their entire criminal arsenal consisted of two party hats, a borrowed clipboard, and a cake box containing no cake because Bill had become nervous and eaten it. Molly pointed to a yellow scribble. “Here will be the present mountain. Dolls, trains, robots, books with flaps. A glittering heap of childhood wealth.” Bill looked uneasy. “It’s for a six-year-old.” “Exactly. Six-year-olds are careless investors.” “She might cry.” “Then we shall leave a note.” “What kind of note?” “A professional one. ‘Dear Madam, your presents have been requisitioned by experts.’” Bill scratched his chin. “Do experts wear party hats?” “They do when blending in.” He picked up the blue cone hat and put it on. The elastic snapped under his chin, making him yelp. Molly ignored him and adjusted her black gloves, which were mittens with pandas on them. “We arrive at precisely pudding time, when everyone is distracted by custard.” “Children aren’t distracted by custard. Children are powered by custard.” “Fine. Cake time.” “You ate the cake.” “Their cake, Bill. Focus.” Bill leaned over the map. “What about parents?” Molly drew a red circle around a lumpy shape. “They gather here, near the teapot, discussing schools and pretending not to want sausage rolls.” “That’s accurate.” “We mingle. We smile. We say things like, ‘My, haven’t they grown,’ because no one can argue with that.” Bill nodded slowly. “Then what?” “Then we locate the presents.” “They’ll be wrapped.” “Bill, we are thieves, not librarians. 
We don’t need to read them.” “But what if we steal socks?” Molly paused. The possibility hung in the room like a bad smell. “We take all the presents,” she said. “All of them?” “All of them.” “Even the suspiciously soft ones?” “Especially those. They create volume.” Bill lowered his voice. “Molly, suppose there’s a pony.” “In a box?” “It’s a rich child’s party.” “A boxed pony would be a logistical triumph.” He looked at the empty cake box. “We don’t have room for a pony.” “We’ll cross that stable when we come to it.” A gust of wind rattled the window. Somewhere downstairs, a dog barked as if objecting to the entire scheme. Bill removed the party hat. “I’m not sure about this one.” Molly stared at him. “Bill, last week you tried to rob a vending machine by asking it politely.” “And it gave me crisps.” “You paid for those crisps.” “Still worked.” She folded her arms. “Are you going soft?” “No,” he said. “I just remember my sixth birthday. Uncle Dennis gave me a torch shaped like a dinosaur. Best thing I owned.” Molly’s expression wobbled, then stiffened. “Sentiment is how they catch you.” “Who?” “Society.” Bill picked up the clipboard. On it, Molly had written: Presents: all. Escape: splendid. “What if,” he said carefully, “we break in, but instead of stealing presents, we steal attention?” Molly narrowed her eyes. “Explain before I throw a jelly baby at you.” “We become entertainers. You do your card trick.” “I only know one card trick.” “The one where you drop the cards and accuse gravity of cheating?” “It gets applause.” “And I can make balloon animals.” “You made a balloon worm.” “Children enjoy accuracy.” Molly looked at the map, then at the panda mittens, then at Bill’s hopeful face. At last she sighed the sigh of a criminal mastermind surrounded by amateurs and basic decency. “We would still be infiltrating,” she said. 
“Absolutely.” “And there would be cake.” “Almost certainly.” “And if a present happens to be unattended?” “Molly.” “I’m saying hypothetically.” “No.” She snatched the party hat and jammed it onto her head. “Very well. Operation Present Mountain is postponed.” Bill smiled. “Operation Birthday Worm?” Molly picked up a green balloon. “Bill, if we’re going straight, we are not doing it under that name.” The balloon squeaked in her hands, long, limp, and ridiculous. Molly twisted it once. It popped like a tiny cannon. They stared at the scraps. Bill said, “Dragon?” Molly nodded solemnly. “Rare one.”
created_at: "2026-04-24T21:08:17.367Z" (4/24/2026, 9:08:17 PM)
model: "openai/gpt-5.5-20260423"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 11486
moderation_latency: (null)
generation_time: 36073
tokens_prompt: 77
tokens_completion: 1505
native_tokens_prompt: 66
native_tokens_completion: 1374
native_tokens_completion_images: 0
native_tokens_reasoning: 342
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.04155
router: (null)
provider_responses:
  0:
    endpoint_id: "58e5b336-423e-430b-a2ab-8bc353f0c51b"
    id: "resp_0c5a09a5f1e6d6800169ebdbc171288195b7547988cd4dc14e"
    is_byok: false
    latency: 599
    model_permaslug: "openai/gpt-5.5-20260423"
    provider_name: "OpenAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777064897-cYhYAe9pFNS8ILWuGV1G"
session_id: (null)
api_type: "completions"
id: "gen-1777064897-SboQL6SsPqckn4Bi77UG"
upstream_id: "resp_0c5a09a5f1e6d6800169ebdbc171288195b7547988cd4dc14e"
total_cost: 0.04155
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
response_cache_source_id: (null)
Evaluation details
Matches word count
Result: 0.00%
Details: n/a
Meta Data: neededClean: false, words: 737

Dialogue to Total Word Ratio
Result: 3.05%
Details: Ratio: 58.36%, Deviation: 8.36%
Meta Data: neededClean: false, wordsTotal: 742, wordsDialogue: 433

Overall: 1.5236%
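The ratio and deviation reported for the dialogue evaluator can be reproduced from the two word counts in its metadata. The sketch below is illustrative only and is not the evaluator's actual code: the function names, the whitespace-based word split, and the rule that dialogue is whatever sits inside curly double quotes are all assumptions. The page does not say how the 8.36% deviation is turned into the 3.05% score, so that step is left out.

```python
import re


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (assumed word definition)."""
    return len(text.split())


def dialogue_counts(text: str) -> tuple[int, int]:
    """Return (dialogue words, total words).

    Assumption: dialogue is any span enclosed in curly double quotes.
    The real evaluator may use a different rule.
    """
    quoted_spans = re.findall(r"“([^”]*)”", text)
    words_dialogue = sum(count_words(span) for span in quoted_spans)
    return words_dialogue, count_words(text)


def ratio_and_deviation(words_dialogue: int, words_total: int,
                        target: float = 0.5) -> tuple[float, float]:
    """Dialogue share of all words and its absolute distance from the target."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


# Counts taken from the evaluation metadata of this run.
ratio, deviation = ratio_and_deviation(words_dialogue=433, words_total=742)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
```

With the reported counts, 433 / 742 gives the 58.36% ratio and an 8.36 point deviation from the 50% target, matching the Details line. The Overall figure of 1.5236% is consistent with a plain average of the two evaluator scores if the displayed 3.05% is a rounded value, though the page does not state how Overall is computed.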