NC Bench
Run: 2026-04-24-01KQ0P50DRDACS5QZXZ1NGTWGV
Run Details
Model: GPT-5.5 (Reasoning, Low)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 29.3s
Cost: $0.034260
Provider: OpenAI
Evaluation summary
Score      Evaluator
0.00%      Matches word count
4.14%      Dialogue to Total Word Ratio
2.0717%    Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the crayon-drawn floor plan across the café table and pinned down one corner with a sugar bowl. The paper smelled faintly of grape jelly, which she considered a good omen. Bill leaned over it with the solemn focus of a surgeon, except the only instrument in his hand was a tiny spoon he’d been using to dig marshmallows out of his hot chocolate.
“The target,” Molly said, tapping a lopsided square, “is the Henderson twins’ seventh birthday party. Saturday. Three o’clock. Theme: pirates and ponies.”
Bill frowned. “That’s too many themes.”
“That’s what makes it vulnerable.”
He nodded, impressed despite himself. “Classic overextension.”
Around them, the café hummed with respectable citizens doing respectable things, none of them suspecting that the two people in the corner booth were plotting the least dignified heist in municipal history. Molly wore oversized sunglasses and a trench coat, which would have been subtle if it hadn’t been seventy-eight degrees and if the coat didn’t still have a dry-cleaning tag flapping from the sleeve. Bill had chosen a false mustache. Unfortunately, it kept migrating toward his left ear whenever he smiled.
Molly drew a circle around the backyard. “Presents will be stacked here, on the picnic table, guarded by one grandmother, two exhausted parents, and a magician named The Astonishing Kevin.”
Bill shuddered. “Magicians know misdirection. We must respect Kevin.”
“We will neutralize him with heckling.”
“Ethical heckling or cruel heckling?”
“Professional.”
Bill scribbled “HECKLE KEVIN” into a notebook already full of alarming phrases like balloon contingency and cupcake smoke screen. He underlined it twice.
“The entry point,” Molly continued, “is through the side gate. We arrive disguised as entertainers.”
Bill brightened. “I can juggle.”
“You can drop three oranges in a row.”
“That’s part of the act. Children love danger.”
Molly ignored this. “I’ll be a face painter. You’ll be Captain Prancybeard, the pony pirate.”
Bill’s expression collapsed. “Why am I always the one with hooves?”
“Because you own the costume.”
“I own it because you mailed it to me anonymously.”
“And yet here we are.”
Outside the café window, a little girl skipped past holding a balloon shaped like a dachshund. Bill watched it with the wary suspicion of a man who had once been defeated by latex.
Molly slid a second sheet from her bag. This one was typed, laminated, and titled Operation Gift Grab in festive bubble letters. “Timeline. Three-oh-five: arrival. Three-ten: blend in. Three-twenty: initiate distraction.”
Bill read ahead. “Three-twenty-two: remove presents. Three-twenty-three: escape through hedge. Three-twenty-four: snacks?”
“Morale matters.”
“What kind of snacks?”
“Whatever we can grab from the buffet.”
Bill looked horrified. “Molly, we are thieves, not monsters. We do not steal from a children’s buffet.”
“You stole communion wafers once.”
“I thought they were complimentary crackers.”
Molly sat back, lowering her sunglasses just enough to reveal the gleam in her eye. It was the gleam that had preceded the Great Petting Zoo Coin Jar Incident, the Museum Gift Shop Keychain Caper, and one unforgettable afternoon involving a mayoral ribbon-cutting and a suspiciously heavy pair of ceremonial scissors.
“This is our masterpiece,” she said.
Bill swallowed. “Stealing presents from seven-year-olds?”
“Temporarily relocating presents,” Molly corrected. “We ransom them back for one thing.”
His mustache twitched. “Money?”
“No. The silver scooter.”
Bill gasped so loudly that the barista dropped a muffin. The silver scooter had haunted them for months, ever since a child on Willow Street had zipped past during a getaway, looked Bill dead in the eye, and shouted, “Nice running, slowpoke.”
“This is about vengeance,” Bill whispered.
“This is about transportation.”
“It’s about vengeance with wheels.”
Molly folded the plans with military precision. “We get in, we get the presents, we get the scooter, and no one gets hurt.”
“What if the children cry?”
She hesitated. For one brief second, criminal ambition wrestled with basic humanity. Then she produced a stack of brightly colored cards. “We leave these.”
Bill took one. It read: Your presents are safe. Please enjoy this complimentary treasure hunt, courtesy of Captain Prancybeard.
He stared at her. “You made activities?”
“Seven clues, rhyming couplets, peanut-free stickers at the end.”
Bill dabbed his eyes with a napkin. “That’s beautiful.”
“It’s operationally sound.”
“It’s tender.”
“It’s crime.”
He tucked the card into his notebook. “Molly, I’m in.”
She extended her hand across the table. Bill shook it solemnly, and his mustache finally gave up, fluttering down into the whipped cream of his hot chocolate like a tiny defeated squirrel.
Neither of them noticed The Astonishing Kevin at the next table, listening behind a newspaper, smiling as he palmed their car keys.
created_at: "2026-04-24T21:25:51.678Z" (4/24/2026, 9:25:51 PM)
model: "openai/gpt-5.5-20260423"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 4281
moderation_latency: (null)
generation_time: 29091
tokens_prompt: 77
tokens_completion: 1521
native_tokens_prompt: 66
native_tokens_completion: 1131
native_tokens_completion_images: 0
native_tokens_reasoning: 51
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.03426
router: (null)
provider_responses:
  0:
    endpoint_id: "58e5b336-423e-430b-a2ab-8bc353f0c51b"
    id: "resp_02956c2b3541119d0169ebdfdfc79c8194a2f35b6a08f7375a"
    is_byok: false
    latency: 423
    model_permaslug: "openai/gpt-5.5-20260423"
    provider_name: "OpenAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777065951-WwETrZ3rUzG13F3UW4or"
session_id: (null)
api_type: "completions"
id: "gen-1777065951-wqXfIIpMHydW4FNm2ORg"
upstream_id: "resp_02956c2b3541119d0169ebdfdfc79c8194a2f35b6a08f7375a"
total_cost: 0.03426
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
response_cache_source_id: (null)
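The timing and token fields in the metadata can be combined into a rough throughput figure. The following is a minimal sketch, not part of the benchmark itself: it assumes that generation_time is reported in milliseconds (which is consistent with the 29.3s run duration) and that native_tokens_reasoning is counted inside native_tokens_completion. Neither assumption is confirmed by the page.

```python
# Values copied from the run metadata above.
generation_time_ms = 29091        # ASSUMPTION: unit is milliseconds
native_tokens_completion = 1131
native_tokens_reasoning = 51      # ASSUMPTION: included in the completion count

# Convert to seconds and derive completion tokens generated per second.
generation_seconds = generation_time_ms / 1000
tokens_per_second = native_tokens_completion / generation_seconds

# Share of completion tokens spent on reasoning, under the assumption above.
reasoning_share = native_tokens_reasoning / native_tokens_completion

print(f"Throughput: {tokens_per_second:.1f} tokens/s")
print(f"Reasoning share: {reasoning_share:.1%}")
```

Under these assumptions the run produced a little under 39 completion tokens per second, with only a small fraction of them spent on reasoning.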
Evaluation details
Evaluator: Matches word count
Result: 0.00%
Details: n/a
Meta Data: neededClean: false, words: 767
Evaluator: Dialogue to Total Word Ratio
Result: 4.14%
Details: Ratio: 37.98%, Deviation: 7.98%
Meta Data: neededClean: false, wordsTotal: 782, wordsDialogue: 297
Overall: 2.0717%
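The reported ratio and deviation follow directly from the word counts in the evaluation metadata. The sketch below reproduces that arithmetic and includes a hypothetical `dialogue_ratio` helper that counts words inside double quotation marks. The helper is an illustration only: the evaluator's actual tokenization and quote handling are not shown on this page, and a quoted non-dialogue phrase such as “HECKLE KEVIN” would be counted as dialogue by this simple approach. Treating the overall score as the mean of the two evaluator scores is likewise an assumption.

```python
import re

def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Hypothetical helper: return (dialogue_words, total_words, ratio).

    Anything between a pair of curly or straight double quotes is
    treated as dialogue; words are whitespace-separated tokens.
    """
    total = len(text.split())
    spans = re.findall(r'[“"]([^”"]*)[”"]', text)
    dialogue = sum(len(span.split()) for span in spans)
    ratio = dialogue / total if total else 0.0
    return dialogue, total, ratio

# Figures reported in the evaluation metadata.
words_dialogue = 297
words_total = 782
target_ratio = 0.30  # the prompt asks for 30% dialogue

ratio = words_dialogue / words_total
deviation = abs(ratio - target_ratio)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# ASSUMPTION: the overall score is the mean of the two evaluator scores.
overall = (0.00 + 4.14) / 2
print(f"Overall (from rounded scores): {overall:.2f}%")
```

With the rounded scores the mean comes out at 2.07%, which is consistent with the reported 2.0717% if the unrounded ratio score is slightly above 4.14%.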