NC Bench
Run: 2026-04-24-01KQ0NXB5WBEE5AGCYB1NHTREX
Run Details
Model:
GPT-5.5
Test:
Dialogue tags
Scenario:
Write 500 words with 30% dialogue
Duration:
29.4s
Cost:
$0.034530
Provider:
OpenAI
Evaluation summary
Score | Evaluator
0.00% | Matches word count
3.23% | Dialogue to Total Word Ratio
1.6160% | Overall
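The page does not state how the overall score is aggregated. One plausible reading, offered here only as an assumption, is an unweighted mean of the two evaluator scores. The sketch below checks that reading against the displayed figures; the dictionary name `scores_pct` is illustrative and not part of NC Bench.

```python
from statistics import mean

# Evaluator scores as displayed in the summary (already rounded to 2 decimals).
scores_pct = {
    "Matches word count": 0.00,
    "Dialogue to Total Word Ratio": 3.23,
}

# ASSUMPTION: the overall score is the plain average of the evaluator scores.
overall = mean(scores_pct.values())

print(f"Assumed overall: {overall:.4f}% (page reports 1.6160%)")
```

Averaging the rounded values gives roughly 1.615%, which is within rounding distance of the reported 1.6160%. That would be consistent with the page averaging unrounded scores, but the source does not confirm it.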
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the crinkled blueprint across the bonnet of Bill’s van, weighting the corners with a half-eaten sausage roll, a torch, and two suspiciously sticky party hats. Across the street, the community hall blazed with balloons, bunting, and the sort of shrieking laughter that could peel wallpaper. “Remember,” Molly said, tapping the drawing with a gloved finger, “we are not here to scare anyone. We are here to liberate unattended parcels from an overabundance of wrapping paper.” Bill squinted at the hall. “It’s a seven-year-old’s birthday, Molly.” “Exactly. Security will be emotional, not professional.” A small boy in a pirate costume ran past the hall window, chased by three fairies and a dinosaur. Bill shuddered. “Looks pretty professional to me.” Molly adjusted the enormous pink bow she had tied under her chin. Their disguises were meant to be harmless: she was dressed as a volunteer magician, and Bill was wearing a foam sandwich board that read MR. WOBBLES’ BALLOON EMPORIUM. Unfortunately, the board was upside down. “Your sign,” Molly said. Bill looked down. “It’s modern.” “It says ‘MUIROPME NOOLLAllAB S’ELBBOW .RM.’” “Foreign children’s entertainment is very big now.” Molly pinched the bridge of her nose. On the blueprint, she had marked the present table with a red X, the kitchen with a blue circle, and the emergency exits with tiny skulls, because she believed in morale. “We enter through the side door at precisely four fifteen, during musical chairs. Maximum chaos. Minimum adults paying attention.” Bill raised a hand. “This is not school,” Molly said. “What if there aren’t enough chairs?” “That is the point of musical chairs.” “No, I mean for us. Standing about looks suspicious.” Molly considered this. “Fine. We carry balloons. No one questions a person carrying balloons.” “People question me constantly.” “That’s because you answer.” Across the street, a woman emerged carrying a cake shaped like a fire engine. Everyone inside screamed with joy. 
Molly’s eyes narrowed. “Cake deployment early. That changes the schedule.” Bill brightened. “We could steal the cake instead.” “We are not cake thieves.” “We could become cake thieves. Growth mindset.” “The presents, Bill.” He sighed and produced a notebook labeled Crimes, Probably. “Right. Step one: enter party. Step two: locate gifts. Step three: place gifts in sack. Step four: leave before anyone says thank you.” “Why would they thank us?” “For taking away the clutter.” Molly stared at him. “Parents hate clutter,” Bill said defensively. “I read a pamphlet.” A burst of music thumped from the hall. The children began chanting something rhythmic and alarming. Molly checked her watch. “It’s time. Side door. Calm faces. If challenged, we say we’ve come to do balloon animals.” Bill patted his pockets. “I can do a worm.” “A worm is not an animal impressive enough to justify your outfit.” “It has commitment.” They crossed the street with exaggerated casualness, which made them look like two flamingos trying to act natural. At the side door, Molly reached for the handle. It opened before she touched it. A grandmother with steel-gray curls and the eyes of a retired prison warden stood there holding a clipboard. “You must be the entertainers,” she said. Molly smiled. “Indeed. Mysteries, marvels, and inflatable wildlife.” The grandmother looked Bill up and down. “You’re late.” Bill whispered, “Our alibi has become employment.” “Wonderful,” Molly hissed back through her smile. They were swept into the hall before either could object. Twenty children turned toward them with the collective hunger of tiny judges. The present table stood at the far wall, stacked high with glittering boxes, bags, ribbons, and one teddy bear wearing sunglasses. Molly’s fingers twitched. The grandmother thrust a bag of balloons into Bill’s hands. “Ten minutes of animals, then the magic show, then you help hand out party bags.” Bill leaned toward Molly. 
“This is worse than prison.” Molly whispered, “Focus. We adapt.” A child in a dragon costume pointed at Bill. “Make me a giraffe!” Bill twisted a balloon. It squeaked in protest and became, undeniably, a worm. “That’s a snake,” Bill announced. “It’s short,” said the dragon. “It’s sitting down.” Molly stepped onto the little stage, forcing a grin so wide it felt medically unwise. “Ladies and gentlemen, prepare to be amazed!” She flourished a handkerchief, knocked over a stack of paper cups, and accidentally produced a biscuit from her sleeve. The children cheered. Molly blinked. Bill made another worm. The children cheered louder. By the time the final song began, the thieves had sticky hands, three new fans, and no presents. The grandmother pressed two party bags into their arms. “Excellent work,” she said. “Same time next year?” Outside, Bill opened his bag. “I got crayons.” Molly looked at hers. “I got raisins.” They stood in silence beside the van. Bill said, “So. Successful heist?” Molly glanced back at the glowing hall and the untouched mountain of presents. “We stole their expectations.” “And two party bags.” “And two party bags,” Molly agreed. “Drive.”
created_at
"2026-04-24T21:21:40.546Z"
4/24/2026, 9:21:40 PM
model
"openai/gpt-5.5-20260423"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
827
moderation_latency
(null)
generation_time
29273
tokens_prompt
77
tokens_completion
1487
native_tokens_prompt
66
native_tokens_completion
1140
native_tokens_completion_images
0
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.03453
router
(null)
provider_responses
0
endpoint_id
"58e5b336-423e-430b-a2ab-8bc353f0c51b"
id
"resp_04569e495d19f0570169ebdee4a07881968f4f1e57fd8bb8e7"
is_byok
false
latency
466
model_permaslug
"openai/gpt-5.5-20260423"
provider_name
"OpenAI"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1777065700-wsbNlRq1VvjVIXiGmJG7"
session_id
(null)
api_type
"completions"
id
"gen-1777065700-8UxXbQTWkSOkq3wIO32p"
upstream_id
"resp_04569e495d19f0570169ebdee4a07881968f4f1e57fd8bb8e7"
total_cost
0.03453
cache_discount
(null)
upstream_inference_cost
0
provider_name
"OpenAI"
response_cache_source_id
(null)
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false; words: 824
3.23% | Dialogue to Total Word Ratio | Ratio: 38.29%, Deviation: 8.29% | neededClean: false; wordsTotal: 828; wordsDialogue: 317
1.6160% | Overall
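The ratio and deviation in the Details column can be reproduced from the reported word counts and the 30% target in the scenario. The following is a minimal sketch of that arithmetic. The function names are hypothetical, the deviation is assumed to be the absolute difference from the target, and the mapping from deviation to the 3.23% score is not shown because the page does not describe it.

```python
def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Return dialogue words as a percentage of all words."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return 100.0 * words_dialogue / words_total


def deviation_from_target(ratio_pct: float, target_pct: float) -> float:
    """Absolute gap, in percentage points, between ratio and target (assumed definition)."""
    return abs(ratio_pct - target_pct)


# Values taken from the Meta Data column: wordsDialogue = 317, wordsTotal = 828.
ratio = dialogue_ratio(317, 828)
# The scenario asks for 30% dialogue.
deviation = deviation_from_target(ratio, 30.0)

print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
```

With these inputs the sketch agrees with the reported Ratio of 38.29% and Deviation of 8.29%. Note that the two evaluators report slightly different totals (824 and 828 words), which suggests they tokenize the response differently.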