NC Bench
Run: 2026-04-24-01KQ0Q0VSXJZ771NSCZAW9SRYT
Run Details
Model:
GPT-5.5 (Reasoning)
Test:
Dialogue tags
Scenario:
Write 500 words with 30% dialogue
Duration:
28.1s
Cost:
$0.034470
Provider:
OpenAI
Evaluation summary
Score | Evaluator | Details Link
0.00% | Matches word count | (details)
46.68% | Dialogue to Total Word Ratio | (details)
23.3390% | Overall |
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly flattened the birthday invitation against the steering wheel and squinted at the glittery unicorn printed across the top. Outside, the community hall thumped with tiny feet, shrieks, and the sort of music that suggested a kazoo had swallowed a marching band. Through the window, Bill could see a mountain of presents on a folding table, each wrapped in paper brighter than a police siren. “That,” Molly said, tapping the invitation, “is our Everest.” Bill adjusted the cardboard dragon head under his arm. “Everest has fewer toddlers.” “Toddlers are just short security guards with jam on their faces.” “I’m not afraid of toddlers,” Bill said. “I’m concerned about the clown.” Across the car park, a clown in yellow trousers was unloading balloon animals from a hatchback, his painted smile fixed in a look of ancient suffering. Molly sniffed. “Professional rivalry?” “He looked at me. Clowns know things.” Molly ignored him and unfolded her plan, which was drawn on the back of a takeaway menu. There were arrows, stars, and one alarming sketch of a rabbit wearing night-vision goggles. She had labeled the present table TREASURE ISLAND. “We enter as entertainers,” she said. “You’re Barry the Birthday Dragon. I’m Princess Muffintop.” “Again, I object to the name.” “You lost naming rights when you glued sequins to your own elbow.” Bill raised his elbow. It sparkled accusingly. Inside the hall, children chanted something that sounded like a spell. A woman in a cardigan carried a cake shaped like a pirate ship, and twenty small humans followed her as if she were leading them to a promised land of icing. Bill swallowed. “What if they ask me to breathe fire?” “Cough dramatically. Say dragons are cutting down for health reasons.” “And the presents?” Molly’s eyes gleamed. “During the party game. Musical statues. Everyone freezes. We glide in, collect the parcels, glide out.” Bill stared at her. 
“You think children will freeze because music stops?” “It’s the law of parties.” “They’ll freeze for three seconds, then one will bite my tail.” Molly considered this. “Wear the thick trousers.” Bill looked down at his scaly costume legs. “These are curtains from your aunt’s conservatory.” “Exactly. Sturdy.” A sudden burst of laughter rattled the windows. A little girl in fairy wings pressed her face to the glass and stared directly at them. Molly smiled and waved with the smooth confidence of someone who had once escaped a museum dressed as a duke. Bill tried to wave too, but the dragon claw slipped from his hand and landed in his lap with a squeak. The fairy girl narrowed her eyes. “She’s made us,” Bill whispered. “She’s six.” “Six-year-olds are excellent witnesses. They remember shoes.” Molly folded the menu-plan and tucked it into her tiara. “Then we improvise.” Bill groaned. “Your improvising once involved a goose.” “And we got the necklace.” “The goose got the necklace.” Molly opened the van door. The noise from the hall rushed in: sugar, balloons, and impending chaos. She set her shoulders, adjusted her princess sash, and lifted a sack decorated with cheerful snowmen. Bill shuffled after her, dragon head tucked under one arm, tail dragging through a puddle. “For the record, I preferred banks.” “Banks don’t have gift receipts.” At the entrance, the birthday child’s mother beamed with desperate gratitude. “Oh, wonderful, you’re here! The entertainers canceled.” Molly curtsied. “Princess Muffintop never cancels.” Bill muttered, “She reschedules felonies.” The doors opened wide. Twenty children turned. The present table glittered beyond them like a forbidden altar. Molly smiled grandly. “Hello, darlings,” she cried. “Who wants a magical surprise?” Every hand shot up. Bill leaned close. “Molly, I think we’re the surprise.” For the first time all afternoon, Molly looked uncertain. 
Then a toddler hugged her leg, leaving a perfect handprint of chocolate on her skirt. She sighed. “Fine. We do one game. Then larceny.”
created_at
"2026-04-24T21:41:04.455Z"
4/24/2026, 9:41:04 PM
model
"openai/gpt-5.5-20260423"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
8265
moderation_latency
(null)
generation_time
27955
tokens_prompt
77
tokens_completion
1392
native_tokens_prompt
66
native_tokens_completion
1138
native_tokens_completion_images
0
native_tokens_reasoning
260
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.03447
router
(null)
provider_responses
0
endpoint_id
"58e5b336-423e-430b-a2ab-8bc353f0c51b"
id
"resp_05cd76a2ec608b470169ebe3709a188196ab8a25fbf32e8891"
is_byok
false
latency
383
model_permaslug
"openai/gpt-5.5-20260423"
provider_name
"OpenAI"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1777066864-HowfIx3WBRGFk5dK16Li"
session_id
(null)
api_type
"completions"
id
"gen-1777066864-u5X3evvUiTsHr0ZKgMTp"
upstream_id
"resp_05cd76a2ec608b470169ebe3709a188196ab8a25fbf32e8891"
total_cost
0.03447
cache_discount
(null)
upstream_inference_cost
0
provider_name
"OpenAI"
response_cache_source_id
(null)
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false; words: 641
46.68% | Dialogue to Total Word Ratio | Ratio: 33.90%, Deviation: 3.90% | neededClean: false; wordsTotal: 643; wordsDialogue: 218
23.3390% | Overall | |
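The reported ratio and deviation can be re-derived from the metadata counts (218 dialogue words out of 643 total, against the 30% target in the scenario). The sketch below does that arithmetic and also shows one way a dialogue ratio could be measured from raw text. The quote-matching rule, the whitespace tokenizer, and the names `count_words` and `dialogue_stats` are illustrative assumptions, not the benchmark's actual evaluator. The overall figure looks consistent with an unweighted mean of the two evaluator scores, but the page does not state this, so treat it as a guess. How the 46.68% score is mapped from the 3.90% deviation is not shown on the page and is not reproduced here.

```python
import re

TARGET_RATIO = 0.30  # from the scenario: "Write 500 words with 30% dialogue"


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (assumed tokenizer, not the evaluator's)."""
    return len(text.split())


def dialogue_stats(text: str, target: float = TARGET_RATIO) -> dict:
    """Return word counts, dialogue ratio, and absolute deviation from the target."""
    # Assumption: dialogue is any text between curly or straight double quotes.
    spans = re.findall(r'“([^”]*)”|"([^"]*)"', text)
    words_dialogue = sum(count_words(curly or straight) for curly, straight in spans)
    words_total = count_words(text)
    ratio = words_dialogue / words_total if words_total else 0.0
    return {
        "wordsTotal": words_total,
        "wordsDialogue": words_dialogue,
        "ratio": ratio,
        "deviation": abs(ratio - target),
    }


# Re-derive the figures reported on this page from its own metadata.
reported_ratio = 218 / 643                              # wordsDialogue / wordsTotal
reported_deviation = abs(reported_ratio - TARGET_RATIO)
overall_guess = (0.00 + 46.68) / 2                      # assumed unweighted mean

if __name__ == "__main__":
    print(f"ratio:     {reported_ratio:.2%}")      # 33.90%
    print(f"deviation: {reported_deviation:.2%}")  # 3.90%
    print(f"overall:   {overall_guess:.2f}%")      # 23.34%
```

The two evaluators report slightly different totals (641 and 643 words), which suggests they tokenize differently. A re-implementation like this one should not be expected to match either count exactly.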