NC Bench
Run: 2026-04-24-01KQ0NJ126TCNX4YMHCHWR8Y5C
Run Details
Model: GPT-5.5
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 26.7s
Cost: $0.030240
Provider: OpenAI
Evaluation summary
Score | Evaluator | Details Link
0.00% | Matches word count | (details)
31.81% | Dialogue to Total Word Ratio | (details)
15.9041% | Overall
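The overall figure appears to be the unweighted mean of the two evaluator scores. That is an assumption inferred from the numbers on this page, not a documented formula. A minimal sketch of the check, using the displayed (rounded) scores:

```python
# Displayed evaluator scores in percent, copied from the summary above.
scores = {
    "Matches word count": 0.00,
    "Dialogue to Total Word Ratio": 31.81,
}

# Assumption: the overall score is a plain average of the evaluator scores.
overall = sum(scores.values()) / len(scores)

print(f"Overall (recomputed): {overall:.4f}%")
```

The recomputed value is about 15.905%, which differs from the reported 15.9041% only in the last digits. That is consistent with the page rounding 31.81% from a slightly lower unrounded score.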
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly unfolded the crayon-smeared invitation on the diner table and smoothed it with the seriousness of a museum curator handling a lost map. “Seven-year-old. Saturday. Two o’clock. Pirate theme,” she said. Bill peered over his mug. “Pirate theme? That’s insulting. We’re professionals.” “We’re opportunists with a bus pass.” “We are specialists in gift relocation.” “You tripped over a welcome mat last week and apologized to it.” “It looked expensive.” Molly tapped the invitation. “Focus. The presents will be stacked by the bouncy castle, according to my source.” “You have a source at a children’s party?” “My niece. She’s six. Very observant. Also easily bribed with stickers.” Bill leaned back. “I don’t like it. Kids are unpredictable. Sticky hands. Sudden screaming. Direct eye contact.” “That’s why we go during the puppet show.” “There’s a puppet show?” “Captain Wobbles and the Treasure of Sharing.” Bill winced. “That title alone is security.” Molly slid a napkin across the table. On it, she had drawn a crooked backyard, a fence, a cake table, and what might have been a dragon. “What’s that?” “The family dog.” “That is not a dog. That’s a haunted sausage.” “His name is Pickles. My niece says he bites balloons, mailmen, and people dressed as sharks.” “Why would anyone be dressed as a shark at a pirate party?” “Children’s entertainment has lost all respect for historical accuracy.” Bill studied the napkin. “So we enter through the side gate, avoid Pickles, scoop the presents, and leave?” “Not scoop. Casually migrate.” “You want to casually migrate thirty wrapped boxes?” “With a wagon.” “A wagon is not casual.” “At a child’s party? Everything has wheels. Strollers, scooters, tiny cars. We blend in.” Bill folded his arms. “What are our disguises?” Molly smiled. “Face paint.” “No.” “Yes.” “I am not committing burglary as a tiger.” “You’re not committing burglary at all if you keep saying it that loud.” Bill glanced around the diner. 
The waitress refilled coffee two booths away and ignored them with the practiced calm of someone underpaid. “I’ll be a pirate,” he said. “Too obvious.” “It’s a pirate party.” “Exactly. Every parent will assume pirates are hired staff and ask you to entertain children.” Bill shuddered. “Fine. Tiger.” “Tabby cat. Less threatening.” “I have a naturally threatening presence.” “You once got trapped in a revolving door because you kept following the arrow.” “It was giving mixed signals.” Molly checked her watch. “We need roles. I distract the adults by complimenting the dip.” “Bold. Risky. Parents are proud of dip.” “You handle the wagon.” “What if a child asks what I’m doing?” “Say you’re helping.” “What if they ask who you are?” “Say Aunt Molly.” “There may be a real Aunt Molly.” “Then say Uncle Bill.” “I’m Bill.” “Excellent. Authenticity.” Bill sighed. “And if the birthday boy cries?” Molly’s expression softened for half a second, then hardened professionally. “We don’t take the present he’s holding.” “That’s our moral line?” “We have standards.” “Do we?” “We also leave the educational toys.” Bill nodded. “Nobody should suffer twice.” Outside, a balloon bobbed past the diner window, escaped from some unseen celebration. Molly and Bill watched it drift skyward. Bill said, “What if the presents are just socks and puzzles?” “Then we’ve stolen disappointment.” “And if they’re expensive?” “Then we’ve liberated consumerism.” He stared at the napkin again. “Molly, this plan has holes.” “All plans have holes. That’s how you climb through them.” “What about parents taking pictures? We’ll be in every photo.” “That’s why I bought oversized sunglasses.” “For a children’s party?” “Parents are tired. They won’t question anything if you carry juice boxes.” Bill rubbed his temples. “I hate how that’s true.” Molly gathered the napkin and invitation. “Tomorrow, two o’clock. Comfortable shoes. 
No cape.” “I wasn’t going to wear a cape.” “You were thinking about it.” “It was a tactical cloak.” “It was a cape.” Bill stood, dropped coins beside his mug, and pointed at her. “If I get bitten by Pickles, you’re carrying me.” “If you get bitten by Pickles, I’m telling everyone you’re part of the puppet show.” He paused at the door. “Molly?” “What?” “Can we steal one cupcake?” She considered. “One each.” “With sprinkles?” “We’re thieves, Bill. Not monsters.”
created_at: "2026-04-24T21:15:29.75Z" (4/24/2026, 9:15:29 PM)
model: "openai/gpt-5.5-20260423"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 507
moderation_latency: (null)
generation_time: 26490
tokens_prompt: 77
tokens_completion: 1386
native_tokens_prompt: 66
native_tokens_completion: 997
native_tokens_completion_images: 0
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.03024
router: (null)
provider_responses:
  0:
    endpoint_id: "58e5b336-423e-430b-a2ab-8bc353f0c51b"
    id: "resp_0c1daba6ad3a6c790169ebdd71e32c8190b6029cf81b670c13"
    is_byok: false
    latency: 253
    model_permaslug: "openai/gpt-5.5-20260423"
    provider_name: "OpenAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777065329-iafdyJ3C1uQs9MxGVNCJ"
session_id: (null)
api_type: "completions"
id: "gen-1777065329-2Qtio8eljeKmlFPs13ZB"
upstream_id: "resp_0c1daba6ad3a6c790169ebdd71e32c8190b6029cf81b670c13"
total_cost: 0.03024
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
response_cache_source_id: (null)
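The timing and token fields above can be combined into a rough throughput figure. The sketch below assumes that generation_time and latency are in milliseconds; the page does not state units, although 26490 ms is close to the reported 26.7s duration.

```python
# Values copied from the metadata above.
generation_time_ms = 26490        # assumed to be milliseconds
latency_ms = 507                  # assumed to be milliseconds
native_tokens_completion = 997

# Completion tokens generated per second of generation time.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

# Latency plus generation time, in seconds, under the same unit assumption.
end_to_end_s = (latency_ms + generation_time_ms) / 1000

print(f"{tokens_per_second:.1f} tokens/s, about {end_to_end_s:.1f}s end to end")
```

Under those assumptions the run produced roughly 37.6 native completion tokens per second.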
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 701
31.81% | Dialogue to Total Word Ratio | Ratio: 74.79%, Deviation: 4.79% | neededClean: false, wordsTotal: 702, wordsDialogue: 525
15.9041% | Overall
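The reported ratio and deviation follow directly from the word counts: 525 dialogue words out of 702 total, measured against the 70% target in the prompt. The sketch below reproduces that arithmetic. It also includes a hypothetical counter, `dialogue_stats`, that treats text inside curly quotes as dialogue and splits words on whitespace. The benchmark's actual tokenization and its mapping from deviation to score are not shown on this page, so neither is reproduced here.

```python
import re

def dialogue_stats(text: str, target: float = 0.70) -> dict:
    """Hypothetical counter: words inside “...” count as dialogue."""
    total = len(text.split())
    quoted = re.findall(r"“([^”]*)”", text)
    dialogue = sum(len(q.split()) for q in quoted)
    ratio = dialogue / total if total else 0.0
    return {
        "wordsTotal": total,
        "wordsDialogue": dialogue,
        "ratio": ratio,
        "deviation": abs(ratio - target),
    }

# Short excerpt from the model response, used only to exercise the counter.
sample = "“Seven-year-old. Saturday,” she said. Bill peered over his mug."
stats = dialogue_stats(sample)

# Cross-check of the reported figures from the counts on this page.
reported_ratio = 525 / 702
reported_deviation = abs(reported_ratio - 0.70)

print(stats)
print(f"Ratio: {reported_ratio:.2%}, Deviation: {reported_deviation:.2%}")
```

The two evaluators report slightly different totals (701 and 702 words), which suggests they split words differently. A whitespace split like the one above is only one possible choice.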