NC Bench
Run: 2026-04-24-01KQ0NBW97GFYM0YBW8E64A4N2
Run Details
Model: GPT-5.5
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 28.9s
Cost: $0.031770
Provider: OpenAI
Evaluation summary
Score | Evaluator | Details Link
0.00% | Matches word count | (details)
97.24% | Dialogue to Total Word Ratio | (details)
48.6195% | Overall |
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly unfolded the crayon-drawn map on the dashboard and pinned the corners with two stale doughnuts. Across the street, balloons bobbed on the mailbox of Number 14, and a cardboard sign shouted: HAPPY 6TH BIRTHDAY, ARCHIE! Bill squinted through the windshield. “That bouncy castle is new.” “It’s not a bouncy castle,” Molly said. “It’s an inflatable surveillance blind spot.” “It has a giraffe on it.” “A giraffe with excellent cover potential.” Bill scratched his chin. “I still think this is low.” “It’s a reconnaissance exercise.” “You said we were stealing all the presents.” “I said we were liberating unguarded consumer goods from a hostile sugar environment.” “That’s worse.” Molly tapped the map. “Listen carefully. At fourteen hundred hours, the children will be distracted by the magician.” “What if he’s good?” “He calls himself the Great Zucchini.” “What if he’s very good?” “Then we steal him too and sell him to a cruise ship.” Bill looked horrified. “Molly.” “Kidding. Mostly.” A shriek of laughter erupted from the garden, followed by a grown man saying, “No, don’t put cake in the bubble machine!” Bill winced. “They have cake-based defenses.” “Every job has risks.” “This isn’t a job. This is a birthday party.” “Exactly. Nobody suspects two adults carrying a laundry basket.” “They might suspect us if the basket is full of wrapped dinosaurs and glitter unicorns.” “That’s why we use the decoy.” Bill looked into the back seat. A teddy bear the size of a small landlord stared back at him, wearing sunglasses. “You put sunglasses on Mr. Snuffles.” “He’s our inside man.” “He’s a bear.” “So was my last accountant, and he got us through customs.” Bill leaned closer to the map. “What’s this red X?” “Present table.” “And the blue squiggle?” “Chocolate fountain.” “You marked the chocolate fountain more clearly than the exit.” “Priorities keep people alive.” “No, exits keep people alive.” Molly peered over the steering wheel. “Movement at the gate. 
Small pirate, two princesses, one dinosaur. No adult supervision.” “They’re six.” “Exactly. Unpredictable.” Bill sighed. “Maybe we could just not steal from children.” Molly stared at him. “Are you going soft?” “I’m going reasonable.” “You once stole a chandelier from a hotel lobby during a wedding.” “That chandelier was rude.” “It sparkled at you?” “It judged me.” Molly lowered her voice. “Bill, focus. We slip in as entertainers.” “We don’t have costumes.” She reached under her seat and produced two red noses. “No.” “Yes.” “I am not committing party fraud dressed as a clown.” “Not fraud. Character work.” “I have principles.” “You ate evidence yesterday.” “It was a sandwich.” “It was Exhibit B.” Bill folded his arms. “What’s the getaway plan?” “We exit through the side gate, load the presents into the pram, and stroll away.” “We don’t have a baby.” Molly pointed to the teddy bear. “Mr. Snuffles is age-fluid.” “That is not a baby. That is a bear with tax problems.” From the house came a chorus of children shouting, “Presents! Presents! Presents!” Molly’s eyes narrowed. “They’re moving the timetable up.” Bill looked panicked. “Good. Once the children open the presents, there’s nothing to steal.” “That is defeatist.” “That is moral.” Molly grabbed the laundry basket. “Come on.” Bill grabbed her sleeve. “Wait.” “What?” “What if we steal something else?” “Like what?” “The leftover sausage rolls. Maybe the party bags. Adults won’t cry about those.” “Children will absolutely cry about party bags.” “Fine. We steal the magician’s rabbit if it looks unhappy.” Molly hesitated. “Rabbit rescue?” “Rabbit rescue. And sausage rolls.” She considered the house. A child screamed, “The rabbit did a wee in Dad’s shoe!” Bill nodded gravely. “That rabbit is sending a message.” Molly put the laundry basket down. “All right. New plan. We infiltrate, assess the rabbit’s working conditions, acquire snacks, and leave the presents.” Bill smiled. 
“That’s growth.” “That’s diversification.” They climbed out. Molly stuck on her red nose. Bill reluctantly did the same. At the gate, a woman with frosting on her elbow rushed toward them. “Oh thank goodness, are you the clowns?” Molly bowed. “Madam, we are whichever clowns you require.” Bill muttered, “Ethically flexible clowns.” The woman shoved a tray into his hands. “Take these sausage rolls to the garden. And please keep the children away from the rabbit. He’s unionizing.” Molly beamed at Bill. Bill whispered, “Best heist ever.” “Don’t get sentimental,” Molly whispered back. “We haven’t met the rabbit.”
created_at: "2026-04-24T21:12:08.253Z" (4/24/2026, 9:12:08 PM)
model: "openai/gpt-5.5-20260423"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1136
moderation_latency: (null)
generation_time: 28778
tokens_prompt: 77
tokens_completion: 1442
native_tokens_prompt: 66
native_tokens_completion: 1048
native_tokens_completion_images: 0
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.03177
router: (null)
provider_responses:
  0:
    endpoint_id: "58e5b336-423e-430b-a2ab-8bc353f0c51b"
    id: "resp_05e6fb7e671a32450169ebdca877dc8194ad4522b20554d428"
    is_byok: false
    latency: 520
    model_permaslug: "openai/gpt-5.5-20260423"
    provider_name: "OpenAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777065128-h7WQ914ammkL5AoKJS9I"
session_id: (null)
api_type: "completions"
id: "gen-1777065128-WvdFpseoj5eCQ3IbAhYi"
upstream_id: "resp_05e6fb7e671a32450169ebdca877dc8194ad4522b20554d428"
total_cost: 0.03177
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
response_cache_source_id: (null)
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 732
97.24% | Dialogue to Total Word Ratio | Ratio: 69.25%, Deviation: 0.75% | neededClean: false, wordsTotal: 735, wordsDialogue: 509
48.6195% | Overall | |
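The reported ratio and deviation can be reproduced from the word counts in the meta data. The sketch below is only an illustration: the 70% target is taken from the scenario text, and the unweighted mean used for the overall figure is an assumption that happens to agree with the displayed numbers, not a documented NC Bench formula. How the 0.75% deviation is mapped to the 97.24% evaluator score is not stated on this page, so it is not modeled here.

```python
# Values copied from the evaluation details of this run.
words_dialogue = 509
words_total = 735
target_ratio = 0.70  # from the scenario: "Write 500 words with 70% dialogue"

# Share of words that fall inside dialogue, and its distance from the target.
ratio = words_dialogue / words_total
deviation = abs(target_ratio - ratio)

# Assumption: the overall score is the unweighted mean of the evaluator scores.
# The scores below are the rounded percentages shown on the page.
evaluator_scores = [0.00, 97.24]
overall = sum(evaluator_scores) / len(evaluator_scores)

print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
print(f"Overall: {overall:.2f}%")
```

Under these assumptions the overall comes out at 48.62%, in line with the displayed 48.6195% once the rounding of the 97.24% input is taken into account. The two evaluators also report slightly different totals (732 words versus 735), which suggests they tokenize the response differently.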