NC Bench
Run: 2026-04-24-01KQ0NEMG7W568JHAWS3SBPP15
Run Details
Model: GPT-5.5 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 27.7s
Cost: $0.034320
Provider: OpenAI
Evaluation summary
Matches word count: 0.00%
Dialogue to Total Word Ratio: 57.38%
Overall: 28.6877%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the crayon-drawn map across the café table and pinned the corners with two muffins and Bill’s elbow. “Ow,” Bill said. “That’s the west gate.” “My elbow is the west gate?” “At present, yes. Try to look more fortified.” Bill peered at the map. “Why is the birthday child drawn as a dragon?” “Because she is turning seven, Bill. Seven-year-olds are basically dragons with better shoes.” “And the presents?” Molly tapped a pink square labeled LOOT MOUNTAIN. “Here. On the picnic table, between the juice boxes and the suspicious clown.” Bill shuddered. “I don’t like clowns.” “You don’t like balloons, streamers, cupcakes, or children singing in unison. Frankly, you’re overqualified for this job.” “I like cupcakes.” “You like frosting. You leave the cake bit like a criminal.” “I am a criminal.” “Not professionally. Professionally, you are a man who once got trapped in a revolving door while fleeing a stationery shop.” Bill straightened. “That door had tactics.” Molly leaned in. “Focus. We enter disguised as party entertainers.” “No.” “Yes.” “No, Molly.” “I have already bought the costumes.” Bill’s eyes narrowed. “What kind of costumes?” Molly lifted a large paper bag onto the table. A yellow duck head rolled out and stared at him with cheerful emptiness. Bill whispered, “You wouldn’t.” “I would. I did. You’re Quackers the Magical Duck.” “I am not Quackers.” “You are if you want half the presents.” “What are you?” “Princess Sparkleblade.” “That’s not fair. You get a sword.” “It’s foam.” “It’s still more dignified than a duck.” Molly patted his hand. “Dignity is what we spend after the heist.” Bill looked again at the map. “How are we meant to carry all the presents? There are twenty children invited. That’s at least twenty boxes, three gift bags, and one enormous teddy bear.” “We use the wagon.” “What wagon?” “The little red one by the sandbox.” “You mean the child’s wagon?” “It’s not a child’s wagon during the operation. 
It’s a tactical present relocation vehicle.” Bill frowned. “This feels low, even for us.” Molly blinked. “Bill, last week you stole a cheese wheel from a wedding.” “That cheese was unattended.” “It was on the bride’s table.” “She was dancing. The cheese looked lonely.” A bell jingled above the café door. Outside, in the park across the street, a woman in a party hat began tying balloons to a fence. Children chased each other in looping shrieks of delight. A small boy hugged a wrapped box nearly as big as himself. Bill watched him. “Look at that one.” “What about him?” “He’s protecting that present like it contains the crown jewels.” “It might contain a dinosaur.” Bill sighed. “Molly.” “What?” “Are we monsters?” Molly followed his gaze. The little boy tripped, dropped the present, and burst into tears. Three other children immediately ran over, offering napkins, a gummy worm, and what appeared to be a very sticky pinecone. Molly folded the map slowly. “We are not monsters.” “Good.” “We are opportunists with flexible morals.” “Molly.” She groaned. “Fine. We won’t steal the presents.” Bill relaxed. “Thank you.” “We’ll steal the cake.” “No.” “The balloons?” “No.” “The clown’s wallet?” “Molly.” She shoved the duck head back into the bag. “You’re impossible to burgle with when you develop a conscience.” “It happens every spring.” “Seasonal decency. Tragic.” Bill stood. “Come on.” “Where?” “To the party.” “I thought we weren’t robbing it.” “We’re not. But we have costumes, and that clown looks suspicious.” Molly grinned. “So we infiltrate as entertainers?” “To protect the presents.” “And if there are spare cupcakes?” “We accept payment in frosting.” Molly tucked the crayon map under her arm. “Quackers, this may be our most honorable heist.” Bill put on the duck head. His voice came muffled from inside. “If anyone asks, I’m undercover.” “Bill?” “Yes?” “You’re walking into a lamppost.” There was a hollow bonk. “I meant to do that,” said Quackers. 
“It’s called committing to the role.”
created_at: "2026-04-24T21:13:38.579Z" (4/24/2026, 9:13:38 PM)
model: "openai/gpt-5.5-20260423"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 7760
moderation_latency: (null)
generation_time: 27528
tokens_prompt: 77
tokens_completion: 1404
native_tokens_prompt: 66
native_tokens_completion: 1133
native_tokens_completion_images: 0
native_tokens_reasoning: 196
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.03432
router: (null)
provider_responses:
  0:
    endpoint_id: "58e5b336-423e-430b-a2ab-8bc353f0c51b"
    id: "resp_028a159b5429c6cd0169ebdd02b344819784fb1b1cec7f3a2c"
    is_byok: false
    latency: 305
    model_permaslug: "openai/gpt-5.5-20260423"
    provider_name: "OpenAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777065218-Pgeb5J20Fj3dcUY5Dk9c"
session_id: (null)
api_type: "completions"
id: "gen-1777065218-OaS4gLKlfaGP1WPF7Xre"
upstream_id: "resp_028a159b5429c6cd0169ebdd02b344819784fb1b1cec7f3a2c"
total_cost: 0.03432
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
response_cache_source_id: (null)
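The reported total_cost of 0.03432 can be sanity-checked against the native token counts in the metadata. The sketch below is only an illustration: the per-million-token rates are hypothetical values chosen because they reproduce the reported figure, not prices stated anywhere on this page, and it assumes the 1133 native completion tokens are billed as a single quantity (the page does not say how the 196 reasoning tokens are billed).

```python
# Token counts copied from the run metadata.
NATIVE_TOKENS_PROMPT = 66
NATIVE_TOKENS_COMPLETION = 1133

# ASSUMPTION: hypothetical USD rates per million tokens. They are not given
# in the report; they are picked because they reproduce total_cost.
ASSUMED_INPUT_USD_PER_MILLION = 5
ASSUMED_OUTPUT_USD_PER_MILLION = 30


def estimate_cost_usd(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the cost in USD from token counts and the assumed rates."""
    # Work in integer micro-dollars first to avoid float accumulation error.
    micro_usd = (
        prompt_tokens * ASSUMED_INPUT_USD_PER_MILLION
        + completion_tokens * ASSUMED_OUTPUT_USD_PER_MILLION
    )
    return micro_usd / 1_000_000


estimated = estimate_cost_usd(NATIVE_TOKENS_PROMPT, NATIVE_TOKENS_COMPLETION)
print(f"{estimated:.6f}")  # prints 0.034320
```

Under these assumed rates the prompt contributes $0.00033 and the completion $0.03399, which sums to the Cost shown in the run details.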
Evaluation details
Matches word count
Result: 0.00%
Details: n/a
Meta Data: neededClean: false, words: 654

Dialogue to Total Word Ratio
Result: 57.38%
Details: Ratio: 66.67%, Deviation: 3.33%
Meta Data: neededClean: false, wordsTotal: 657, wordsDialogue: 438

Overall: 28.6877%
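The ratio and deviation figures follow directly from the reported word counts, and the overall figure is consistent with a plain average of the two evaluator scores. The sketch below reproduces that arithmetic. The averaging rule is an inference from the displayed numbers, not something the page states, and the mapping from a 3.33% deviation to the 57.38% score is not documented here, so it is not reproduced. The function name and the 0.70 target constant are illustrative.

```python
# Word counts copied from the "Dialogue to Total Word Ratio" evaluator.
WORDS_DIALOGUE = 438
WORDS_TOTAL = 657

# Target taken from the scenario: 70% of the words should be dialogue.
TARGET_RATIO = 0.70


def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Fraction of all words that fall inside dialogue."""
    return words_dialogue / words_total


ratio = dialogue_ratio(WORDS_DIALOGUE, WORDS_TOTAL)
deviation = abs(TARGET_RATIO - ratio)

print(f"Ratio: {ratio:.2%}")          # prints Ratio: 66.67%
print(f"Deviation: {deviation:.2%}")  # prints Deviation: 3.33%

# ASSUMPTION: the overall score is the arithmetic mean of the evaluator
# scores. With the rounded scores shown on the page this gives 28.69, close
# to the displayed 28.6877, which suggests the unrounded ratio score is
# about 57.3754.
evaluator_scores = [0.00, 57.38]
overall = sum(evaluator_scores) / len(evaluator_scores)
print(f"Overall: {overall:.2f}%")     # prints Overall: 28.69%
```

The two evaluators report slightly different totals (654 words versus 657), presumably because they count words differently. Either total is well above the 500 words the scenario asks for.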