NC Bench
Run: 2026-04-24-01KQ0MZ41AJ5QD3Z755SMAZNS4
Run Details
Model: GPT-5.5 (Reasoning, Low)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 29.9s
Cost: $0.033120
Provider: OpenAI
Evaluation summary
Matches word count: 0.00%
Dialogue to Total Word Ratio: 9.97%
Overall: 4.9841%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly unfolded the crayon-smeared invitation on the café table like it was a royal map. A purple dinosaur smiled up at them, wearing a party hat and holding a balloon that said: OLIVER IS SIX!
Bill squinted. “That dinosaur knows too much.”
“That dinosaur,” Molly said, tapping the paper, “is our ticket to the biggest present pile in North Finchley.”
“It’s a children’s party, Molly.”
“It’s a fortress of wrapping paper and consumer excess.”
“It’s in a church hall.”
“Exactly. No one suspects crime near squash and fairy cakes.”
Bill stirred his tea with the tiny spoon he always carried for emergencies. “I’m not saying no. I’m saying we need standards.”
“You once stole a wheel of cheese from a petting zoo.”
“That goat was asking for it.”
Molly leaned in. “Listen. Thirty children. Two parents each. Grandparents. Aunties. Uncles. That’s potentially ninety presents.”
“Or one very large present from a rich uncle.”
“Either way, we win.”
Bill looked again at the invitation. “It says fancy dress. Theme: pirates and princesses.”
Molly smiled. “Perfect.”
“No. Absolutely not.”
“Yes.”
“I’m not wearing the wig again.”
“You looked regal.”
“I looked like a haunted spaniel.”
“You’ll be a pirate, then. I’ll be a princess.”
“You always get to be the princess.”
“Because I have leadership qualities and a tiara-shaped head.”
Bill sighed. “Fine. Suppose we go in costume. What’s the plan?”
Molly lowered her voice, though the only other customer was an elderly man arguing with a scone. “We enter at two fifteen, when the magician is setting up.”
“There’s a magician?”
“According to intelligence.”
“Your intelligence is a six-year-old’s invitation.”
“It says ‘Magic Mike.’”
Bill choked on his tea. “That can’t be right for children.”
“It’s probably Michael. Local. Card tricks. Rabbit. Sad waistcoat.”
“Continue.”
“We mingle. We laugh. We say, ‘Lovely party, isn’t it?’ Then, while everyone watches the rabbit disappoint them, we relocate the presents.”
“Relocate?”
“To our van.”
“We don’t have a van.”
“To your hatchback.”
“My hatchback still smells of cheese.”
“That’s called camouflage.”
Bill drummed his fingers. “What if someone asks who we are?”
“We’re friends of the family.”
“Which family?”
“The birthday family.”
“What’s their surname?”
Molly checked the invitation and frowned. “It just says Oliver.”
“So we’re friends of the Olivers?”
“We’re from school.”
“We are visibly forty.”
“Teachers, then.”
“You want to rob children while pretending to be teachers?”
“It has gravitas.”
Bill leaned back. “This is spiraling.”
“This is planning.”
“This is morally complicated.”
Molly blinked. “Bill, last week you pickpocketed a mime.”
“He trapped me in an invisible box.”
“The presents will be insured.”
“They’re not crown jewels. They’re plastic dinosaurs and sticker books.”
“Exactly. Easy to carry.”
Bill rubbed his face. “What if Oliver cries?”
“Children cry all the time. It’s practically their hobby.”
“What if he opens nothing? What if he stands there in his tiny pirate hat, surrounded by cake crumbs, asking why Father Christmas hates April?”
Molly hesitated.
Bill pointed at her. “You’re imagining it.”
“I am not.”
“You are. His little lip wobbling.”
“Stop it.”
“His grandmother saying, ‘Never mind, dear, perhaps the princess took them by mistake.’”
Molly folded the invitation. “You fight dirty.”
“I fight accurately.”
There was a pause. Outside, rain began tapping the window like impatient fingers.
Molly said, “All right. Revision.”
Bill narrowed his eyes. “What sort of revision?”
“We still attend.”
“No stealing?”
“No stealing from children.”
“Good.”
“We steal from the magician.”
Bill considered this. “What’s he got?”
“Doves. Coins. Maybe a collapsible wand.”
“I don’t want a dove.”
“Fine. We don’t steal. We case the joint.”
“For what?”
“For leftover cake.”
Bill’s expression softened. “Cake isn’t stealing if it’s abandoned.”
“Exactly.”
“And if there are party bags?”
“We accept only if offered.”
“And we don’t make Oliver cry.”
“Unless the magician is terrible.”
Bill picked up the invitation. “So the plan is: dress as pirates and princesses, infiltrate a six-year-old’s party, eat cake, applaud politely, and leave with legally obtained sweets?”
Molly raised her teacup. “The perfect crime.”
Bill clinked his cup against hers. “Molly, sometimes I worry we’re getting soft.”
“Nonsense. We’re evolving.”
“To what?”
She smiled. “Catered criminals.”
created_at: "2026-04-24T21:05:10.195Z" (4/24/2026, 9:05:10 PM)
model: "openai/gpt-5.5-20260423"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 4715
moderation_latency: (null)
generation_time: 29839
tokens_prompt: 77
tokens_completion: 1541
native_tokens_prompt: 66
native_tokens_completion: 1093
native_tokens_completion_images: 0
native_tokens_reasoning: 73
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.03312
router: (null)
provider_responses:
0:
endpoint_id: "58e5b336-423e-430b-a2ab-8bc353f0c51b"
id: "resp_060ea2145895463a0169ebdb0649648190a7f908d13a94af20"
is_byok: false
latency: 366
model_permaslug: "openai/gpt-5.5-20260423"
provider_name: "OpenAI"
status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777064710-MJAxZEiKANgZEYqpQE7x"
session_id: (null)
api_type: "completions"
id: "gen-1777064710-2RLu9QStFW7dYtBX1Jmk"
upstream_id: "resp_060ea2145895463a0169ebdb0649648190a7f908d13a94af20"
total_cost: 0.03312
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
response_cache_source_id: (null)
Evaluation details
Matches word count
Result: 0.00%
Details: n/a
Meta Data: neededClean: false, words: 692
Dialogue to Total Word Ratio
Result: 9.97%
Details: Ratio: 76.79%, Deviation: 6.79%
Meta Data: neededClean: false, wordsTotal: 698, wordsDialogue: 536
Overall
Result: 4.9841%
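The reported ratio and deviation can be re-derived from the metadata counts. The sketch below is a minimal check in Python, not NC Bench's own evaluator code. It assumes the 70% target comes from the scenario text and that the overall figure is the unweighted mean of the two evaluator scores. The variable names are illustrative. The page does not show how a 6.79% deviation maps to a 9.97% score, so that step is not reproduced here.

```python
# Values copied from the evaluation details of this run.
words_dialogue = 536
words_total = 698
target_ratio_pct = 70.0  # assumption: taken from the scenario "70% dialogue"

# Dialogue share of all words, and its distance from the target.
ratio_pct = 100.0 * words_dialogue / words_total
deviation_pct = abs(ratio_pct - target_ratio_pct)

# Assumption: Overall is the plain mean of the two evaluator scores.
# Under that assumption, the unrounded dialogue score can be backed out.
overall_pct = 4.9841
word_count_score_pct = 0.0
implied_dialogue_score_pct = 2 * overall_pct - word_count_score_pct

print(f"Ratio: {ratio_pct:.2f}%")
print(f"Deviation: {deviation_pct:.2f}%")
print(f"Implied dialogue score: {implied_dialogue_score_pct:.2f}%")
```

Under these assumptions the script reproduces the page's 76.79% ratio and 6.79% deviation, and the implied dialogue score rounds to the displayed 9.97%, which is consistent with the overall figure being a simple average.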