NC Bench
Run: 2026-04-24-01KQ0N11YQDXKS40T7KW3SEMJP
Run Details
Model: GPT-5.5 (Reasoning)
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 29.3s
Cost: $0.035430
Provider: OpenAI
Evaluation summary
| Score | Evaluator |
| --- | --- |
| 0.00% | Matches word count |
| 1.45% | Dialogue to Total Word Ratio |
| 0.7237% | Overall |
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly spread the map across the dashboard with the solemnity of a general planning an invasion. Unfortunately, the map had been drawn on the back of a takeaway menu, in purple crayon, by Bill, who had labeled the garden as “cake zone” and the house as “castle of tiny screaming people.” Outside, balloons bobbed from the gate of number 42. A banner read HAPPY 6TH BIRTHDAY, SOPHIE! The shrieks from the garden suggested either a party game or a small revolution. “Right,” Molly said. “We go in, we blend in, we locate the present table, and we leave with dignity.” Bill peered through binoculars the wrong way round. “Everyone looks very far away and judgmental.” “That’s because you’re using those incorrectly.” “I’m creating emotional distance.” Molly snatched the binoculars and focused on the garden. Children in paper crowns charged around a bouncy castle. Adults guarded paper plates. On a table beneath a gazebo sat a mountain of wrapped gifts, glittering like treasure. “There,” Molly whispered. “The loot.” Bill swallowed. “Some of those have unicorn paper.” “Stay professional.” “I am professional. I’m professionally concerned about stealing from children.” Molly gave him a look. “Last week you stole a garden gnome from a vicar.” “The gnome looked unhappy.” “It was ceramic.” “Exactly. Trapped.” A child wearing fairy wings sprinted past the gate, pursued by a boy with frosting in his hair. Bill ducked below the dashboard. “They’re organized,” he said. “They’ve got scouts.” “They’re six.” “So were several historical conquerors, probably.” Molly opened a battered holdall and removed their disguises: two aprons, a floppy chef’s hat, and a badge reading PARTY HELPERS. Bill held up the hat. “Why do I have to be the chef?” “Because you already smell faintly of sausage rolls.” “That’s my natural charisma.” “The plan is simple,” Molly said. “We walk in carrying this tray of jelly cups. We act useful. 
While they’re distracted by the entertainer, we transfer the presents into the holdall.” Bill glanced toward the garden. A man dressed as a pirate was being pelted with foam swords by a mob of children. “I don’t think the entertainer is distracting them,” Bill said. “I think they’re defeating him.” Molly watched as the pirate attempted to make a balloon dog, only to produce what looked like a sad blue intestine. “Fine. Then the distraction is his suffering.” Bill leaned back, troubled. “What if one of the presents is socks? We’ll have risked everything for socks.” “We take the good ones.” “How do you know?” “Weight. Shape. Rattle.” “That’s not theft, that’s Christmas archaeology.” A sudden cheer rose from the garden. The birthday girl had appeared, crowned in tinsel, carrying a plastic wand. She touched each present reverently before arranging them straighter on the table. Bill’s face crumpled. “No,” Molly said. “She loves them.” “Bill.” “She believes in wrapping paper.” “So do I. It conceals value.” The little girl hugged a lumpy parcel almost as big as herself. Molly looked away first. Bill said softly, “We’re monsters with a catering tray.” Molly folded the crayon map, slower than necessary. The pirate in the garden tripped over a chair and was immediately celebrated as if it were part of the act. “We could steal the cake,” Bill suggested. Molly stared at him. “Too far?” “Even we have standards.” Five minutes later, Molly and Bill marched up the path in aprons, carrying jelly cups and wearing smiles so false they squeaked. At the gate, Molly paused. “New plan,” she muttered. Bill nodded. “We help, we eat one sandwich each, and we leave without committing birthday treason.” “Two sandwiches,” Molly said. “I’m not a saint.” The gate swung open, and thirty children turned to stare. Bill raised the tray like a peace offering. “Hello,” he said. “We’re here about the jelly.” The children cheered. Molly sighed, then smiled despite herself. 
Somewhere beneath the gazebo, the presents remained untouched, glittering accusingly in the sun.
created_at: "2026-04-24T21:06:13.602Z" (4/24/2026, 9:06:13 PM)
model: "openai/gpt-5.5-20260423"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 8981
moderation_latency: (null)
generation_time: 29193
tokens_prompt: 77
tokens_completion: 1444
native_tokens_prompt: 66
native_tokens_completion: 1170
native_tokens_completion_images: 0
native_tokens_reasoning: 271
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.03543
router: (null)
provider_responses:
  0:
    endpoint_id: "58e5b336-423e-430b-a2ab-8bc353f0c51b"
    id: "resp_03c6b663891030060169ebdb45af5c819484c1a21e8a61e0e8"
    is_byok: false
    latency: 350
    model_permaslug: "openai/gpt-5.5-20260423"
    provider_name: "OpenAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777064773-YxJir8RNJHQlh5KKD3Vz"
session_id: (null)
api_type: "completions"
id: "gen-1777064773-ecHNKfNuW795yCMTbyGW"
upstream_id: "resp_03c6b663891030060169ebdb45af5c819484c1a21e8a61e0e8"
total_cost: 0.03543
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
response_cache_source_id: (null)
Evaluation details
| Result | Evaluator | Details | Meta Data |
| --- | --- | --- | --- |
| 0.00% | Matches word count | n/a | neededClean: false; words: 653 |
| 1.45% | Dialogue to Total Word Ratio | Ratio: 39.20%, Deviation: 9.20% | neededClean: false; wordsTotal: 653; wordsDialogue: 256 |
| 0.7237% | Overall | | |
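The ratio and deviation reported for the dialogue evaluator can be reproduced from the word counts in the metadata. The sketch below is only an arithmetic check, not the evaluator's actual implementation: the function names are hypothetical, the 30% and 500-word targets are taken from the scenario text, and the formula that turns a deviation into a score is not shown on this page, so it is not modeled here.

```python
# Arithmetic check of the reported evaluation details.
# Function names are hypothetical; only the numbers come from the run page.

def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Share of dialogue words in the total, as a percentage."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return 100.0 * words_dialogue / words_total


def deviation_from_target(ratio_pct: float, target_pct: float) -> float:
    """Absolute distance between the observed and target percentages."""
    return abs(ratio_pct - target_pct)


WORDS_TOTAL = 653       # wordsTotal from the metadata
WORDS_DIALOGUE = 256    # wordsDialogue from the metadata
TARGET_RATIO_PCT = 30.0 # assumed from the scenario: "30% dialogue"
TARGET_WORDS = 500      # assumed from the scenario: "500 words"

ratio = dialogue_ratio(WORDS_DIALOGUE, WORDS_TOTAL)
deviation = deviation_from_target(ratio, TARGET_RATIO_PCT)
extra_words = WORDS_TOTAL - TARGET_WORDS

print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
print(f"Words over target: {extra_words}")
```

Run as written, this prints a ratio of 39.20% and a deviation of 9.20%, matching the Details column, and shows the response ran 153 words over the requested 500.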