NC Bench
Run: 2026-04-24-01KQ0PEVFWRJQDNVPNSZVQ5EBG
Run Details
Model: GPT-5.5
Test: Dialogue tags
Scenario: Write 200 words with 10% dialogue
Duration: 11.3s
Cost: $0.009330
Provider: OpenAI
Evaluation summary
Score | Evaluator
75.16% | Matches word count
96.35% | Dialogue to Total Word Ratio
85.7507% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 10% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly crouched behind the hedge, binoculars trained on the church hall where balloons bobbed like suspiciously cheerful sentries. Bill lay beside her in a borrowed clown suit, sweating glitter and regret. Inside, twenty sugar-charged children circled a mountain of wrapped presents. The cake was taller than most of them and decorated with a pirate ship, which Molly considered insulting to professional criminals. “Plan?” Bill whispered. “Enter, distract, exit,” Molly said. “With presents?” “Obviously.” Their getaway wagon waited in the alley, painted with the words FUNTIME PUPPET MINISTRY because Bill had misunderstood “low profile.” Molly adjusted her party hat and checked the inventory: smoke pellets, gift sacks, two juice boxes, and a small plastic kazoo for emergencies. At precisely two o’clock, the magician arrived, tripped over his rabbit cage, and released four doves into the punch. Molly smiled. Chaos had RSVP’d. They slipped through the side door. A toddler immediately handed Bill a paper crown and declared him King Banana. Bill bowed, which earned applause and sticky fingerprints. “Focus,” Molly hissed. “I’m beloved.” Molly reached for the nearest present, but it began barking. The tag read: To Emma, Love Grandma. Air holes. Molly froze. Bill lifted another box; it meowed furiously. Outside, sirens wailed. Molly sighed. Children’s parties, she reflected, had dangerously advanced security.
created_at: "2026-04-24T21:31:14.308Z" (4/24/2026, 9:31:14 PM)
model: "openai/gpt-5.5-20260423"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 733
moderation_latency: (null)
generation_time: 11207
tokens_prompt: 77
tokens_completion: 390
native_tokens_prompt: 66
native_tokens_completion: 300
native_tokens_completion_images: 0
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.00933
router: (null)
provider_responses:
  0:
    endpoint_id: "58e5b336-423e-430b-a2ab-8bc353f0c51b"
    id: "resp_063a9473fa163ab80169ebe12262848193ae977687c8a1d65a"
    is_byok: false
    latency: 231
    model_permaslug: "openai/gpt-5.5-20260423"
    provider_name: "OpenAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777066274-BXIovBN8hwSrm6XZcHUo"
session_id: (null)
api_type: "completions"
id: "gen-1777066274-1foWYG3MGtifelnUEnvE"
upstream_id: "resp_063a9473fa163ab80169ebe12262848193ae977687c8a1d65a"
total_cost: 0.00933
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
response_cache_source_id: (null)
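The metadata above is enough to estimate generation throughput. The following is a minimal sketch, assuming that generation_time is reported in milliseconds (this is not stated on the page, though 11207 is close to the 11.3s duration shown in the run details). The function name is hypothetical and chosen for illustration only.

```python
def tokens_per_second(completion_tokens: int, generation_time_ms: float) -> float:
    """Estimate throughput from a token count and an elapsed time in milliseconds."""
    if generation_time_ms <= 0:
        raise ValueError("generation_time_ms must be positive")
    return completion_tokens / (generation_time_ms / 1000.0)


# Values copied from the metadata: native_tokens_completion and generation_time.
# Treating generation_time as milliseconds is an assumption.
throughput = tokens_per_second(300, 11207)
print(f"{throughput:.2f} tokens/s")
```

Using native_tokens_completion (300) rather than tokens_completion (390) is a judgment call: the native count presumably reflects the provider's own tokenizer, which is what the provider actually generated.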
Evaluation details
Result | Evaluator | Details | Meta Data
75.16% | Matches word count | n/a | neededClean: false, words: 213
96.35% | Dialogue to Total Word Ratio | Ratio: 5.61%, Deviation: 4.39% | neededClean: false, wordsTotal: 214, wordsDialogue: 12
85.7507% | Overall
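The ratio and deviation reported by the dialogue evaluator can be reproduced from its metadata. The sketch below is an illustration under stated assumptions, not the evaluator's actual code: it assumes dialogue is any text inside curly double quotes, that words are whitespace-separated tokens, and that the deviation is the absolute distance from the 10% target in the scenario. All function names are hypothetical. How the 96.35% score is derived from the deviation is not shown on the page and is not modeled here.

```python
import re


def count_words(text: str) -> int:
    """Count whitespace-separated tokens."""
    return len(text.split())


def count_dialogue_words(text: str) -> int:
    """Count words that fall inside curly double quotes (assumed dialogue marker)."""
    quoted_spans = re.findall(r"“([^”]*)”", text)
    return sum(count_words(span) for span in quoted_spans)


def dialogue_ratio(words_dialogue: int, words_total: int, target_pct: float = 10.0):
    """Return (ratio in percent, absolute deviation from target), rounded to 2 places."""
    ratio = 100.0 * words_dialogue / words_total
    return round(ratio, 2), round(abs(target_pct - ratio), 2)


# Short excerpt from the model response, used only to show the counting step.
sample = "“Plan?” Bill whispered. “Enter, distract, exit,” Molly said."
print(count_dialogue_words(sample), count_words(sample))

# Values copied from the evaluator metadata: wordsDialogue and wordsTotal.
print(dialogue_ratio(12, 214))
```

With wordsDialogue = 12 and wordsTotal = 214, this gives a ratio of 5.61% and a deviation of 4.39%, matching the Details column. The overall figure of 85.7507% is consistent, to within rounding of the displayed values, with a plain mean of the two evaluator scores, though the page does not state how it is aggregated.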