NC Bench
Run: 2026-04-24-01KQ0P76MFNV6KMWGPR7HYCTK9
Run Details
Model: GPT-5.5 (Reasoning)
Test: Dialogue tags
Scenario: Write 200 words with 50% dialogue
Duration: 39.1s
Cost: $0.080190
Provider: OpenAI
Evaluation summary
Score | Evaluator
100.00% | Matches word count
100.00% | Dialogue to Total Word Ratio
99.9979% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly unfolded the crayon map on the café table, where a sticker dinosaur marked Lily Pepper’s seventh birthday. Bill squinted over cocoa moustache. "That is not a plan," he said. "That is a rainbow having a seizure." "It is intelligence," Molly said. "My niece drew it under contract." "The presents are by a bouncy castle," Bill said. "Children bounce. We don't." Molly set down two paper hats and a wand, all stolen from the café's emergency whimsy basket. "We enter as entertainers," she said. "I entertain nobody." "You made a bailiff cry." "Accounting made him cry." Rain ticked at the window. On the map, Molly circled a square labeled CAKE in red jam. "We distract the magician, charm the grandmothers, ignore the terrier, collect the loot." "The loot has gift tags." "Temporary loot." "From children." "For cake ransom." Bill put on a hat. It split, drooping over one eye like festive seaweed. Molly nodded gravely. "Perfect. You look innocent." "I look concussed." "If anyone asks for a balloon animal, make a worm." "All balloon animals are worms." "Exactly. Confidence." "Bring the extremely large pockets," Molly said softly. Bill saluted bravely with his teaspoon. Molly pocketed the map, smelling buttercream and disaster.
created_at: "2026-04-24T21:27:03.576Z" (4/24/2026, 9:27:03 PM)
model: "openai/gpt-5.5-20260423"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 38067
moderation_latency: (null)
generation_time: 38383
tokens_prompt: 77
tokens_completion: 883
native_tokens_prompt: 66
native_tokens_completion: 2662
native_tokens_completion_images: 0
native_tokens_reasoning: 2372
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.08019
router: (null)
provider_responses:
0:
endpoint_id: "58e5b336-423e-430b-a2ab-8bc353f0c51b"
id: "resp_053d781428023c890169ebe027a99c8190bf78c60e100b7005"
is_byok: false
latency: 365
model_permaslug: "openai/gpt-5.5-20260423"
provider_name: "OpenAI"
status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777066023-f2rYajkzLQ7tEbbQ83ea"
session_id: (null)
api_type: "completions"
id: "gen-1777066023-tQzEidHze7T1jDOohxZp"
upstream_id: "resp_053d781428023c890169ebe027a99c8190bf78c60e100b7005"
total_cost: 0.08019
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
response_cache_source_id: (null)
Evaluation details
Result | Evaluator | Details | Meta Data
100.00% | Matches word count | n/a | neededClean: false, words: 201
100.00% | Dialogue to Total Word Ratio | Ratio: 50.75%, Deviation: 0.75% | neededClean: false, wordsTotal: 201, wordsDialogue: 102
99.9979% | Overall | |
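The ratio and deviation reported above follow from the two word counts: 102 dialogue words out of 201 total is about 50.75%, which is 0.75 points from the 50% target. The sketch below shows one way such a check could be computed. It is an illustration only, not NC Bench's evaluator: the function names are hypothetical, and it assumes words are split on whitespace and dialogue is whatever sits between straight double quotes, so it may count differently from the benchmark on the same text.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio_percent).

    Assumption: dialogue is any text enclosed in straight double quotes,
    and a word is any whitespace-separated token.
    """
    total = len(text.split())
    spoken = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(segment.split()) for segment in spoken)
    ratio = 100.0 * dialogue / total if total else 0.0
    return dialogue, total, ratio


def deviation_from_target(ratio_percent: float, target_percent: float = 50.0) -> float:
    """Absolute distance, in percentage points, from the target ratio."""
    return abs(ratio_percent - target_percent)


# Reproduce the reported figures from the counts in the Meta Data column.
reported_ratio = 100.0 * 102 / 201
print(f"Ratio: {reported_ratio:.2f}%, "
      f"Deviation: {deviation_from_target(reported_ratio):.2f}%")
```

Calling `dialogue_ratio` on a scene returns the raw counts alongside the percentage, which makes it easy to compare against the `wordsDialogue` and `wordsTotal` values shown in the table.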