NC Bench
Run: 2026-03-18-01KM07KEDTAB3259FR0Q22AEM1
Run Details
Model: GPT-5.4 Nano (Reasoning, Low)
Test: Dialogue tags
Scenario: Write 200 words with 90% dialogue
Duration: 3.5s
Cost: $0.000642
Provider: OpenAI
Evaluation summary
Evaluator: Score
Matches word count: 0.00%
Dialogue to Total Word Ratio: 36.79%
Overall: 18.3940%
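The Overall figure is consistent with an unweighted mean of the two evaluator scores. The page does not state how the aggregate is computed, so the sketch below is an inference from the numbers shown, and the function name `overall_score` is hypothetical.

```python
def overall_score(evaluator_scores):
    """Unweighted mean of evaluator scores, in percent.

    Assumption: the benchmark averages its evaluators equally.
    This matches the figures on this page but is not documented there.
    """
    return sum(evaluator_scores) / len(evaluator_scores)


# Scores as displayed: Matches word count, Dialogue to Total Word Ratio.
displayed = [0.00, 36.79]
print(f"{overall_score(displayed):.3f}%")
```

Averaging the rounded 36.79% gives a value within about 0.001 of the reported 18.3940%, which suggests the page averages the unrounded ratio score.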
Model input
User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
AI
“Molly, the party’s at Juniper Street, right?” “Yep, and the guests are tiny, loud, and armed with juice boxes.” “Perfect. We steal presents, not joy.” “Sure, Bill. Tell that to your conscience—he hides behind your left pant leg.” “I don’t have a conscience.” “You have a conscience-shaped hole. Now listen: we enter through the side gate.” “The one with the wind chime that squeaks like a haunted parrot?” “Exactly. I brought fishing line.” “You brought fishing line to silence wind chimes?” “I’m resourceful. You’re resourceful, too, in the way a shopping cart is ‘portable’.” “Thanks. What’s the plan for the decorations?” “We wear the inflatable birthday suits.” “Which ones?” “The ones we borrowed from the bodega and returned with less helium and more suspicious dents.” “Won’t the kids notice two adults in a saggy llama suit?” “They’ll assume it’s the ‘surprise mascot.’ Kids believe anything if you wave dramatically.” “I can wave.” “You can over-wave. Now, how do we open the gift table?” “We don’t. We ‘mysteriously reorganize’ it.” “Sounds like theft with jazz hands.” “Jazz hands are nonviolent.” “And the security?” “Sugar-free cupcakes in the foyer. No guard exists beyond frosting.” “Alright. Timers?” “On my mark.” “Your mark is where?” “At the moment everyone sings. They’ll be distracted by carols and glitter.” “Glitter is basically a crime-proofing camouflage.” “Let’s do this, Bill.” “After you, Molly.” “I always go first—risk management.” “Call it what you want. Just promise you won’t lick the frosting.” “I don’t lick frosting.” “You licked frosting last time.” “That was espionage.” “Fine. Two seconds—count.” “Three… two… one… go!”
Meta
created_at: "2026-03-18T10:24:30.662Z" (3/18/2026, 10:24:30 AM)
model: "openai/gpt-5.4-nano-20260317"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 2746
moderation_latency: (null)
generation_time: 3132
tokens_prompt: 77
tokens_completion: 656
native_tokens_prompt: 66
native_tokens_completion: 503
native_tokens_completion_images: 0
native_tokens_reasoning: 64
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.00064195
router: (null)
provider_responses:
  0:
    endpoint_id: "0c835f2e-c18d-4e8c-b245-e1e3bd08b97f"
    id: "resp_08fadc58c6a882160169ba7d5ecea08196af4e48738bd508af"
    is_byok: false
    latency: 193
    model_permaslug: "openai/gpt-5.4-nano-20260317"
    provider_name: "OpenAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1773829470-r5ZJQfFX47uOTj9qG7tp"
upstream_id: "resp_08fadc58c6a882160169ba7d5ecea08196af4e48738bd508af"
total_cost: 0.00064195
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
Evaluation details
Matches word count
  Result: 0.00%
  Details: n/a
  Meta Data: neededClean: false; words: 265
Dialogue to Total Word Ratio
  Result: 36.79%
  Details: Ratio: 100.00%, Deviation: 10.00%
  Meta Data: neededClean: false; wordsTotal: 268; wordsDialogue: 268
Overall: 18.3940%
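The ratio evaluator reports a 100.00% dialogue ratio against the 90% target, a 10.00% deviation, and a score of 36.79%. A minimal sketch of how such a check could work follows. It makes three assumptions that the page does not confirm: words are split on whitespace, dialogue is whatever sits between curly quotes, and the score decays as exp(-deviation / 0.10). The decay form is chosen only because e^-1 is about 0.3679, which matches this single data point. All function names and parameters are hypothetical.

```python
import math
import re


def dialogue_ratio(text, open_q="\u201c", close_q="\u201d"):
    """Fraction of whitespace-separated words that fall inside quotes."""
    total = len(text.split())
    if total == 0:
        return 0.0
    # Non-greedy match of each quoted span, including across line breaks.
    pattern = re.escape(open_q) + r"(.*?)" + re.escape(close_q)
    spoken = re.findall(pattern, text, flags=re.DOTALL)
    dialogue = sum(len(span.split()) for span in spoken)
    return dialogue / total


def ratio_score(ratio, target=0.90, scale=0.10):
    """Assumed scoring: exponential decay in the absolute deviation."""
    deviation = abs(ratio - target)
    return math.exp(-deviation / scale)


sample = "\u201cReady, Bill?\u201d Molly whispered. \u201cAlways.\u201d"
ratio = dialogue_ratio(sample)
print(f"ratio={ratio:.2%} score={ratio_score(ratio):.2%}")
```

Under these assumptions an all-dialogue response like the one in this run scores roughly 36.79%, in line with the reported result. The benchmark's own tokenizer evidently differs between evaluators (265 words in one, 268 in the other), so exact counts from this sketch may not match.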