NC Bench
Run: 2026-03-18-01KM0AMB2Z0ETW88XT4ST9MA1Y
Run Details
Model: GPT-5.4 Mini (Reasoning, Low)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 5.1s
Cost: $0.003897
Provider: OpenAI
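The cost and timing above can be related to the token counts listed in this run's Meta fields. The following is a minimal sketch, not the site's own calculation: it assumes `generation_time` is reported in milliseconds, and the helper names `tokens_per_second` and `blended_cost_per_1k` are hypothetical. The "blended" figure divides total cost by prompt plus completion tokens, so it is not a per-direction price.

```python
# Values copied from the Meta fields of this run.
NATIVE_TOKENS_PROMPT = 66
NATIVE_TOKENS_COMPLETION = 855
GENERATION_TIME_MS = 4879  # assumption: the unit is milliseconds
TOTAL_COST_USD = 0.003897


def tokens_per_second(tokens: int, elapsed_ms: int) -> float:
    """Return generation throughput in tokens per second."""
    if elapsed_ms <= 0:
        raise ValueError("elapsed_ms must be positive")
    return tokens / (elapsed_ms / 1000.0)


def blended_cost_per_1k(cost_usd: float, total_tokens: int) -> float:
    """Return total cost divided evenly over all tokens, per 1,000 tokens."""
    if total_tokens <= 0:
        raise ValueError("total_tokens must be positive")
    return cost_usd / total_tokens * 1000.0


throughput = tokens_per_second(NATIVE_TOKENS_COMPLETION, GENERATION_TIME_MS)
blended = blended_cost_per_1k(
    TOTAL_COST_USD, NATIVE_TOKENS_PROMPT + NATIVE_TOKENS_COMPLETION
)

print(f"{throughput:.1f} completion tokens/s")
print(f"${blended:.6f} per 1k tokens (blended)")
```

Derived figures like these make runs of different lengths easier to compare than raw duration or cost alone.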
Evaluation summary
Score | Evaluator
0.00% | Matches word count
8.54% | Dialogue to Total Word Ratio
4.2683% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
“Molly, tell me again why we’re doing this,” Bill whispered, crouched behind a hedge and peering at the brightly lit house. “Because,” Molly said, adjusting her black gloves with grave importance, “children’s parties are chaos, chaos is cover, and cover is how we get rich.” “Rich by stealing six-year-olds’ birthday gifts.” “Don’t say it like that. It sounds tacky.” “It is tacky.” “Bill, please. We are artists.” “Artists don’t usually target a moon-bounce and a clown named Gerald.” Molly squinted toward the backyard. “Gerald’s the weak link. He keeps trying to juggle oranges near the punch bowl.” “That’s not a weak link. That’s a lawsuit.” “Exactly. Everyone’s distracted.” Bill leaned closer. “What if one of the kids sees us?” “They won’t. Children are tiny, loud, and deeply self-absorbed.” “That’s cold.” “It’s accurate. Now, according to the plan, we enter through the side gate, stroll in like we belong there, and blend with the parents.” Bill frowned. “I don’t know how to blend with parents.” “Then nod a lot and hold a paper plate.” “A paper plate?” “Every adult at a children’s party is holding one. It’s the law.” Bill looked down at his hands. “I don’t have a paper plate.” “Then take one when we get inside.” “And if someone asks our names?” Molly gave him a dazzling smile. “We are the Pettersons.” “You said that with too much confidence.” “Confidence is half the costume.” “The other half?” Molly patted the oversized cake she’d hidden under her coat. “A cake. We brought a cake, Bill. No one questions people carrying a cake.” “That cake has a suspiciously hollow center.” “It’s for the presents.” “You hollowed out a cake for the presents?” “It’s a classic concealment technique.” “I’m pretty sure classic concealment involves bags.” “Boring bags. We’re here to make history.” Bill glanced again at the party. “I can hear sing-songing.” “Yes, they’re doing the birthday song.” “Should we feel bad?” Molly crossed her arms. 
“Do you feel bad when a bank has a nice lobby?” “We’re not stealing from a bank.” “No, we’re stealing from children, which is better. They don’t appreciate craftsmanship.” Bill stared at her. “You’re impossible.” “And you’re late. Put on your fake mustache.” “This looks like a caterpillar glued to my lip.” “Perfect. You look prosperous.” “I look like a man who’s been rejected by a puppet show.” “That, too, is prosperous.” Bill lowered his voice. “What’s the actual first step?” “We enter during the gift-opening lull, compliment the mother, locate the mountain of presents, and quietly transfer them to the getaway van.” “You mean your van.” “I’m the visionary. Visionaries own vans.” Bill shifted. “And if the kids notice the presents disappearing?” “Then you distract them.” “How?” Molly grinned. “Magic trick.” “I don’t know any magic tricks.” “You know one.” “I do?” “You can make a balloon animal.” “I can barely make a balloon apology.” “That’ll do. Children love incompetence in moderation.” Bill sighed, then straightened his tie. “All right. For the record, this is a terrible idea.” Molly peeked over the hedge again. “For the record, your standards are adorable. The coast is clear. Ready?” “No.” “Wonderful. Come on, Petterson. Time to rob the future.” They slipped through the side gate just as a burst of laughter rose from the backyard, Molly with her cake, Bill with his caterpillar mustache, both of them smiling like they had been invited, and for a terrible second it almost looked as if they had.
created_at: "2026-03-18T11:17:25.741Z" (3/18/2026, 11:17:25 AM)
model: "openai/gpt-5.4-mini-20260317"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 2615
moderation_latency: (null)
generation_time: 4879
tokens_prompt: 77
tokens_completion: 1220
native_tokens_prompt: 66
native_tokens_completion: 855
native_tokens_completion_images: 0
native_tokens_reasoning: 37
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.003897
router: (null)
provider_responses:
  0:
    endpoint_id: "9ee065b2-3d1c-43bc-bdd7-28af3b148282"
    id: "resp_032617d0df5b9ce70169ba89c5d1ac8193b825d799ce06c6c2"
    is_byok: false
    latency: 212
    model_permaslug: "openai/gpt-5.4-mini-20260317"
    provider_name: "OpenAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1773832645-vGxMtcFLE7EDoqachDOW"
upstream_id: "resp_032617d0df5b9ce70169ba89c5d1ac8193b825d799ce06c6c2"
total_cost: 0.003897
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false; words: 577
8.54% | Dialogue to Total Word Ratio | Ratio: 77.02%, Deviation: 7.02% | neededClean: false; wordsTotal: 583; wordsDialogue: 449
4.2683% | Overall
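The reported ratio and deviation can be reproduced from the word counts in the evaluation details. The sketch below is illustrative only and is not the evaluator's actual implementation: the function names are hypothetical, dialogue is assumed to be any text inside curly double quotes, and words are assumed to be whitespace-separated. The two evaluators report different totals for the same response (577 and 583), so the real tokenization evidently differs from this one. The overall figure is assumed here to be the plain mean of the two evaluator scores, which matches the displayed 4.2683% only up to the rounding of 8.54%. How the 8.54% score is derived from the 7.02% deviation is not shown on the page and is not modeled.

```python
import re

TARGET_RATIO = 0.70  # from the scenario: 70% dialogue


def count_words(text: str) -> int:
    """Count whitespace-separated words (assumed tokenization)."""
    return len(text.split())


def count_dialogue_words(text: str) -> int:
    """Count words inside curly double quotes (assumed dialogue rule)."""
    spans = re.findall(r"“([^”]*)”", text)
    return sum(len(span.split()) for span in spans)


def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Return the share of dialogue words in the whole text."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return words_dialogue / words_total


# Counts copied from the evaluation details of this run.
ratio = dialogue_ratio(449, 583)
deviation = abs(ratio - TARGET_RATIO)

# Assumption: Overall is the unweighted mean of the displayed scores.
overall = (0.00 + 8.54) / 2

print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
print(f"Overall (from rounded scores): {overall:.2f}%")

# Small usage example of the counting helpers on a made-up sentence.
sample = "“Ready?” Molly asked. “No,” Bill said."
print(count_dialogue_words(sample), "of", count_words(sample), "words are dialogue")
```

The response overshoots both targets: it runs to well over 500 words, and its dialogue share lands about seven points above the requested 70%.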