NC Bench
Run: 2026-03-18-01KM0A653BZ6WN7EPD3WD5C2QN
Run Details
Model: GPT-5.4 Mini
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 7.7s
Cost: $0.005850
Provider: OpenAI
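The cost above can be put in context with the token counts and generation time listed in the response metadata for this run. The following is a minimal sketch, assuming `generation_time` is in milliseconds and that `native_tokens_prompt` and `native_tokens_completion` are the billed counts; the helper names are hypothetical and not part of NC Bench.

```python
# Values copied from this run's response metadata.
generation_time_ms = 7514        # assumption: the unit is milliseconds
native_tokens_prompt = 66
native_tokens_completion = 1289
total_cost_usd = 0.00585


def tokens_per_second(tokens: int, elapsed_ms: float) -> float:
    """Average generation throughput over the elapsed time."""
    return tokens / (elapsed_ms / 1000.0)


def cost_per_1k_tokens(cost_usd: float, tokens: int) -> float:
    """Blended cost per 1,000 tokens (prompt and completion combined)."""
    return cost_usd / tokens * 1000.0


throughput = tokens_per_second(native_tokens_completion, generation_time_ms)
blended = cost_per_1k_tokens(
    total_cost_usd, native_tokens_prompt + native_tokens_completion
)

print(f"{throughput:.1f} completion tokens/s")
print(f"${blended:.5f} per 1k native tokens (blended)")
```

The blended figure mixes prompt and completion tokens, so it is only a rough per-run indicator and not a published price.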
Evaluation summary
Matches word count: 0.00%
Dialogue to Total Word Ratio: 0.13%
Overall: 0.0671%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
“Tell me again why we’re doing this,” Molly said, peering through the cracked blinds at the rainbow-colored house across the street. “Because,” Bill whispered, adjusting the cheap fake mustache he had bought from a joke shop, “children have the best loot.” “Loot?” Molly said. “They’re six.” “Exactly,” Bill said. “They don’t know value. That makes them dangerous to capitalism and adorable to burglars.” Molly snorted. “You said this was a simple score. In and out. Quick as a wink.” “It is a simple score.” “Then why are you carrying a grappling hook?” Bill looked down at the grappling hook clutched in both hands. “Because I want options.” “Options for what? The party is in the back garden.” “Still,” Bill said, “a professional does not arrive empty-handed.” “A professional also does not wear clown shoes.” “These are stealth clown shoes.” “Bill, they squeak.” “They squeak quietly.” “They squeak in three different pitches.” Bill leaned closer to the window. “Listen. There’s a bouncy castle. There’s a magician. There’s a table full of presents. We take the presents, we vanish, and by tea time we are legends.” “We are not legends,” Molly said. “We’re two thieves dressed like escaped lawn gnomes.” “Only because you insisted on green,” Bill said. “I insisted on dark green. You bought lime.” “It blends with shrubbery.” “In a room full of seven-year-olds?” Bill frowned. “Fair.” A burst of shrieking laughter rose from the yard. A child in a pirate hat ran past the window carrying a slice of cake the size of his face. Molly narrowed her eyes. “Okay. New question. How many presents are we actually talking about?” Bill counted on his fingers. “Twelve visible. Maybe more in the hallway. The big one has a bicycle on it.” “A bicycle,” Molly repeated softly. “A very expensive bicycle,” Bill said. “You can tell by the smug ribbon.” Molly folded her arms. “We are not stealing from a child’s party.” Bill stared at her. 
“Molly.” “No.” “Molly.” “No.” “Molly, think of the efficiency.” “I am thinking of the optics.” “The what?” “The optics, Bill. The general public does not like ‘mysterious adult removes toys from birthday celebration.’” “They won’t know it was us.” Molly pointed at his mustache. “You are one sneeze away from looking like a lost potato.” Bill straightened. “I’ll have you know this disguise has worked before.” “Name one time.” Bill hesitated. “The miniature golf fundraiser.” “You were banned for crying in the windmill.” “It was very emotional.” Molly sighed. “What’s the plan?” Bill perked up. “Excellent. We wait until the magician does the big rabbit trick. While everyone’s looking left, we go right. You distract the parents.” “How?” “Compliment their canapés.” “I don’t know what a canapé is.” “Then we’re already ahead of the game.” “And you?” “I’ll approach the gifts with confidence.” “That’s not a plan.” “It is if you say it with confidence.” Molly rubbed her temple. “How do we carry twelve presents without attracting attention?” Bill produced two enormous Santa sacks from behind the hedge. Molly blinked. “You brought sacks.” “Of course I brought sacks.” “Why are they embroidered?” Bill looked defensive. “One was on sale.” “They say ‘Naughty Acquisition Unit.’” “It’s a niche market.” Molly laughed, then tried to stop. “You are impossible.” “And yet,” Bill whispered, “you’re still here.” She looked back at the house, where a little girl in a tiara was now trying to feed frosting to the magician’s rabbit. Molly exhaled. “Fine. We do it your way.” Bill grinned. “Splendid. You’ll see. Clean, elegant, nearly criminal.” “Nearly?” Bill adjusted his mustache again. “There may be one issue.” “What issue?” He pointed toward the gate. “The birthday boy’s father is a retired detective.” Molly turned slowly. “You knew that?” “I may have heard a rumor.” “You knew that and you brought clown shoes.” Bill gave a helpless shrug. 
“I thought they’d make us memorable.” Molly stared at him, then at the house, where the detective father was currently inflating balloons with the grim focus of a man who had survived three divorces and one particularly ugly goose. “Bill,” she said quietly, “if we go in there, and we get caught, I’m telling everyone this was your idea.” Bill nodded solemnly. “That’s fair.” “And if we get away?” “Then,” Bill said, lifting his sack with theatrical pride, “we split the spoils and deny everything.” Molly reached into her coat, pulled out a pair of sunglasses, and put them on. “All right.” Bill whispered, “You’re serious now.” “No,” she said, “but I look expensive, and that usually helps.” He gave her a quick, delighted grin. “That’s my partner.” “Don’t make this sentimental.” “Never.” From inside the party came the magician’s voice: “And now, for my final trick!” Molly checked the house, then the sacks, then Bill’s absurd mustache. “After you,” she said. Bill bowed, stepped forward, and immediately tripped over a garden gnome. “Professional,” Molly muttered. “Still counts!” Bill hissed from the flowerbed. They crouched together in the hedges, while inside the children screamed with joy and the magician apparently set off the smoke machine. Molly took a breath. “Bill?” “Yes?” “If there’s one single llama involved, I’m leaving you.” Bill nodded gravely. “Understood.” Then, with the solemnity of saints and the judgment of fools, they slipped through the open gate and into the party, aiming for the presents, the bicycle, and a retirement plan no respectable adult would ever admit to.
created_at
"2026-03-18T11:09:40.854Z"
3/18/2026, 11:09:40 AM
model
"openai/gpt-5.4-mini-20260317"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
408
moderation_latency
(null)
generation_time
7514
tokens_prompt
77
tokens_completion
1715
native_tokens_prompt
66
native_tokens_completion
1289
native_tokens_completion_images
0
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.00585
router
(null)
provider_responses
0
endpoint_id
"9ee065b2-3d1c-43bc-bdd7-28af3b148282"
id
"resp_0029e8801b4b7c2a0169ba87f4e86881909b961205fe976ba3"
is_byok
false
latency
174
model_permaslug
"openai/gpt-5.4-mini-20260317"
provider_name
"OpenAI"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer
(null)
api_type
"completions"
id
"gen-1773832180-n1wF6GLH1DfZBujJuJIh"
upstream_id
"resp_0029e8801b4b7c2a0169ba87f4e86881909b961205fe976ba3"
total_cost
0.00585
cache_discount
(null)
upstream_inference_cost
0
provider_name
"OpenAI"
Evaluation details
Evaluator: Matches word count
Result: 0.00%
Details: n/a
Meta Data: neededClean: false, words: 902

Evaluator: Dialogue to Total Word Ratio
Result: 0.13%
Details: Ratio: 58.50%, Deviation: 11.50%
Meta Data: neededClean: false, wordsTotal: 906, wordsDialogue: 530

Overall: 0.0671%
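The ratio and deviation reported by the Dialogue to Total Word Ratio evaluator can be reproduced from its `wordsDialogue` and `wordsTotal` values and the 70% target in the scenario. The sketch below is illustrative only: the function names are hypothetical, the counting rules (whitespace-separated words, dialogue delimited by curly quotes) are assumptions, and the evaluator's real tokenization and scoring function are not shown on this page. The two evaluators report 902 and 906 total words for the same response, which suggests their counting rules differ slightly.

```python
import re

TARGET_DIALOGUE_PCT = 70.0  # taken from the scenario: 70% dialogue


def count_words(text: str) -> int:
    """Count whitespace-separated words (an assumed counting rule)."""
    return len(text.split())


def count_dialogue_words(text: str) -> int:
    """Count words inside curly double quotes (an assumed dialogue rule)."""
    spans = re.findall(r"“([^”]*)”", text)
    return sum(count_words(span) for span in spans)


def ratio_and_deviation(words_dialogue: int, words_total: int,
                        target_pct: float = TARGET_DIALOGUE_PCT):
    """Return the dialogue share and its absolute distance from the target."""
    ratio_pct = 100.0 * words_dialogue / words_total
    return ratio_pct, abs(target_pct - ratio_pct)


# Values copied from the evaluator metadata for this run.
ratio_pct, deviation_pct = ratio_and_deviation(530, 906)
print(f"Ratio: {ratio_pct:.2f}%, Deviation: {deviation_pct:.2f}%")
# prints "Ratio: 58.50%, Deviation: 11.50%"

# Applying the assumed counting rules to a short excerpt of the response.
sample = "“Loot?” Molly said. “They’re six.”"
sample_dialogue = count_dialogue_words(sample)
sample_total = count_words(sample)
```

For the excerpt, three of the five words fall inside quotation marks. How the evaluator converts an 11.50% deviation into the 0.13% score is not documented here, so the sketch stops at the deviation.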