NC Bench
Run: 2026-03-18-01KM09J69D8VPPGBKHMXE5A4WV
Run Details
Model:
GPT-5.4 Nano (Reasoning)
Test:
Dialogue tags
Scenario:
Write 500 words with 70% dialogue
Duration:
10.8s
Cost:
$0.002042
Provider:
OpenAI
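The cost above can be cross-checked against the token counts in the response metadata (native_tokens_prompt 66, native_tokens_completion 1623). The sketch below is only an illustration: the per-million-token rates are assumptions that do not appear on this page, chosen because they happen to reproduce the reported usage figure, and `estimate_cost` is a hypothetical helper name.

```python
# Assumed rates in USD per million tokens. These are NOT stated on this page;
# they are illustrative values that happen to reproduce the reported usage.
ASSUMED_INPUT_USD_PER_MTOK = 0.20
ASSUMED_OUTPUT_USD_PER_MTOK = 1.25


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate request cost in USD from native token counts (hypothetical helper)."""
    total_micro_usd = (
        prompt_tokens * ASSUMED_INPUT_USD_PER_MTOK
        + completion_tokens * ASSUMED_OUTPUT_USD_PER_MTOK
    )
    return total_micro_usd / 1_000_000


# Token counts taken from the response metadata on this page.
cost = estimate_cost(prompt_tokens=66, completion_tokens=1623)
print(f"${cost:.6f}")
```

Under these assumed rates the estimate agrees with the usage value of 0.00204195 in the metadata, which the page rounds to $0.002042.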
Evaluation summary
Score | Evaluator
0.00% | Matches word count
0.01% | Dialogue to Total Word Ratio
0.0047% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly crouched behind the hedges, peering through a gap in the leaves like she was scouting a war zone, except the battlefield smelled like sugar and panic. Bill wiggled next to her, wearing a cardboard crown that said “PROPERTY OF NO ONE,” which seemed less like stealth and more like a confession. “Bill,” Molly whispered, “why are you wearing the crown. The crown is telling everyone we’re here.” “It’s a stealth crown,” Bill said. “No one suspects a monarch.” Molly stared. “Monarchs are the least stealthy people on earth.” Bill grinned. “Not if they’re undercover. Like… ninjas with taxes.” Molly nudged him with her elbow. “Listen. This is a children’s party. They have… whistles.” “They also have frosting,” Bill said, licking his lips. “I’m willing to risk the whistles for frosting.” Molly adjusted her gloves. “We are not risking anything. We are taking presents, quietly, efficiently, and then leaving before anyone asks us to do face painting.” Bill brightened. “I can do face painting.” Molly blinked. “You can?” “I can paint a face,” Bill said proudly. “Usually my own. Sometimes a wall. But with practice—” “No practice,” Molly hissed. “We are thieves, not entertainment staff.” Bill leaned in. “What if the parents hired us? Like… as a surprise?” Molly looked toward the backyard where balloons bobbed like suspicious jellyfish. “Parents do not hire surprise thieves. Parents hire clowns. Clowns are just thieves with better marketing.” Bill gasped. “So we should become clowns.” Molly sighed. “We should not become clowns.” Bill shrugged. “Okay, then maybe we become… helpful mascots.” Molly’s eyes narrowed. “A mascot is just a clown with a license.” Bill tapped the side of his head. “Then I have a license.” Molly folded her arms. “Where is the license?” Bill patted his jacket. “It’s in my wallet.” Molly’s mouth opened, then closed. “Your wallet is not in your jacket.” Bill paused, then slowly pulled out a wallet that was, in fact, in his jacket. 
It looked battered, and the crown had left a tiny smear of glitter on the corner. Molly pointed. “That’s glitter. Glitter is not a license.” Bill smiled. “It’s a confidence license.” A burst of laughter floated from the house. Someone announced, “Okay, everyone! It’s present time!” Molly flinched. “Bill. They’re about to open them.” Bill’s eyes widened. “We’re early! That means we can snag the presents before they become—” “—before they become loudly owned,” Molly finished. Bill nodded vigorously. “Yes. Loudly owned. Very important distinction.” Molly checked the yard again. “We need a plan.” Bill put a finger to his chin. “Step one: we blend in.” Molly raised a brow. “With what. Your crown?” “With my aura,” Bill said. “My aura says ‘I belong in a coloring book.’” Molly muttered, “Your aura says ‘I should be supervised by adults.’” Bill looked offended. “I am supervised.” Molly glanced at him. “By whom?” Bill pointed upward. “The universe.” Molly opened her mouth, then closed it, as if her brain was trying to escape through her ears. “Fine. Blend in. With words.” Bill brightened. “I have words!” Molly looked hopeful for one miraculous second. “Good.” Bill swallowed. “I’ll say, ‘Who wants to help me carry these?’” Molly groaned. “Bill, ‘help me carry’ is what people say right before they commit crimes in movies.” Bill leaned closer. “Movies are for learning.” Molly hissed, “Movies are for getting arrested. We are not—” A child’s voice rang out, “Do you have balloons?” Bill froze. “Balloons?” Molly winced. “Bill, we do not have balloons.” Bill snapped his fingers like an inventor. “Okay, new plan. We borrow balloons.” Molly stared at him. “Borrowing balloons from children is… worse.” Bill nodded quickly. “Right. Then we borrow them from the balloons.” Molly’s face did something between a blink and a headache. “Bill. You cannot steal from helium.” Bill shrugged. “Not stealing. Negotiating.” Molly leaned in, whispering with the intensity of a secret handshake. 
“We go in, we take the presents, we do it fast, we do not talk to anyone unless necessary, and we do not—” Bill interrupted, eyes sparkling. “—do not get recruited into group activities.” Molly smiled despite herself. “Now you’re learning.” Bill pointed toward the front door. “Okay. You go first. You look like you know what you’re doing.” Molly adjusted her hat. “I do.” Bill looked at her hat, which read “MOLLY’S PARTY HELPERS,” in cheerful font. “Your hat says you’re party help.” Molly exhaled slowly. “It’s called camouflage.” Bill nodded solemnly. “Camouflage. Like a tiger wearing a nametag.” Molly stood, smoothing her sleeves. “If anyone asks, we’re here for the… surprise.” Bill followed her, whispering, “What surprise?” Molly glanced at him. “The surprise that we’re leaving.” Bill gave a thumbs-up. “Love that surprise.”
created_at
"2026-03-18T10:58:46.715Z"
3/18/2026, 10:58:46 AM
model
"openai/gpt-5.4-nano-20260317"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
9133
moderation_latency
(null)
generation_time
10036
tokens_prompt
77
tokens_completion
1855
native_tokens_prompt
66
native_tokens_completion
1623
native_tokens_completion_images
0
native_tokens_reasoning
451
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.00204195
router
(null)
provider_responses
0
endpoint_id
"0c835f2e-c18d-4e8c-b245-e1e3bd08b97f"
id
"resp_0c9ea0a18e7240170169ba8566ca248197a77f2b996351de13"
is_byok
false
latency
225
model_permaslug
"openai/gpt-5.4-nano-20260317"
provider_name
"OpenAI"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer
(null)
api_type
"completions"
id
"gen-1773831526-wrQdXDnxh2C6lqN6AdtC"
upstream_id
"resp_0c9ea0a18e7240170169ba8566ca248197a77f2b996351de13"
total_cost
0.00204195
cache_discount
(null)
upstream_inference_cost
0
provider_name
"OpenAI"
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false; words: 783
0.01% | Dialogue to Total Word Ratio | Ratio: 56.38%, Deviation: 13.62% | neededClean: false; wordsTotal: 784; wordsDialogue: 442
0.0047% | Overall | |
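The ratio and deviation reported for the Dialogue to Total Word Ratio evaluator follow from its wordsDialogue and wordsTotal values and the 70% target in the scenario. The sketch below reproduces that arithmetic. The quote-based word counter is a hypothetical rule, not the evaluator's actual method, which this page does not describe; all function names are illustrative.

```python
import re

TARGET_DIALOGUE_PCT = 70.0  # from the scenario: "70% dialogue"


def count_words(text: str) -> int:
    """Count whitespace-separated tokens."""
    return len(text.split())


def count_dialogue_words(text: str) -> int:
    """Hypothetical rule: treat text between curly double quotes as dialogue."""
    spans = re.findall(r"“([^”]*)”", text)
    return sum(count_words(span) for span in spans)


def ratio_and_deviation(words_dialogue: int, words_total: int,
                        target_pct: float = TARGET_DIALOGUE_PCT):
    """Return (dialogue ratio in percent, absolute deviation from the target)."""
    ratio = 100.0 * words_dialogue / words_total
    return ratio, abs(target_pct - ratio)


# Word counts taken from the evaluator metadata on this page.
ratio, deviation = ratio_and_deviation(words_dialogue=442, words_total=784)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
# prints "Ratio: 56.38%, Deviation: 13.62%"

# Small usage example for the hypothetical counter.
sample = "“Hello there,” she said. “Go.”"
print(count_dialogue_words(sample), count_words(sample))  # prints "3 5"
```

A quote-based rule like this would also count quoted non-speech such as the text on Bill's crown, so it should not be expected to reproduce the evaluator's count of 442 exactly.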