NC Bench
Run: 2026-03-09-01KKACBF96EESP4PRJ891B7YMY
Run Details
Model: GPT-5.4 (Reasoning, Low)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 24.0s
Cost: $0.014445
Provider: OpenAI
Evaluation summary
Score | Evaluator
0.00% | Matches word count
94.37% | Dialogue to Total Word Ratio
47.1831% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Rain tapped the bakery awning like impatient fingers while Molly and Bill hunched over a sticky café table, studying a crayon map of Cedar Grove Community Hall. The map had been drawn by Bill on the back of a takeout menu and improved, in his opinion, by the addition of three arrows, a skull, and a cupcake.
“This,” said Molly, tapping the menu with a spoon, “is not a map. This is what happens when a panic attack borrows a red marker.”
“It’s conceptual,” Bill said. “Modern criminals use abstraction.”
“Modern criminals also spell ‘emergency exit’ correctly.”
He leaned in, squinting. “That says ‘surprise hatch.’ More festive.”
Outside, a child in a paper crown splashed through a puddle. Molly watched the crown bob away and lowered her voice. The party was in forty minutes. Inside the hall, according to Bill’s source—a divorced balloon artist with loose ethics and a tighter alimony schedule—there would be fourteen children, six distracted adults, one clown, and a mountain of presents stacked beneath a rented cardboard castle.
“The key variables,” Molly said, “are noise, speed, and whether you can go ten minutes without improvising.”
“I improvise beautifully.”
“You once improvised us into a dog show.”
“And we left with six silver bowls.”
“Because a poodle liked your shoelace.”
Bill sat back smugly. “Charisma is a tool.”
Molly unfolded the real floor plan, stolen from the council website and smelling faintly of printer ink and determination. The hall had a side entrance by the bins, a kitchen corridor, and double doors opening into the party room. The presents would be near the cake table. Their van waited in the alley, painted with the words TIDY TERRY’S DRAIN RESCUE, because no one looked closely at plumbing.
“We go in as entertainers,” Bill said. “Simple.”
“You are not an entertainer.”
“I can juggle.”
“You can drop things in sequence.”
He ignored that. “Parents expect chaos. A little confusion is camouflage. While they’re singing, we sweep the presents into contractor sacks, out the side door, into the van, gone before the clown finishes inflating his first dachshund.”
Molly considered him over the rim of her teacup. Bill had the dangerous confidence of a man who believed a hat improved every plan. Tonight’s hat was a purple bowler he’d found in a taxi.
“We are not wearing disguises that can be described by sobbing witnesses,” she said.
“They’ll be delighted. We’ll say we’re from Captain Starburst’s Surprise Crew.”
“That sounds like a lawsuit.”
“It sounds like brand recognition.”
Molly slid two laminated badges across the table. “We’re from Happy Hopper Mascot Maintenance. Sensible, boring, invisible. You carry the toolkit. I carry the gift log. If anyone stops us, we’re checking the rabbit costume for ventilation issues.”
Bill read his badge. “Mascot Maintenance? Molly, this is heartbreak in lanyard form.”
“It’s access.”
He brightened suddenly. “What if we weaponize the cake?”
“No.”
“Just a little topple—”
“No.”
“A dramatic frosting event could create a corridor.”
She leaned forward. “Listen carefully. No collapsing pastry, no smoke pellets, no trained pigeons, and absolutely no pretending to be a magician.”
He looked wounded. “One time.”
“One time you vanished our getaway keys.”
A grin twitched at the corner of his mouth. “All right. Quiet and clean. In, lift, out.”
“Exactly.”
He drummed his fingers, then pointed to the map. “What about the clown?”
Molly smiled, slow and foxlike. From her bag she produced a clown nose, red as a warning light. “According to our balloon artist, he’s vain, territorial, and cannot resist a professional critique.”
Bill stared. “You’re going to distract a clown by insulting his balloon poodle technique?”
“I’m going to devastate him.”
For a moment Bill was silent, reverent. Then he raised his coffee in salute. “Molly,” he said, “sometimes I forget you’re the scariest person I know.”
“Drink up,” she said, standing. “We’ve got a birthday to ruin politely.”
created_at: "2026-03-09T22:44:14.767Z" (3/9/2026, 10:44:14 PM)
model: "openai/gpt-5.4-20260305"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1573
moderation_latency: (null)
generation_time: 23952
tokens_prompt: 77
tokens_completion: 1147
native_tokens_prompt: 66
native_tokens_completion: 952
native_tokens_completion_images: 0
native_tokens_reasoning: 59
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.014445
router: (null)
provider_responses:
  0:
    endpoint_id: "9ff5625c-403f-4d7f-b895-58ac7295062c"
    id: "resp_0b981406f90266890169af4d3ed8288197878ef4f915cce25a"
    is_byok: false
    latency: 161
    model_permaslug: "openai/gpt-5.4-20260305"
    provider_name: "OpenAI"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1773096254-RltNb2CoFpXNLpGANESW"
upstream_id: "resp_0b981406f90266890169af4d3ed8288197878ef4f915cce25a"
total_cost: 0.014445
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
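The timing fields in the metadata can be cross-checked against the run details. The sketch below assumes that generation_time is reported in milliseconds. The page does not state this, but the value 23952 lines up with the reported Duration of 24.0s. The throughput figure is only illustrative, since the metadata does not say whether native_tokens_completion includes the reasoning tokens.

```python
# Values copied from the run metadata above.
generation_time_ms = 23952       # assumption: unit is milliseconds
native_tokens_completion = 952   # provider-side completion token count

# Convert to seconds and compare with the "Duration: 24.0s" run detail.
duration_s = generation_time_ms / 1000

# Rough generation throughput in provider-side completion tokens per second.
tokens_per_second = native_tokens_completion / duration_s

print(f"duration: {duration_s:.1f}s")
print(f"throughput: {tokens_per_second:.1f} native completion tokens/s")
```

Under that assumption the duration rounds to 24.0s, which agrees with the run details.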
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 650
94.37% | Dialogue to Total Word Ratio | Ratio: 48.92%, Deviation: 1.08% | neededClean: false, wordsTotal: 650, wordsDialogue: 318
47.1831% | Overall
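The reported ratio and deviation can be reproduced from the word counts. The sketch below is a check of the arithmetic, not the evaluator's actual code. It assumes the deviation is the absolute distance from the 50% target and that the overall score is the unweighted mean of the two evaluator scores. Neither assumption is stated on the page. The page also does not show how a 1.08% deviation maps to the 94.37% score, so that step is not reproduced here.

```python
# Values copied from the evaluation details above.
words_total = 650
words_dialogue = 318
target_ratio = 0.50  # the scenario asks for 50% dialogue

# Share of dialogue words and its distance from the target (assumed absolute).
ratio = words_dialogue / words_total
deviation = abs(target_ratio - ratio)

# Evaluator scores in percent, as displayed (the second is rounded on the page).
scores = {
    "Matches word count": 0.00,
    "Dialogue to Total Word Ratio": 94.37,
}

# Assumption: the overall score is the unweighted mean of the evaluator scores.
overall = sum(scores.values()) / len(scores)

print(f"ratio: {ratio:.2%}")
print(f"deviation: {deviation:.2%}")
print(f"overall (mean of displayed scores): {overall:.3f}%")
```

The ratio and deviation match the reported 48.92% and 1.08%. The mean of the displayed scores falls within rounding distance of the reported 47.1831%, which is consistent with an unweighted average taken before the ratio score was rounded for display.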