NC Bench
Run: 2026-03-09-01KKAC96J1GRCX4757H337BTCB
Run Details
Model:
GPT-5.4 (Reasoning, Low)
Test:
Dialogue tags
Scenario:
Write 500 words with 50% dialogue
Duration:
31.3s
Cost:
$0.017730
Provider:
OpenAI
Evaluation summary
Score | Evaluator | Details Link
0.00% | Matches word count | (details)
0.27% | Dialogue to Total Word Ratio | (details)
0.1344% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly and Bill crouched behind the garden shed like two professionals in a field that had absolutely no professional standards. Beyond the fence, a suburban lawn blazed with balloons, bunting, and the shrill confidence of six-year-olds who had never once considered consequences. A clown on stilts drifted past the hedge like a brightly colored nightmare. Somewhere near the patio, a bubble machine wheezed with the desperation of an aging accomplice. Bill adjusted his flat cap and peered through a knot-hole in the fence. “I’m just saying, this is low even for us.” “It’s not low,” said Molly, unfolding a hand-drawn map on the lid of a biscuit tin. “It’s efficient. Tiny people receive luxury goods in volume. Tiny people lack object permanence. This is practically victimless.” “They’ll notice the bike’s gone.” “Only if we take the bike.” “We are taking the bike.” Molly tapped the map. “Focus. Present table by the trampoline. Escape route through side gate, across the alley, into Mrs. Greeley’s compost corridor.” “Compost corridor sounds less like an escape route and more like a medieval punishment.” “It’s cover. No one willingly follows a thief into fermented cabbage.” Bill looked again. “There’s a magician now. That complicates things.” “Why?” “They’re territorial.” On the lawn, children screamed as a woman dressed as a fairy supervised a sack race with the steely smile of someone one inconvenience away from a headline. The presents sat under a striped gazebo: towers of glossy bags, boxes with curled ribbons, one enormous parcel wrapped in dinosaur paper. Bill whistled softly. “That’s a respectable haul.” Molly grinned. “Exactly. We go in disguised.” “As what?” “As entertainers.” Bill turned to her slowly. “Molly, last time you said that, I spent forty minutes making balloon animals for a christening and got bitten by a swan.” “You made one balloon ferret and three things that insulted God. 
This is different.” From the tin she produced two costumes: a pirate hat with sequins glued on badly and a dragon suit missing one eye. Bill stared. “I’m not wearing the dragon.” “You’re wearing the dragon.” “Why can’t I be the pirate?” “Because you have the face of a dragon who owes money.” He sighed and held up the costume. “Children can smell fear.” “Then marinate.” Bill wriggled into the green fabric while Molly pinned on the pirate hat and drew a moustache on her finger to compare options. “Remember,” she said, “confidence. We belong here.” “No one belongs here,” Bill muttered. “There’s a chocolate fountain next to a bouncy castle. This is what happens when adults stop saying no.” They slipped through the side gate and onto the lawn just as the clown folded himself into a deck chair like a tired spider. A little boy spotted Bill and gasped. “Dragon!” he shouted. Bill spread his arms. “Rawr,” he said, with all the menace of a man asking for directions. The children swarmed him at once. “Excellent,” Molly murmured. “Distraction phase.” “This wasn’t the plan,” Bill hissed through the dragon’s mesh mouth. “It is now.” A girl in a tiara tugged Molly’s sleeve. “Are you the treasure pirate?” Molly dipped into a bow. “Aye. Captain Molly Cutlass, seeker of hidden riches.” The girl pointed to the gazebo. “Treasure’s over there.” Molly blinked. “Children are idiots,” she whispered. “Useful idiots,” Bill said, as three boys attempted to ride him. Molly steered toward the presents with theatrical swagger. “Avast,” she cried. “By ancient party law, all gifts must be inspected for curse—” A mother stepped into her path, holding a tray of mini pizzas. “Oh good, more entertainment. Can you start the treasure hunt?” Molly smiled the smile of a woman balancing disaster on a teaspoon. “Absolutely. We were just… assessing treasure density.” Bill stumbled over, dragon tail caught under a scooter. “I found the loot,” he said. “Wonderful,” said the mother. 
“Hide the prizes around the garden.” Molly and Bill stared at her. “Prizes?” Bill said. She thrust a cardboard box into his claws. Inside were envelopes, toy cars, stickers, and what looked suspiciously like cash folded into a card. Molly recovered first. “Of course. Standard dragon-pirate protocol.” Ten frantic minutes later they were sneaking around bushes, burying party favors beneath flowerpots and behind lawn ornaments while children followed clues Molly invented in a panic. “Left of the gnome!” she cried. “Under the seat of the chair of destiny!” Bill crouched behind the barbecue and found three gift cards in a birthday card shaped like a monster truck. “Molly,” he whispered, “I think the real treasure was inside the treasure.” She slid beside him, eyes lighting up. “Selective theft. Elegant.” “Can we leave now?” A roar went up from the lawn. The birthday boy had opened the enormous dinosaur parcel and discovered a battery-powered jeep. Bill looked at the side gate, then at the jeep. “Now I know you said no bike.” Molly tucked the gift cards into her sleeve. “That,” she said, “is not a bike.” They grinned at each other, already revising the plan.
created_at
"2026-03-09T22:43:00.302Z"
3/9/2026, 10:43:00 PM
model
"openai/gpt-5.4-20260305"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
3829
moderation_latency
(null)
generation_time
31167
tokens_prompt
77
tokens_completion
1589
native_tokens_prompt
66
native_tokens_completion
1171
native_tokens_completion_images
0
native_tokens_reasoning
32
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"completed"
usage
0.01773
router
(null)
provider_responses
0
endpoint_id
"9ff5625c-403f-4d7f-b895-58ac7295062c"
id
"resp_059195fc32d9cb730169af4cf468048190a83a212b5be4e5fa"
is_byok
false
latency
235
model_permaslug
"openai/gpt-5.4-20260305"
provider_name
"OpenAI"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer
(null)
api_type
"completions"
id
"gen-1773096180-U40Cit4X38inAmQGaPNx"
upstream_id
"resp_059195fc32d9cb730169af4cf468048190a83a212b5be4e5fa"
total_cost
0.01773
cache_discount
(null)
upstream_inference_cost
0
provider_name
"OpenAI"
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 835
0.27% | Dialogue to Total Word Ratio | Ratio: 39.12%, Deviation: 10.88% | neededClean: false, wordsTotal: 841, wordsDialogue: 329
0.1344% | Overall
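The ratio and deviation reported above can be checked by hand from the two word counts. The following is a minimal sketch, not the benchmark's own evaluator: the function names are hypothetical, the 50% target is taken from the scenario description, and the dialogue counter assumes that dialogue is whatever sits between curly quotation marks. How the evaluator turns the deviation into the 0.27% score is not stated on this page, so that step is left out.

```python
import re


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (assumed definition of a word)."""
    return len(text.split())


def count_dialogue_words(text: str) -> int:
    """Count words inside curly double quotes (assumed dialogue marker)."""
    spans = re.findall(r"“([^”]*)”", text)
    return sum(count_words(span) for span in spans)


def dialogue_ratio_pct(words_dialogue: int, words_total: int) -> float:
    """Dialogue words as a percentage of all words."""
    if words_total == 0:
        return 0.0
    return 100.0 * words_dialogue / words_total


def deviation_pct(ratio_pct: float, target_pct: float = 50.0) -> float:
    """Absolute distance from the target ratio, in percentage points."""
    return abs(target_pct - ratio_pct)


# Counts copied from the evaluation details: wordsDialogue and wordsTotal.
ratio = dialogue_ratio_pct(329, 841)
deviation = deviation_pct(ratio)
summary = f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%"
print(summary)  # Ratio: 39.12%, Deviation: 10.88%

# A short excerpt from the response, used only to exercise the counters.
sample = "Bill whistled softly. “That’s a respectable haul.” Molly grinned. “Exactly.”"
print(count_dialogue_words(sample), count_words(sample))  # 5 10
```

The two evaluators report different totals (835 and 841 words), so they presumably tokenize the text differently; the whitespace split used here is not guaranteed to reproduce either figure.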