NC Bench
Run: 2026-04-27-01KQ77ND2YFA478ZVER6WEED65
Run Details
Model: Qwen 3.6 Flash
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 43.4s
Cost: $0.012570
Provider: Alibaba
Evaluation summary
Matches word count: 0.00%
Dialogue to Total Word Ratio: 6.21%
Overall: 3.1029%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The damp moss of the neighbor's yard seeped through Molly's tactical pants, ruining the silhouette of her stealth suit. She checked her watch, then the bouncy castle, then Bill, who was currently entangled in a roll of duct tape he insisted was a grappling hook prototype. "Bill, you are vibrating," Molly said, her voice low and strained. "Stop vibrating. You're going to set off the motion sensor alarm, and I don't want to explain to the birthday mother why her inflatable castle is shaking like a epileptic poodle." Bill finally extricated his left arm and grinned, revealing a molar chipped from a failed burglary at a bakery. "It's the anticipation, Molly! The adrenaline is flooding my system like cheap champagne. Besides, the Stare of Doom 3000 is prepped, charged, and ready to terrorize. Once I unleash its psychic wail of judgment, the adults will weep and offer up all their gifts without question." Molly reached into her utility belt and pulled out a bag of glitter bombs, adjusting the tension springs with the precision of a bomb defusal expert. The kitchen window pulsed with the warm, yellow glow of holiday cheer, a beacon of happiness that Molly had sworn to extinguish with the darkness of avarice. "Stare of Doom? Bill, that is a plush unicorn with a squeaker taped to its nose. Last week you tried to bluff your way past a golden retriever because you wore a paper bag on your head. The dog just licked the bag off your face while you were trying to mime being a statue." Molly rolled her eyes so hard she feared permanent detachment. "You look ridiculous. The plan relies on stealth, not cheap theater. We slip in when the cake is cut. The sugar rush blind spots are at three-minute intervals. We grab the loot, we leave no trace, and we exit through the doggy door which I hope you did not modify." Bill muttered, wrestling with the unicorn. "The bag-on-head incident was a misunderstanding of canine communication protocols. Anyway, the plan is solid. 
We grab the loot, we leave no trace, we exit through the doggy door which I have enlarged just enough for my hips." Molly checked the wind direction, ensuring the scent of stolen cake would blow away from their escape trajectory. She tested the grip of her gloves, flexing fingers hardened by years of prying open sealed boxes. Every muscle coiled tight, driven by a hunger that had nothing to do with food and everything to do with the thrill of defying the rigid sanctity of children's birthday rules. Bill's knees knocked, threatening to announce their presence, so Molly tapped his shin hard. "You enlarged the doggy door? Bill, that house has a Beagle named Mr. Wiggles who knows how to open latches. If you make the door bigger, Mr. Wiggles will follow the scent of stolen Legos straight to our hideout, and we don't have an escape route for a dog carrying a PS5." Bill paused, the unicorn hovering dangerously close to Molly's ear. "Mr. Wiggles is an asset, not a threat. He answers to cheese. I have cheese. This changes nothing. We proceed. The cake approaches in sixty seconds. The adults will turn to the dessert. Their attention span will drop to zero. It is our window, Molly. Strike while the frosting is fresh." Molly sighed, the sound escaping her like a deflating balloon. She looked at Bill, then at the unicorn, then at the house where children screamed with delight over unboxings that would inevitably fund her retirement. She adjusted her mask, felt the weight of the velvet-lined sacks, and nodded once. "Fine," she whispered. "But if Mr. Wiggles takes a bite out of your ankle, I'm not performing a field amputation. You're on your own against the Beagle. Now get into position. And for the love of profit, hide the squeaker until the moment of chaos." "Squeaker is silent," Bill lied, tapping the tape. "Squeaker is dead." "Squeaker is alive and judging me," Molly shot back. "Let him judge! Judgment builds character! And loot!"
created_at
"2026-04-27T10:27:21.317Z"
4/27/2026, 10:27:21 AM
model
"qwen/qwen3.6-flash"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
626
moderation_latency
(null)
generation_time
43271
tokens_prompt
77
tokens_completion
8126
native_tokens_prompt
76
native_tokens_completion
8367
native_tokens_completion_images
(null)
native_tokens_reasoning
7469
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.0125695
router
(null)
provider_responses
0
endpoint_id
"2435bec9-c808-4f27-9fc8-12f3460d7b66"
id
"chatcmpl-7bc5b1cc-1ce5-9505-8cbf-c36f88ed1aa9"
is_byok
false
latency
626
model_permaslug
"qwen/qwen3.6-flash"
provider_name
"Alibaba"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1777285641-l856dnV5BVrGjIsQyAzI"
session_id
(null)
api_type
"completions"
id
"gen-1777285641-7NHfx1Mn4fofmoXYNEXF"
upstream_id
"chatcmpl-7bc5b1cc-1ce5-9505-8cbf-c36f88ed1aa9"
total_cost
0.0125695
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Alibaba"
response_cache_source_id
(null)
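The token counts above suggest why a scene of roughly 680 words consumed more than 8,000 completion tokens. The sketch below splits the native completion total into reasoning and visible output. It assumes that native_tokens_reasoning is counted inside native_tokens_completion, which the page does not state; the variable names visible_tokens and reasoning_share are illustrative only.

```python
# Values copied from the run metadata above.
native_tokens_completion = 8367
native_tokens_reasoning = 7469

# Assumption (not confirmed by the page): reasoning tokens are a subset
# of the native completion total, so the remainder is the visible answer.
visible_tokens = native_tokens_completion - native_tokens_reasoning
reasoning_share = 100.0 * native_tokens_reasoning / native_tokens_completion

print(f"Visible output tokens: {visible_tokens}")
print(f"Reasoning share of completion: {reasoning_share:.2f}%")
```

Under that assumption, most of the billed completion went to reasoning rather than to the scene itself.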
Evaluation details
Matches word count
Result: 0.00%
Details: n/a
Meta data: neededClean false, words 680
Dialogue to Total Word Ratio
Result: 6.21%
Details: Ratio: 57.46%, Deviation: 7.46%
Meta data: neededClean false, wordsTotal 684, wordsDialogue 393
Overall: 3.1029%
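The reported ratio and deviation can be reproduced from the word counts in the evaluator metadata. The following is a minimal sketch, not the benchmark's own code: the function names are hypothetical, and the 50% target is taken from the scenario description. How the evaluator turns the deviation into the 6.21% score is not shown on the page, so that step is not reproduced here.

```python
def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Return dialogue words as a percentage of all words."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return 100.0 * words_dialogue / words_total


def deviation_from_target(ratio_pct: float, target_pct: float = 50.0) -> float:
    """Return the absolute gap, in percentage points, from the target ratio."""
    return abs(ratio_pct - target_pct)


# Counts copied from the evaluator metadata: wordsDialogue and wordsTotal.
ratio = dialogue_ratio(393, 684)
deviation = deviation_from_target(ratio)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
```

Rounded to two decimals, these values agree with the "Ratio: 57.46%, Deviation: 7.46%" line reported by the evaluator.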