NC Bench
Run: 2026-04-27-01KQ77KT3SY1F5RZTTJQNXN7K0
Run Details
Model:
Qwen3.6 Max Preview
Test:
Dialogue tags
Scenario:
Write 500 words with 70% dialogue
Duration:
4m 9s
Cost:
$0.065580
Provider:
Alibaba
Evaluation summary
Score | Evaluator
0.00% | Matches word count
0.00% | Dialogue to Total Word Ratio
0.0000% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
"Check the blueprints," Molly whispered, slapping a crayon drawing onto the table. Molly adjusted her tactical eyepatch, adorned with a stubborn glittery star sticker. She traced a grease-stained finger over the waxy schematic, brow furrowed like a general besieging a sandwich factory, while Bill meticulously sharpened a candy cane to a lethal point. Bill squinted. "Is that a bouncy castle or a tactical inflation device?" "It's the perimeter defense, Bill. Little Timmy's sixth birthday is a fortress. We go in during the cake distraction." "Cake distraction?" "Sugar crash window is four minutes. Parents wipe frosting off the dog. We hit the gift table." "The gift table is guarded." "By whom? The clown?" "Worse. Grandma. She's got eyes like a hawk and a purse full of hard candies she throws like shrapnel." Molly groaned. "I hate the elderly demographic. Unpredictable reach. What about the piñata?" "Decoy. Filled with organic raisins. The kids will riot, drawing security away from the LEGO Death Star." "The Death Star is the package?" "Four thousand pieces. Street value is astronomical. Plus, a remote-controlled dinosaur that bites." "I want the dinosaur." "Focus, Molly! We're professionals. We don't keep the loot. We fence the loot." "I can fence a biting dinosaur to my nephew." "Your nephew is three. He'll lose a finger." "He has nine left. Risk assessment says go. What's the extraction route?" "The ball pit." "The ball pit? That's a biohazard zone. Last time I dove in, I found a half-eaten juice box and a lost soul." "Connects to the laundry chute. We slide out, land in the minivan, and vanish before the magician pulls the rabbit." "Wait. The magician. Is he union?" "Freelance. But he's got a wand that shoots sparks. Could be a concealed weapon." "Great. Grandma's candy shrapnel, a raisin riot, a spark-wielding wizard, and a biohazard extraction. All for plastic bricks and a biting lizard." "And the goody bags." "The goody bags?"
"Filled with glow sticks and temporary tattoos. High margin, low weight." "You had me at glow sticks. I love glowing in the dark. Makes me feel stealthy." "You glow, you're not stealthy, Molly." "Details. What's the dress code?" "We're the entertainment. I'm booking us as 'The Balloon Twisters of Doom.'" "I can't twist balloons. I only twist ankles." "You'll learn. Pop them aggressively. Kids love destruction. Just don't pop the bouncy castle. That's our cover." "If things go south, I'm using the dinosaur as a shield." "Agreed. Gear up. Whoopee cushions for silent movement and a juice box garrote." "Juice box garrote? Really?" "Apple flavor. It stings the eyes. Let's roll. Little Timmy won't know what hit him." "Timmy's going to cry." "Timmy's getting a life lesson about asset security. Now, practice your balloon dog. You look constipated."
created_at
"2026-04-27T10:26:29.119Z"
4/27/2026, 10:26:29 AM
model
"qwen/qwen3.6-max-preview-20260420"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1349
moderation_latency
(null)
generation_time
248919
tokens_prompt
77
tokens_completion
8214
native_tokens_prompt
76
native_tokens_completion
8395
native_tokens_completion_images
(null)
native_tokens_reasoning
7679
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.0655798
router
(null)
provider_responses
0
endpoint_id
"ca7fc0c7-f23d-4ac5-b2e0-945be5b61787"
id
"chatcmpl-96f3837d-cbc3-9642-b548-0b9d5e778dd7"
is_byok
false
latency
1349
model_permaslug
"qwen/qwen3.6-max-preview-20260420"
provider_name
"Alibaba"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1777285589-dv8vjdu56MxwAUMdJrGX"
session_id
(null)
api_type
"completions"
id
"gen-1777285589-Nq4eOS06T16612f9olIK"
upstream_id
"chatcmpl-96f3837d-cbc3-9642-b548-0b9d5e778dd7"
total_cost
0.0655798
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Alibaba"
response_cache_source_id
(null)
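The token and timing fields above can be cross-checked with a little arithmetic. The following is a minimal sketch, not the site's own calculation: it assumes that `native_tokens_reasoning` is counted inside `native_tokens_completion`, and that the displayed Duration is `generation_time` (in milliseconds) rounded to whole seconds. Both are assumptions that happen to agree with the values on this page.

```python
# Values copied from the Meta fields of this run.
native_tokens_completion = 8395
native_tokens_reasoning = 7679
generation_time_ms = 248919

# Assumption: reasoning tokens are a subset of completion tokens,
# so the remainder is the visible response.
visible_tokens = native_tokens_completion - native_tokens_reasoning
reasoning_share = native_tokens_reasoning / native_tokens_completion

# Assumption: Duration is generation_time rounded to the nearest second.
minutes, seconds = divmod(round(generation_time_ms / 1000), 60)

print(f"visible tokens: {visible_tokens}")
print(f"reasoning share: {reasoning_share:.1%}")
print(f"duration: {minutes}m {seconds}s")
```

Under these assumptions, most of the completion budget for this run went to reasoning rather than to the scene itself.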
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 458
0.00% | Dialogue to Total Word Ratio | Ratio: 88.10%, Deviation: 18.10% | neededClean: false, wordsTotal: 462, wordsDialogue: 407
0.0000% | Overall
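The reported ratio and deviation follow directly from the word counts in the Meta Data. The sketch below reproduces them, assuming the deviation is the absolute difference between the measured ratio and the 70% target from the scenario. The function name and the use of an absolute value are illustrative assumptions, and the mapping from deviation to the 0.00% score is not shown on this page, so it is not modeled here.

```python
TARGET_RATIO_PCT = 70.0  # from the scenario: "70% dialogue"


def dialogue_ratio_pct(words_dialogue: int, words_total: int) -> float:
    """Return dialogue words as a percentage of total words."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return 100.0 * words_dialogue / words_total


# Values copied from the evaluator's Meta Data for this run.
ratio = dialogue_ratio_pct(words_dialogue=407, words_total=462)

# Assumption: deviation is the absolute distance from the target.
deviation = abs(ratio - TARGET_RATIO_PCT)

print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")
```

With these inputs the sketch prints the same figures as the Details column: a ratio of 88.10% and a deviation of 18.10 percentage points.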