NC Bench
Run: 2026-04-27-01KQ6Z4TCQF3SVND1MKTFBJ1VE
Run Details
Model: Qwen 3.5 Plus (2026-04-20)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 2m 5s
Cost: $0.019890
Provider: Alibaba
Evaluation summary
Matches word count: 68.10%
Dialogue to Total Word Ratio: 72.32%
Overall: 70.2114%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
"Perimeter is hot. Three golden retrievers and a grandmother with a tanning addiction." Bill tapped the crayon map with a lukewarm juice box, his knuckles white as he tried to suppress a giggle at his own reflection in the waxed paper window of the snack bag wrapped around the cardboard. "Retrievers are friendly, but the grandmother uses a laser pointer to guide the snacks. We cannot slip past the laser." Molly adjusted her tactical visor, which was really just a pair of dinosaur sunglasses duct-taped to a bicycle helmet, the elastic strap digging ruthlessly into her hairline while she tried to ignore the sticky residue of a spilled smoothie on her shoulder. "Grandma units are volatile. If she suspects a party crasher, she triggers the timeout protocol. Screaming. Nap time. The whole block goes dark." "I have glitter glue. Blinds them for three hours." "Too risky. Glitter is the herpes of craft supplies. It gets on our fingerprints. The police will trace us by the sheer sparkliness of our DNA. We need a cleaner approach." "You're overthinking this. Look at the loot pile." Bill pointed to a crudely drawn rectangle labeled 'Toys', his finger smudging the purple marker ink as he traced the path toward the bouncy castle. "We go in as entertainment. You're the magician. I'm the balloon artist." "No balloon poodles. Bad for the wrists." "Swords then. Seventy-thirty Lego." "Sixty-forty and I get mystery boxes." "Deal. Gate bell squeals." Molly nodded slowly, her eyes drifting to the rearview mirror where a plastic mask of a superhero hung by its cape, her hand resting on the ignition key that was actually a plastic ring pull from a soda can. "Sock on the bell. Exit during cake smash. Parents panic, we vanish." "Secondary?" "Grape juice boxes. Playground cash." "Take cupcake tin. Hostages get gluten-free, crew gets sugar." "Fine." 
Bill sighed, reaching into the back seat to retrieve the bundle of fabric that resembled a diaper, the modified pouch system promising enough capacity to hold a staggering inventory of plastic junk and high-calorie contraband. "At least I don't have to wear the nose." Molly patted the bulbous red appendage clipped to her belt loop, the foam sphere twitching slightly with the vibration of the van's idle engine. "The nose stays. Psychological dominance. They never suspect the clown with the intense stare. Now check the duffel straps." Bill yanked on the webbing, testing the tension while Molly began to hum the theme song from a popular animated franchise, a low vibration of nervous anticipation filling the cramped cabin. "We also need to account for the twins. They have sticky hands." "I have hand sanitizer. High grade. Military surplus." "Good. We cannot get caught with sticky fingers. It slows down the grab. Straps are solid. Let's grab the cupcakes and go." Molly grinned, the expression cracking the layer of greasepaint she'd already applied to her cheeks. "Unseen threat. Roll."
created_at: "2026-04-27T07:58:29.277Z" (4/27/2026, 7:58:29 AM)
model: "qwen/qwen3.5-plus-20260420"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 652
moderation_latency: (null)
generation_time: 124738
tokens_prompt: 77
tokens_completion: 8256
native_tokens_prompt: 76
native_tokens_completion: 8275
native_tokens_completion_images: (null)
native_tokens_reasoning: 7600
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0198904
router: (null)
provider_responses:
  0:
    endpoint_id: "8b4a0fbb-6705-4f8d-a7f1-ad18ecc0edbc"
    id: "chatcmpl-ff3dc5e6-5534-9819-aea2-da0adadecc53"
    is_byok: false
    latency: 652
    model_permaslug: "qwen/qwen3.5-plus-20260420"
    provider_name: "Alibaba"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777276709-eqiMhyDQ2r6QzoljwiKo"
session_id: (null)
api_type: "completions"
id: "gen-1777276709-ayzrKl05NfZIwnw7RyXc"
upstream_id: "chatcmpl-ff3dc5e6-5534-9819-aea2-da0adadecc53"
total_cost: 0.0198904
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Alibaba"
response_cache_source_id: (null)
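A few derived figures can help when reading the token and timing fields above. The following is a minimal sketch, not part of the benchmark itself. It assumes that `generation_time` is reported in milliseconds (which is consistent with the 2m 5s duration shown in the run details) and that `native_tokens_reasoning` is already included in `native_tokens_completion`. Both are assumptions about the metadata, not confirmed by this page.

```python
# Values copied from the run metadata above.
native_tokens_completion = 8275
native_tokens_reasoning = 7600
generation_time_ms = 124738  # assumption: unit is milliseconds

# Assumption: reasoning tokens are a subset of the native completion tokens,
# so the remainder approximates the tokens spent on the visible answer.
visible_tokens = native_tokens_completion - native_tokens_reasoning

# Share of the completion budget spent on reasoning.
reasoning_share = native_tokens_reasoning / native_tokens_completion

# Average generation throughput over the whole completion.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(f"visible completion tokens: {visible_tokens}")
print(f"reasoning share: {reasoning_share:.1%}")
print(f"throughput: {tokens_per_second:.1f} tokens/s")
```

Under these assumptions, most of the completion tokens in this run went to reasoning rather than to the 500-word scene itself.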
Evaluation details
Matches word count: 68.10%
Details: n/a
Meta Data: neededClean: false; words: 486
Dialogue to Total Word Ratio: 72.32%
Details: Ratio: 52.55%, Deviation: 2.55%
Meta Data: neededClean: false; wordsTotal: 491; wordsDialogue: 258
Overall: 70.2114%
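The ratio and deviation reported above can be reproduced from the word counts in the meta data. The sketch below is illustrative only: the helper names (`count_words`, `dialogue_words`, `ratio_and_deviation`) are hypothetical, the quote-matching rule is an assumption about how dialogue might be detected, and the page does not say how the evaluator tokenizes words or maps the deviation to the 72.32% score. Treating the overall score as a simple mean of the two evaluator scores is likewise an assumption that happens to agree with the displayed value to within rounding.

```python
import re


def count_words(text: str) -> int:
    """Count whitespace-separated words (assumed tokenization)."""
    return len(text.split())


def dialogue_words(text: str) -> int:
    """Count words inside straight double quotes (assumed dialogue rule)."""
    spans = re.findall(r'"([^"]*)"', text)
    return sum(count_words(span) for span in spans)


def ratio_and_deviation(words_dialogue: int, words_total: int, target: float = 0.5):
    """Return the dialogue ratio and its absolute distance from the target."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


# Word counts taken from the evaluation meta data above.
ratio, deviation = ratio_and_deviation(words_dialogue=258, words_total=491)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# Assumption: the overall score is the unweighted mean of the two evaluators.
overall_estimate = (68.10 + 72.32) / 2
print(f"Estimated overall: {overall_estimate:.2f}%")
```

The first print reproduces the "Ratio: 52.55%, Deviation: 2.55%" line from the details. On a short sample such as `"Hi there," she said. "Go."`, `dialogue_words` counts 3 of 5 words as dialogue.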