NC Bench
Run: 2026-04-27-01KQ6ZHMAXX0ZM0RRAQFK8KXGF
Run Details
Model: Qwen 3.5 Plus (2026-04-20)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 1m 50s
Cost: $0.016910
Provider: Alibaba
Evaluation summary
Score      Evaluator
0.00%      Matches word count
14.96%     Dialogue to Total Word Ratio
7.4786%    Overall
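The Overall figure is consistent with an unweighted mean of the two evaluator scores. That aggregation rule is an assumption inferred from the numbers on this page, not something the page states, and the helper name `overall_score` is hypothetical. A minimal sketch:

```python
def overall_score(scores):
    """Unweighted mean of evaluator scores (assumed aggregation rule)."""
    return sum(scores) / len(scores)

# Scores as displayed in the summary, in percent.
evaluator_scores = {
    "Matches word count": 0.00,
    "Dialogue to Total Word Ratio": 14.96,
}

overall = overall_score(list(evaluator_scores.values()))
print(f"{overall:.2f}%")
```

The displayed scores are rounded to two decimals, so the mean computed here agrees with the reported 7.4786% only to about two decimal places.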
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
AI
Molly adjusted her fake mustache, which was peeling off due to the humidity and her own nervous sweat. She tapped the blueprints of the backyard fortress known as Timmy's Sixth Birthday Blowout spread across the hood of their rusted sedan. Bill stood nearby, juggling rubber chickens he intended to use as distractions, though one had already squeaked and alerted a passing squirrel. The target was clear: a mountain of shiny boxes wrapped in excessive layers of adhesive tape, located near the bouncy castle. The stakes had never been lower, nor had their professional reputation. They were the best at stealing grand pianos, but debt collectors wanted cash, and a six-year-old's loot was the only currency the loan shark would take. A passing squirrel chattered accusations from the safety of a drainpipe, judging their life choices. Molly checked her belt. It was laden with grappling hooks made of bent coat hangers and smoke bombs that smelled like cinnamon rolls. Bill was struggling to unzip a duffel bag stuffed with stolen circus clowns' shoes, the intended camouflage for their escape. Sunset cast shadows over the suburban lawn. Laughter bubbled through the air like boiling sugar. In the old days, they would have been casing a diamond exchange, not a yard decorated with streamers that looked suspiciously like tripwires. Bill finally got the bag open, and a squeaky horn rolled out, emitting a sad trombone sound. He kicked the horn under the bumper and looked at Molly with wide, panicked eyes. She patted his shoulder, leaving a smear of grease on his neon green vest. "Bill, focus. The inflatable dragon is the primary obstacle. How do we deflate it without triggering the auditory alarm?" "It's a dragon, Molly. It screams when you pop it. That's the fun part." "Not for us. We need silence. I'm thinking a stealth puncture using the retractable needle hidden in your tie." "My tie is a bowtie. And it's covered in ducks. 
You want me to stab the dragon with a weaponized waterfowl accessory?" "Yes! Think of the loot, Bill. A half-eaten bag of gummy worms and a plastic watch that plays a jingle. We can fence the watch to a blind man." "Wait. Did you bring the gift tags?" "Yes, I have tags that say Property of the Government and ones that say Sorry. More convincing?" "The government ones. Kids are terrified of bureaucracy. Now, are we synchronized? My smoke bomb goes off, the parents sneeze, we grab the goods, and you juggle the children into a corner while I hit the bouncy castle?" "And if the juggling fails? I miss the Louvre. At the Louvre, nobody threw a juice box at my head." "The Louvre didn't have a mom with a spatula guarding the cake table. This is Tier One security. Psychological warfare. We hit the presents, swap them for IOUs, and vanish before Timmy realizes his Lego Death Star is missing." "The swap plan assumes the kids won't notice until we're gone. If Timmy wakes up early, we're trapped in a room full of hysterical minors who think we're magic." "We'll be ghosts. Phantom thieves of the party favor industry. Get your gear. And lose the duck tie. We're going in as the entertainment. I've already booked us as The Amazing Accordion Brothers." "Good enough. Drop the chicken, Bill. This is a heist, not a petting zoo incident. Let's roll. If you eat the cake before we extract the assets, I'm feeding you to the dragon."
created_at: "2026-04-27T08:05:32.538Z" (4/27/2026, 8:05:32 AM)
model: "qwen/qwen3.5-plus-20260420"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 708
moderation_latency: (null)
generation_time: 105846
tokens_prompt: 77
tokens_completion: 7022
native_tokens_prompt: 76
native_tokens_completion: 7033
native_tokens_completion_images: (null)
native_tokens_reasoning: 6270
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
num_fetches: (null)
web_search_engine: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0169096
router: (null)
provider_responses: 0
endpoint_id: "8b4a0fbb-6705-4f8d-a7f1-ad18ecc0edbc"
id: "chatcmpl-bb5e816c-8120-9e54-9484-efaa5cd88c6c"
is_byok: false
latency: 708
model_permaslug: "qwen/qwen3.5-plus-20260420"
provider_name: "Alibaba"
status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer: (null)
request_id: "req-1777277132-1MO7MeJjQAMJxmbuSQHF"
session_id: (null)
api_type: "completions"
id: "gen-1777277132-QHwHM9nBssjYeB4kJSdL"
upstream_id: "chatcmpl-bb5e816c-8120-9e54-9484-efaa5cd88c6c"
total_cost: 0.0169096
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Alibaba"
response_cache_source_id: (null)
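The token fields above suggest that most of the completion budget went to reasoning rather than to the visible scene. The sketch below works through that arithmetic. It assumes that native_tokens_reasoning is counted inside native_tokens_completion, which this page does not confirm, and the variable names are illustrative only.

```python
# Values copied from the metadata listing.
native_tokens_completion = 7033
native_tokens_reasoning = 6270

# Assumption: reasoning tokens are a subset of completion tokens.
visible_tokens = native_tokens_completion - native_tokens_reasoning
reasoning_share = native_tokens_reasoning / native_tokens_completion

print(f"visible completion tokens: {visible_tokens}")
print(f"reasoning share of completion: {reasoning_share:.1%}")
```

Under that assumption, roughly nine in ten completion tokens were reasoning tokens.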
Evaluation details
Evaluator: Matches word count
Result: 0.00%
Details: n/a
Meta Data: neededClean = false, words = 581

Evaluator: Dialogue to Total Word Ratio
Result: 14.96%
Details: Ratio: 56.16%, Deviation: 6.16%
Meta Data: neededClean = false, wordsTotal = 584, wordsDialogue = 328

Overall: 7.4786%
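The reported Ratio and Deviation follow directly from the word counts: 328 dialogue words out of 584 total, measured against the 50% target in the prompt. The sketch below reproduces that arithmetic and shows one way a dialogue word count might be obtained. The quote-matching and whitespace-splitting rules are assumptions, not the benchmark's actual evaluator (the two evaluators report different totals, 581 and 584, so their exact tokenization is unknown), and the function names are hypothetical. The mapping from a 6.16% deviation to the 14.96% score is not shown on this page and is not modeled here.

```python
import re

def ratio_and_deviation(words_dialogue, words_total, target=0.5):
    """Return (dialogue ratio, absolute deviation from the target ratio)."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)

def count_dialogue_words(text):
    """Count whitespace-separated words, total and inside straight double quotes.

    Assumed rule: dialogue is any text between a pair of '"' characters.
    """
    words_total = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    words_dialogue = sum(len(span.split()) for span in quoted_spans)
    return words_total, words_dialogue

# Using the counts reported in the Meta Data column.
ratio, deviation = ratio_and_deviation(328, 584)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# Small illustrative input, not taken from the model response.
sample = 'She said, "Go now." He left.'
total, dialogue = count_dialogue_words(sample)
```

Running the first part on the reported counts gives a ratio of about 56.16% and a deviation of about 6.16%, matching the Details column.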