NC Bench
Run: 2026-04-27-01KQ6ZSAS9FSZ7P96CQKPT1JSV
Run Details
Model: Qwen 3.5 Plus (2026-04-20)
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 2m 52s
Cost: $0.027376
Provider: Alibaba
Evaluation summary
Score | Evaluator | Details Link
0.08% | Matches word count | (details)
0.00% | Dialogue to Total Word Ratio | (details)
0.0424% | Overall |
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Bill adjusted his fake mustache, which was peeling off his lip due to the excessive sweat of anxiety. "Tell me again why the juice boxes are wrapped in plastic that screams 'Do Not Eat'?" He gestured wildly at the map sprawled across the folding table, which was currently holding down a pile of stolen balloons, deflating like their spirits. "Focus, William. The juice boxes are collateral. We're here for the loot." Molly sharpened a neon-green crayon that she intended to use as a lockpick. "The backyard gate locks with a mechanism only a toddler or a mastermind can defeat." "I used to be a mastermind!" Bill slammed his fist on the table. "Now I'm asking if we need a waiver to steal a piñata full of dollar-store candy. This feels unethical, Mol. What if a kid cries? What if I have to look them in the eye?" "Ethics don't pay for the dental plan for our imaginary dental plan. Listen to the plan. You create a distraction by pretending to be a magical unicorn wizard who has lost his wings. I slip in through the bouncy castle ventilation shaft. I've already memorized the bounce patterns." "The ventilation shaft of a bouncy castle?" Bill whispered, checking his watch. "That thing is made of nylon and dreams. One wrong move and I'll bounce directly into the stratosphere. Can't I just knock over the punch bowl?" "Too messy. We need precision. We need agility. We need you to wear the antlers." Molly shoved a pair of glittery headbands toward him. Bill recoiled. "I am wearing the trench coat. The trench coat says 'Professional Thief.' The antlers say 'I enjoy sparkles and regret.'" "Put on the antlers, Bill. Or I tell the guild about the incident with the golden hamster." "Fine. Fine! The antlers go on." Bill slumped into his chair, the glitter headband slipping over his eyes. "What's the exit strategy? Do we run? Do we fly? Do we dissolve?" "We exit during the cake song. Everyone sings. Adults sing off-key. Babies scream. That's our window.
We grab the big box with the bicycle symbol, the small box with the robot symbol, and the medium box that feels suspiciously like a pony." "A real pony?" Bill perked up, adjusting the headband. "Can we sell a pony? I know a guy who buys stolen garden gnomes. He has a trailer and a blindfold." "It's a metaphor for a large gift, you donut. We grab the targets and vanish. We leave an IOU written in finger paint. The parents will be baffled for weeks. They'll blame the squirrel. Or the wind. We're ghosts, Bill. Ghosts with glitter." Bill stood up, practicing his unicorn stance. "What if the dog bites me?" "Then you apologize and offer it a stolen cookie. Now grab the sack. The one that looks suspiciously like Santa's but says 'Party City' on the tag." "Party City..." Bill groaned. "This is our life. We're going to jail for a plastic shovel and a half-used slime kit. My grandma has better loot." "It's the principle, Bill. The principle of Christmas in July." Molly kicked the door open. "Let's ride. Or bounce. Mostly bounce."
created_at
"2026-04-27T08:09:41.424Z"
4/27/2026, 8:09:41 AM
model
"qwen/qwen3.5-plus-20260420"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
717
moderation_latency
(null)
generation_time
171608
tokens_prompt
77
tokens_completion
10748
native_tokens_prompt
76
native_tokens_completion
11394
native_tokens_completion_images
(null)
native_tokens_reasoning
10668
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
num_fetches
(null)
web_search_engine
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.027376
router
(null)
provider_responses
0
endpoint_id
"8b4a0fbb-6705-4f8d-a7f1-ad18ecc0edbc"
id
"chatcmpl-4641dc51-73a5-95ca-b410-219b3dd97efd"
is_byok
false
latency
717
model_permaslug
"qwen/qwen3.5-plus-20260420"
provider_name
"Alibaba"
status
200
user_agent
"langchainjs-openai/1.0.0 ((node/v24.14.1; linux; x64))"
http_referer
(null)
request_id
"req-1777277381-Y88TVSrKl1mb81RcUnYx"
session_id
(null)
api_type
"completions"
id
"gen-1777277381-bzsD8s3yUke0HFZra3vK"
upstream_id
"chatcmpl-4641dc51-73a5-95ca-b410-219b3dd97efd"
total_cost
0.027376
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Alibaba"
response_cache_source_id
(null)
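The token and timing fields above can be cross-checked against the Run Details with a little arithmetic. The following is a minimal sketch, not the benchmark's own code. It assumes that generation_time is in milliseconds and that native_tokens_reasoning is counted inside native_tokens_completion; neither assumption is stated on the page. The helper names are hypothetical.

```python
# Values copied from the metadata above.
GENERATION_TIME_MS = 171608       # assumed to be milliseconds
NATIVE_TOKENS_COMPLETION = 11394
NATIVE_TOKENS_REASONING = 10668   # assumed to be part of the completion count


def format_duration(ms: int) -> str:
    """Render milliseconds as 'Xm Ys', rounded to the nearest second."""
    minutes, seconds = divmod(round(ms / 1000), 60)
    return f"{minutes}m {seconds}s"


def reasoning_share(reasoning: int, completion: int) -> float:
    """Fraction of completion tokens spent on reasoning (0.0 if no tokens)."""
    return reasoning / completion if completion else 0.0


def tokens_per_second(tokens: int, ms: int) -> float:
    """Average generation throughput over the whole request."""
    return tokens / (ms / 1000) if ms else 0.0


# Under the millisecond assumption this matches the reported Duration.
print(format_duration(GENERATION_TIME_MS))  # 2m 52s

# Tokens left for the visible answer once reasoning is subtracted.
print(NATIVE_TOKENS_COMPLETION - NATIVE_TOKENS_REASONING)  # 726

print(f"{reasoning_share(NATIVE_TOKENS_REASONING, NATIVE_TOKENS_COMPLETION):.1%}")
print(f"{tokens_per_second(NATIVE_TOKENS_COMPLETION, GENERATION_TIME_MS):.1f} tokens/s")
```

If those assumptions hold, most of the completion budget went to reasoning rather than to the roughly 500-word scene itself.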
Evaluation details
Result | Evaluator | Details | Meta Data
0.08% | Matches word count | n/a | neededClean: false, words: 529
0.00% | Dialogue to Total Word Ratio | Ratio: 78.99%, Deviation: 28.99% | neededClean: false, wordsTotal: 533, wordsDialogue: 421
0.0424% | Overall | |
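The reported ratio and deviation follow directly from the two word counts: 421 dialogue words out of 533 total, measured against the 50% target in the prompt. The sketch below reproduces that arithmetic and shows one plausible way to count dialogue words, by treating anything between straight double quotes as dialogue. The counting rule, the function names, and the whitespace tokenization are assumptions for illustration. The benchmark's actual evaluator is not shown on this page and may count differently, so the sketch is not expected to reproduce 421 and 533 exactly on the response text.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio).

    Assumption: dialogue is any text between straight double quotes,
    and words are whitespace-separated tokens.
    """
    total = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue / total if total else 0.0
    return dialogue, total, ratio


def deviation_from_target(ratio: float, target: float = 0.5) -> float:
    """Absolute distance between the measured ratio and the target ratio."""
    return abs(ratio - target)


# Arithmetic check using the counts reported in the evaluation details.
reported_ratio = 421 / 533
print(f"Ratio: {reported_ratio:.2%}")                             # Ratio: 78.99%
print(f"Deviation: {deviation_from_target(reported_ratio):.2%}")  # Deviation: 28.99%

# Small usage example on a made-up sentence.
sample = 'Bill sighed. "Hello there friend," he said.'
print(dialogue_ratio(sample))
```

A deviation of nearly 29 percentage points is consistent with the 0.00% score for this evaluator, although the exact scoring curve is not given here.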