NC Bench
Run: 2026-02-16-01KHK2H81RPYDXGV2TWZEW0WHB
Run Details
Model:
Qwen 3.5 Plus (2026-02-15)
Test:
Dialogue tags
Scenario:
Write 500 words with 70% dialogue
Duration:
29.5s
Cost:
$0.001951
Provider:
Alibaba
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
"We can't rob a birthday party, Bill. It's morally bankrupt even for us." "Morally bankrupt? Molly, look at the price of helium these days. We are drowning in debt. This is survival." "It's a six-year-old's unicorn theme, Bill. There will be glitter. Everywhere. Glitter is the asbestos of the craft world. It never leaves your lungs." "Think of the loot. Wrapped boxes, Molly. Dozens of them. Legos, tablets, those expensive squishy toys that cost more than my first car. It's low security. High reward. The parents will be distracted by cheap wine and trying to keep toddlers from eating the centerpieces." "And the security? You think there isn't a bouncer at a rich kid's bash?" "The security is a guy named Gary who moonlights as a clown. I checked. He's currently suing the agency for emotional damages regarding balloon animals. He won't be working the door." "A clown? Perfect. Just what I needed to add to my night terrors." "Stop being dramatic. We go in during the cake cutting. Everyone sings, eyes close, hands clap. That is our window. Forty-five seconds of collective blindness." "Forty-five seconds to grab thirty gifts? My back can't handle that kind of lifting, Bill. I threw out my disc lifting a sack of flour last week." "We don't lift. We slide. I've rigged a chute from the second-story bathroom window down to the getaway van. We dump the presents down the chute, zip line down after them, and we're gone before 'Happy Birthday' finishes the second verse." "You built a zip line? For a heist at a preschooler's party?" "Efficiency, Molly. Also, I found a bulk order of inflatable T-Rex costumes on sale. Perfect disguises. Who suspects a dinosaur?" "Everyone suspects a dinosaur, Bill. That is the most suspicious thing possible. If I see a T-Rex at my kid's party, I'm calling the National Guard." "Fine, no dinosaurs. We go with the magicians' assistant route. Sequins and top hats. Classic, invisible." "I am not wearing sequins. I itch just thinking about it." 
"Then wear black. Ninja style. But remember, no stealth kills. These are parents with strollers, not drug lords." "Obviously. But what if a kid sees us? What do we say? 'Sorry, little Timmy, just borrowing your Nintendo Switch for a few years'?" "We tell them we're part of the entertainment. An interactive treasure hunt. 'Find the nice thieves who take your gifts!' Kids love games. They'll help us load the van." "That is the stupidest plan I have ever heard, and we once tried to steal a statue by dressing up as pigeons." "The pigeon thing worked until you sneezed." "I had an allergy to birdseed! Look, if we do this, I get fifty percent, and I get to pick the getaway vehicle. No vans with 'Clown College' written on the side." "Deal. But if we get caught, I'm telling the police you forced me. I'll say you have a pathological hatred of birthdays." "Fair enough. When do we move?" "Saturday at two. Bring a big sack and maybe some hand sanitizer. That place is going to be a germ factory." "Great. Stealing from children and catching plague. My career has really peaked." "Cheer up, Molly. Think of the Legos. We could build a fortress. A fortress of stolen dreams." "I hate you, Bill." "I know. That's why we make such a good team. Now, let's go buy some black turtlenecks."
created_at
"2026-02-16T11:14:54.521037+00:00"
2/16/2026, 11:14:54 AM
model
"qwen/qwen3.5-plus-20260216"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
820
moderation_latency
(null)
generation_time
29443
tokens_prompt
77
tokens_completion
866
native_tokens_prompt
78
native_tokens_completion
800
native_tokens_completion_images
(null)
native_tokens_reasoning
0
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.0019512
router
(null)
provider_responses
0
id
"chatcmpl-fd642781-c89a-99e2-b166-eaa7432357da"
status
200
is_byok
false
latency
820
endpoint_id
"71e0f6c3-7815-4734-92d3-8a04f3b0ca91"
provider_name
"Alibaba"
model_permaslug
"qwen/qwen3.5-plus-20260216"
api_type
"completions"
id
"gen-1771240464-TvPRB5jPc3eHIlGfxQiH"
upstream_id
"chatcmpl-fd642781-c89a-99e2-b166-eaa7432357da"
total_cost
0.0019512
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Alibaba"
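The timing and token fields above can be combined into a rough throughput figure. The sketch below assumes `generation_time` is reported in milliseconds (29443 is consistent with the 29.5s duration shown in the run details) and uses `native_tokens_completion` as the token count. The function name is illustrative and not part of NC Bench or the provider API.

```python
def tokens_per_second(completion_tokens: int, generation_time_ms: int) -> float:
    """Approximate generation throughput in tokens per second.

    Assumes generation_time_ms is in milliseconds (an assumption based on
    the 29.5s duration reported for this run).
    """
    if generation_time_ms <= 0:
        raise ValueError("generation_time_ms must be positive")
    return completion_tokens / (generation_time_ms / 1000.0)


# Values copied from the metadata of this run.
throughput = tokens_per_second(completion_tokens=800, generation_time_ms=29443)
print(f"{throughput:.2f} tokens/s")
```

This ignores the 820 ms initial latency, so it describes generation speed only, not end-to-end response time.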
Evaluation details
Result | Evaluator | Details | Meta Data
0.0000% | Matches word count | n/a | neededClean: false; words: 573
0.0000% | Dialogue to Total Word Ratio | Ratio: 100.00%, Deviation: 30.00% | neededClean: false; wordsTotal: 573; wordsDialogue: 573
0.0000%
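The two evaluators can be approximated with a short sketch. This is not the NC Bench implementation: it assumes words are split on whitespace, that dialogue is any text between straight double quotes, and that deviation is the absolute difference from the target. The function names, the 0.70 default, and the sample strings are hypothetical, and the mapping from deviation to the reported score is not shown on this page, so it is left out.

```python
import re


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue_words, total_words, ratio).

    Assumption: dialogue is text enclosed in straight double quotes,
    and words are whitespace-separated tokens.
    """
    total_words = len(text.split())
    quoted_spans = re.findall(r'"([^"]*)"', text)
    dialogue_words = sum(len(span.split()) for span in quoted_spans)
    ratio = dialogue_words / total_words if total_words else 0.0
    return dialogue_words, total_words, ratio


def ratio_deviation(ratio: float, target: float = 0.70) -> float:
    """Absolute distance between the observed and target dialogue ratio."""
    return abs(ratio - target)


def word_count_deviation(words: int, target: int = 500) -> float:
    """Relative distance between the observed and target word count."""
    return abs(words - target) / target


# A scene with some narration mixed in.
mixed = '"We go at two," said Bill. "Bring a sack."'
print(dialogue_ratio(mixed))

# A scene made only of quoted speech, like the response in this run.
all_dialogue = '"Hi there." "Hello."'
_, _, ratio = dialogue_ratio(all_dialogue)
print(f"ratio={ratio:.2%}, deviation={ratio_deviation(ratio):.2%}")

# The word count reported for this run against the requested 500 words.
print(f"word count deviation={word_count_deviation(573):.2%}")
```

Under these assumptions, a response consisting only of quoted speech yields a ratio of 100% and a deviation of 30 percentage points from the 70% target, which matches the figures reported for this run.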