NC Bench
Run: 2026-02-16-01KHK16NB5HBTY5X9EGM3VZFMC
Run Details
Model: Qwen 3.5 Plus (2026-02-15)
Test: Dialogue tags
Scenario: Write 500 words with 70% dialogue
Duration: 36.0s
Cost: $0.001942
Provider: Alibaba
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
"Bill, stop chewing on the floor plan. It tastes like crayon and despair." "It's a wax-based architectural schematic, Molly. Very high tech. Besides, we need to know where the snack table is. I can't work on an empty stomach." "The snack table is irrelevant. The target is the gift pile. Specifically, the mountain of wrapped boxes under the 'Happy 7th Birthday' banner. We hit it, we grab the loot, we vanish before the clown finishes his balloon animal routine." "A clown? You didn't mention a clown. Clowns are unpredictable, Molly. They have tiny cars and hidden pockets. What if he sees us?" "He won't see us because we'll be wearing these." Molly tossed a pair of oversized, fuzzy animal ears at Bill. "We blend in as party guests. Parents never suspect other parents. Especially not parents wearing bunny ears." "These are kitten ears. And they itch. Why can't we just wear black like normal criminals?" "Because it's a children's party, Bill! Black stands out. Pastels are the new camouflage. Now, look at the timeline. The cake cutting happens at four. That's our window. While everyone is singing and the parents are fighting over who gets the first slice, we dive behind the sofa, roll the presents into our stroller, and exit stage left." "A stroller? You want me to push a stolen-goods stroller?" "It's a double stroller, Bill. High capacity. We load it up with LEGOs, video games, and whatever cash stuffs inside those greeting cards. Do you have any idea how much money people put in cards these days? It's basically an unregulated bank vault filled with twenty-dollar bills and checks made out to 'Tyler'." "Tyler has a nice ring to it. But what about the security? Surely a seven-year-old's birthday has tight security." "The security is a golden retriever named Barnaby and a teenage babysitter named Chloe who will be glued to her phone. Barnaby is easily distracted by squeaky toys. I've got three in my pocket. Chloe is easily distracted by anything that isn't her screen. 
We are ghosts, Bill. Ghosts in kitten ears." "I still don't like the ears. What if the clown tries to make us a balloon sword? I can't hold a loot bag and a balloon sword." "Then you drop the sword. Priorities, Bill. The loot comes first. Remember the last job? The charity gala? We walked out with three hundred dollars worth of silent auction baskets because you got distracted by the mini-quiches." "That quiche was exceptional, and you know it. But fine. Kitten ears it is. What's the exit strategy if Barnaby starts barking?" "Plan B. We unleash the ultimate weapon." Molly reached into her backpack and pulled out a large, plastic wand topped with a glittery star. "The Bubble Machine. We activate it, the kids go crazy chasing bubbles, chaos ensues, and we slip out the side door while the clown tries to pop them all." "You brought a bubble machine to a heist?" "It's a distraction device, Bill. Don't knock it until you've seen a hundred sugar-rushed kids chase soap spheres into a wall. It's beautiful chaos. Now, put on the ears. We rehearse the waddle. Parents don't walk; they waddle under the weight of societal expectations and oversized gift bags." "This is the dumbest plan we've ever had." "It's brilliant. Now waddle, Bill. Waddle like you haven't slept in three days and you're holding a lukewarm juice box." "I hate this. But I do love LEGOs." "That's the spirit. Let's go steal a childhood."
created_at: "2026-02-16T10:51:45.32132+00:00" (2/16/2026, 10:51:45 AM)
model: "qwen/qwen3.5-plus-20260216"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 869
moderation_latency: (null)
generation_time: 35963
tokens_prompt: 77
tokens_completion: 887
native_tokens_prompt: 78
native_tokens_completion: 796
native_tokens_completion_images: (null)
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0019416
router: (null)
provider_responses:
  0:
    id: "chatcmpl-2c1cb4e9-48c5-9fa5-a304-a30574bd9249"
    status: 200
    is_byok: false
    latency: 869
    endpoint_id: "71e0f6c3-7815-4734-92d3-8a04f3b0ca91"
    provider_name: "Alibaba"
    model_permaslug: "qwen/qwen3.5-plus-20260216"
    api_type: "completions"
id: "gen-1771239069-3iXETvliRQUsZ8eag2ED"
upstream_id: "chatcmpl-2c1cb4e9-48c5-9fa5-a304-a30574bd9249"
total_cost: 0.0019416
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Alibaba"
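As a rough reading aid for the metadata, the sketch below derives a generation throughput from native_tokens_completion and generation_time. It assumes generation_time is in milliseconds, which is consistent with the reported Duration of 36.0s but is not stated in the run record; the function name is hypothetical.

```python
def tokens_per_second(completion_tokens: int, generation_time_ms: float) -> float:
    """Completion tokens generated per second of generation time.

    Assumption: generation_time_ms is in milliseconds (not confirmed by the source).
    """
    if generation_time_ms <= 0:
        raise ValueError("generation_time_ms must be positive")
    return completion_tokens / (generation_time_ms / 1000.0)


# Values copied from the run metadata: native_tokens_completion and generation_time.
throughput = tokens_per_second(796, 35963)
print(f"{throughput:.1f} tokens/s")
```

Under that assumption the run works out to roughly 22 native completion tokens per second.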
Evaluation details
Result | Evaluator | Details | Meta Data
0.0000% | Matches word count | n/a | neededClean: false, words: 593
0.0000% | Dialogue to Total Word Ratio | Ratio: 95.28%, Deviation: 25.28% | neededClean: false, wordsTotal: 593, wordsDialogue: 565
0.0000%
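The reported ratio and deviation follow from the word counts: 565 dialogue words out of 593 total is 95.28%, which is 25.28 percentage points away from the 70% target in the scenario. The sketch below reproduces that arithmetic and shows one plausible way to obtain the counts, assuming dialogue is any text between straight double quotes and words are whitespace-separated tokens. Both function names and the counting rule are assumptions for illustration; the benchmark's actual evaluator may count differently.

```python
import re


def ratio_and_deviation(words_dialogue: int, words_total: int, target: float = 0.70):
    """Return (ratio %, absolute deviation from target in percentage points)."""
    ratio = words_dialogue / words_total
    return round(ratio * 100, 2), round(abs(ratio - target) * 100, 2)


def dialogue_word_counts(text: str):
    """Count dialogue and total words in a scene.

    Assumption: dialogue is the text inside straight double quotes, and a word
    is any whitespace-separated token. This is a guess at the evaluator's rule.
    """
    quoted_spans = re.findall(r'"([^"]*)"', text)
    words_dialogue = sum(len(span.split()) for span in quoted_spans)
    words_total = len(text.split())
    return words_dialogue, words_total


# Counts copied from the evaluation metadata (wordsDialogue, wordsTotal).
print(ratio_and_deviation(565, 593))  # (95.28, 25.28)

# Small usage example on a made-up two-line exchange.
sample = '"Put on the ears," said Molly. "No."'
print(dialogue_word_counts(sample))
```

Measured this way, a scene that is almost entirely quoted speech overshoots a 70% target just as badly as one with too little dialogue, which matches the 0.0000% result for this evaluator.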