NC Bench
Run: 2026-02-25-01KJBKJ61RDDRWYP8P47NEA7QG
Run Details
Model: Qwen 3.5 122B
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 49.3s
Cost: $0.022213
Provider: Alibaba
Evaluation summary
Score | Evaluator
86.38% | Matches word count
0.00% | Dialogue to Total Word Ratio
43.1902% | Overall
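The overall figure is consistent with a plain average of the two evaluator scores. The sketch below checks that arithmetic. Treating the overall score as an unweighted mean is an assumption inferred from the numbers shown here, not a documented formula.

```python
# Evaluator scores from the summary, in percent.
scores = {
    "Matches word count": 86.38,
    "Dialogue to Total Word Ratio": 0.00,
}

# Assumption: the overall score is the unweighted mean of the evaluator scores.
overall = sum(scores.values()) / len(scores)

print(f"Overall: {overall:.2f}%")
```

The page reports 43.1902%, which suggests the word count score carries more decimal places than the 86.38% displayed.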
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly tapped the laminated blueprints spread across the dashboard of the rusted van. The glow of the flickering streetlamp illuminated the crayon drawings of the backyard, marked with sticky notes. Dust motes danced in the light. She traced a line with her finger. The air smelled of stale coffee and old leather. You see this cluster? she asked, pointing at a red circle near the trampoline. Bill chewed a granola bar loudly, crumbs scattering on the upholstery. Those are the hazards? Those are the kids. They're unpredictable variables. I can handle unpredictable. I once stole a diamond from a poodle. This is a six-year-old's birthday, Bill. No poodles involved. Molly folded the paper sharply, creasing the corner. She looked at him intensely. We go in through the patio. The sliding door is unlocked. I checked the neighborhood watch forum. You stalked the party? I researched the target. The parents are drinking wine. They're completely distracted. Bill nodded slowly, wiping his mouth with a sleeve. And the loot? Everything. Legos, dolls, the big boxed bike. The bike is heavy. The bike is awkward. We have the dolly. The dolly is strong. Bill sighed, rubbing his temples. I hate the dolly. It squeaks. It's a tactical advantage. Noise discipline is key. Molly opened her glove box with a loud creak. She pulled out two fabric masks. Here are the masks. Bill took one, examining the peeling paint. Why are they clowns? It's a theme. Blends in. Also, it's terrifying. Terrifying to kids? To everyone. Especially me. What about the dogs? There's a Golden Retriever. Name is Buster. Buster is a good boy. He won't bark. He's a guard dog, Bill. He barks at mailmen. I'll bring a treat. A big one. You're bringing contraband into a heist? It's bribery. Standard procedure. Molly shook her head, her ponytail swinging. She gripped the steering wheel. Focus. We enter at 4:15 PM. Cake time. Cake time is the best time. Sugar coma. Exactly. We grab the presents, we leave the cake. 
What if they want cake? Let them eat cake. We want the Nintendo Switch. Bill looked at the Switch box on the passenger seat. That's the main objective? Obviously. The rest is filler. Filler is good. Filler pays the rent. Not if we get caught by a mother with a purse. I've seen you dodge a purse. You're a ninja. I'm a professional. There's a difference. Molly started the engine. The van rumbled to life, shaking the floorboards. The headlights cut through the darkness. The vibration traveled up her spine. Ready? As I'll ever be. Do we have the zip ties? We don't need zip ties. Then what are we? Thieves with a conscience. We leave the wrapping paper. That's generous. It's eco-friendly. Bill smiled, adjusting his clown nose. Let's go steal some childhoods. Let's go steal some presents. Molly put the car in gear. The tires crunched on gravel. She paused, hand hovering over the shifter. Wait. Did you check the alarm? There is no alarm. Are you sure? I'm sure. Okay. Let's roll.
created_at: "2026-02-25T23:53:47.325Z" (2/25/2026, 11:53:47 PM)
model: "qwen/qwen3.5-122b-a10b-20260224"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 451
moderation_latency: (null)
generation_time: 49183
tokens_prompt: 77
tokens_completion: 7709
native_tokens_prompt: 76
native_tokens_completion: 6932
native_tokens_completion_images: (null)
native_tokens_reasoning: 6157
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0222128
router: (null)
provider_responses
0
endpoint_id: "33d466d1-d2f7-45ed-8ff3-fda7d2e32351"
id: "chatcmpl-698c24c4-18d9-995e-8cd8-f4f52c7a24ce"
is_byok: false
latency: 451
model_permaslug: "qwen/qwen3.5-122b-a10b-20260224"
provider_name: "Alibaba"
status: 200
api_type: "completions"
id: "gen-1772063627-gqzpciuXFojXuuzTa5i1"
upstream_id: "chatcmpl-698c24c4-18d9-995e-8cd8-f4f52c7a24ce"
total_cost: 0.0222128
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Alibaba"
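A few derived figures help when reading the metadata: how much of the completion was reasoning, and the approximate generation throughput. This is a minimal sketch under two assumptions that the page does not state: that generation_time is in milliseconds (it roughly matches the 49.3s duration), and that native_tokens_reasoning is included in native_tokens_completion.

```python
# Values copied from the metadata fields.
native_tokens_completion = 6932
native_tokens_reasoning = 6157
generation_time_ms = 49183  # assumption: milliseconds

# Assumption: reasoning tokens are a subset of the completion tokens.
visible_tokens = native_tokens_completion - native_tokens_reasoning
reasoning_share = native_tokens_reasoning / native_tokens_completion

# Approximate throughput over the whole generation, reasoning included.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(f"Visible output tokens: {visible_tokens}")
print(f"Reasoning share: {reasoning_share:.1%}")
print(f"Throughput: {tokens_per_second:.1f} tokens/s")
```

Under these assumptions, most of the completion budget went to reasoning rather than to the roughly 500-word scene itself.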
Evaluation details
Result | Evaluator | Details | Meta Data
86.38% | Matches word count | n/a | neededClean: false, words: 511
0.00% | Dialogue to Total Word Ratio | Ratio: 0.00%, Deviation: 50.00% | neededClean: false, wordsTotal: 514, wordsDialogue: 0
43.1902% | Overall
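The 0.00% dialogue ratio is what a quote-based counter would report for this response, because the model wrote the dialogue without quotation marks. The sketch below illustrates one such counter. The `dialogue_ratio` helper and its quote-matching rule are hypothetical and are not the benchmark's actual evaluator. The 50% target comes from the scenario.

```python
import re

# Hypothetical rule: dialogue is any text inside straight or curly double quotes.
QUOTED = re.compile(r'"([^"]*)"|“([^”]*)”')

def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue words, total words, dialogue / total)."""
    total = len(text.split())
    dialogue = sum(
        len((m.group(1) or m.group(2) or "").split())
        for m in QUOTED.finditer(text)
    )
    ratio = dialogue / total if total else 0.0
    return dialogue, total, ratio

TARGET = 0.50  # from the scenario: 50% dialogue

# First sentence of dialogue from the response, as written and with quotes added.
unquoted = "You see this cluster? she asked, pointing at a red circle."
quoted = '"You see this cluster?" she asked, pointing at a red circle.'

for sample in (unquoted, quoted):
    words_dialogue, words_total, ratio = dialogue_ratio(sample)
    deviation = abs(ratio - TARGET)
    print(words_dialogue, words_total, f"{ratio:.2%}", f"{deviation:.2%}")
```

With no quotation marks the counter finds zero dialogue words, so the deviation from the target is the full 50.00%, which matches the figures reported above.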