NC Bench
Run: 2026-02-25-01KJBHED7SBZ3QNNMD37TRTR5D
Run Details
Model: Qwen 3.5 122B
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 1m 26s
Cost: $0.023061
Provider: Alibaba
Evaluation summary
Score | Evaluator | Details Link
0.00% | Matches word count | (details)
0.00% | Dialogue to Total Word Ratio | (details)
0.0000% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly stared at the blueprints spread across the rusted hood of the old van. It wasn't a bank vault or a diamond exchange. It was a suburban bouncy castle. Bill leaned over, crumbs falling from his sandwich onto the schematic. He chewed thoughtfully, eyes scanning the perimeter. The engine ticked as it cooled, a rhythmic sound against the quiet street. The van smelled of stale coffee and old grease. Molly sighed, rubbing her temples. She looked at the watch on her wrist. The second hand ticked loudly in the silence. Bill frowned, looking at the photo of a smiling toddler. Stealing from children felt wrong, even for them. But rent was due. They loaded the gear. A net gun, a grappling hook, and a bag full of glitter bombs. Molly checked her watch again. They climbed out of the van. The house was decorated with streamers and happy faces. Molly pulled a mask over her face. It was a clown mask. They crept toward the front door. Molly picked the lock with a hairpin. It clicked open. Inside, the smell of frosting hung heavy in the air. A group of toddlers sat on the floor, staring at wrapped boxes. The room was lit by dim, colorful lights. Molly grabbed the first box. It was light. She tossed it to Bill. He caught it and stuffed it into the bag. A parent walked into the room. They froze. The parent smiled. She shoved a bag of stolen candy into the parent’s hands. The parent beamed. They slipped out the back door. They drove away into the night, the bag of presents heavy in the back. The security is tight, Bill said, pointing at a cluster of balloons. Those latex spheres are rigged with pressure sensors. They’re helium, Bill. They float. Exactly. If one pops, the whole alarm system goes off. We need to move fast. The piñata drop is at three. That’s when the distraction happens. What about the parents? They’re the real guards. I’ve disabled their phones. They’ll be too busy posting selfies to notice the loot vanishing. And the kids? They’re the payload. 
We have ten minutes before the cake arrives. If the cake arrives, the game is over. Because once the sugar hits, nobody leaves. Look at that, Bill whispered. A PS5 and a bike. Don’t get attached. Oh, you’re here early for the entertainment? We brought... presents. That was close, Bill said. Too close. Next time, we hit a wedding. Weddings have better loot.
created_at: "2026-02-25T23:16:46.462Z" (2/25/2026, 11:16:46 PM)
model: "qwen/qwen3.5-122b-a10b-20260224"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 438
moderation_latency: (null)
generation_time: 85940
tokens_prompt: 77
tokens_completion: 7540
native_tokens_prompt: 76
native_tokens_completion: 7197
native_tokens_completion_images: (null)
native_tokens_reasoning: 6630
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0230608
router: (null)
provider_responses:
  0:
    endpoint_id: "33d466d1-d2f7-45ed-8ff3-fda7d2e32351"
    id: "chatcmpl-ed36a740-34ca-9579-b964-068fa896c001"
    is_byok: false
    latency: 438
    model_permaslug: "qwen/qwen3.5-122b-a10b-20260224"
    provider_name: "Alibaba"
    status: 200
api_type: "completions"
id: "gen-1772061406-pkofDpFGsdvVVsst5Jw3"
upstream_id: "chatcmpl-ed36a740-34ca-9579-b964-068fa896c001"
total_cost: 0.0230608
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Alibaba"
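The token and timing fields above can be cross-checked with a little arithmetic. The sketch below is illustrative only: it assumes that native_tokens_reasoning is counted inside native_tokens_completion and that generation_time is in milliseconds, neither of which this page states. The variable names simply mirror the metadata keys.

```python
# Values copied from the run metadata above.
native_tokens_completion = 7197
native_tokens_reasoning = 6630
generation_time_ms = 85940  # assumed to be milliseconds

# Assumption: reasoning tokens are a subset of completion tokens,
# so the remainder approximates the visible (non-reasoning) output.
visible_tokens = native_tokens_completion - native_tokens_reasoning
reasoning_share = native_tokens_reasoning / native_tokens_completion

# Convert the generation time into minutes and seconds.
total_seconds = generation_time_ms / 1000
minutes, seconds = divmod(round(total_seconds), 60)

print(f"visible tokens: {visible_tokens}")
print(f"reasoning share: {reasoning_share:.1%}")
print(f"generation time: {minutes}m {seconds}s")
```

Under those assumptions, most of the completion budget went to reasoning rather than to the scene itself, and the generation time agrees with the 1m 26s duration reported in the run details.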
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 417
0.00% | Dialogue to Total Word Ratio | Ratio: 0.00%, Deviation: 30.00% | neededClean: false, wordsTotal: 417, wordsDialogue: 0
0.0000%
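The model response contains spoken lines but no quotation marks, which may be why wordsDialogue is 0. The sketch below shows one way a dialogue-ratio check could work; it is not the NC Bench evaluator. It assumes dialogue is the text between paired double quotes (straight or curly) and that the deviation is the absolute gap from the 30% target in percentage points, which is consistent with the reported "Ratio: 0.00%, Deviation: 30.00%". The function names are hypothetical.

```python
import re

# Assumption: dialogue is text between paired double quotes,
# either straight ("...") or curly (“...”). A real evaluator may differ.
QUOTED = re.compile(r'"([^"]*)"|“([^”]*)”')


def count_words(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())


def dialogue_ratio(text: str) -> tuple[int, int, float]:
    """Return (dialogue words, total words, dialogue share of total)."""
    total = count_words(text)
    spoken = sum(count_words(a or b) for a, b in QUOTED.findall(text))
    ratio = spoken / total if total else 0.0
    return spoken, total, ratio


def deviation_points(ratio: float, target: float = 0.30) -> float:
    """Absolute distance from the target share, in percentage points."""
    return abs(ratio - target) * 100


# The same two spoken lines, without and with quotation marks.
unquoted = "The security is tight, Bill said. They float."
quoted = '"The security is tight," Bill said. "They float."'

for sample in (unquoted, quoted):
    spoken, total, ratio = dialogue_ratio(sample)
    print(spoken, total, f"{ratio:.2%}", f"{deviation_points(ratio):.2f}")
```

With this kind of quote-based counting, unquoted speech contributes nothing to the dialogue total, so a response that omits quotation marks scores a 0% ratio no matter how much of it is spoken.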