NC Bench
Run: 2026-02-25-01KJBEVKYS7S81M9DH7MQFW2TF
Run Details
Model: Qwen 3.5 122B
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 2m 20s
Cost: $0.057054
Provider: Alibaba
Evaluation summary
Matches word count: 0.00%
Dialogue to Total Word Ratio: 82.78%
Overall: 41.3875%
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Molly adjusted the black ski mask, which was slipping over her eyes. She kicked the side of the rusted van. The metal groaned in protest. “This is it, Bill. The Johnson residence.” Bill checked his watch, the digital display glowing green in the shadows. “It’s a third birthday party, Mol. The security is basically a plastic gate.” “Don’t underestimate the parents,” Molly said, tapping the crumpled blueprint against her chest. “They have juice boxes. Sticky traps. They will know if a wrapping paper is missing.” Bill sighed, the sound heavy with exhaustion. “We are professionals. We don’t steal from kids.” “We steal from parents,” she corrected, pointing at the pile of wrapped boxes visible through the window. “Those are the real victims. They bought the toys. We are redistributing wealth.” Molly explained that swapping gifts was better than theft. Bill argued that value was in the toys. Molly insisted on sentimental wrapping paper. “It’s sentimental value,” Molly snapped. “Now move. The cake is being cut in ten minutes. That’s when the distraction hits.” Bill looked at the house, the lights casting a warm glow over the lawn. Molly explained the distraction involved a clown. Bill hated the idea. “A fake clown. He’s carrying a fire extinguisher.” “A fire extinguisher?” “It’s for the smoke. We need chaos.” Bill rubbed his temples, feeling a headache forming behind his eyes. “This is the worst heist ever.” “It’s the most profitable,” Molly said, opening the van door. The hinges squealed. “Get in the tree now.” Bill climbed the ladder, his breath hitching in his throat. Molly watched him from the shadows, checking her watch again. He gripped the branch tightly while the fake clown arrived on a unicycle, carrying a fire extinguisher. Bill peeked down, his heart pounding as the clown waved. Molly signaled him to jump into the cake. Bill dropped from the tree, landing softly on the grass. He grabbed the pile of presents and ran quickly toward the van. 
The clown honked his horn, drawing attention from the partygoers. The presents were heavy. They laughed as they sped away. Molly drove the van into the night, leaving the chaos behind them.
created_at: "2026-02-25T22:31:33.598Z" (2/25/2026, 10:31:33 PM)
model: "qwen/qwen3.5-122b-a10b-20260224"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 442
moderation_latency: (null)
generation_time: 139575
tokens_prompt: 77
tokens_completion: 17571
native_tokens_prompt: 76
native_tokens_completion: 17820
native_tokens_completion_images: (null)
native_tokens_reasoning: 17334
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0570544
router: (null)
provider_responses:
0:
endpoint_id: "33d466d1-d2f7-45ed-8ff3-fda7d2e32351"
id: "chatcmpl-6842432e-4a66-9f23-bbab-34787aea9d1c"
is_byok: false
latency: 442
model_permaslug: "qwen/qwen3.5-122b-a10b-20260224"
provider_name: "Alibaba"
status: 200
api_type: "completions"
id: "gen-1772058693-irgcM16a2Lbv01KzXahS"
upstream_id: "chatcmpl-6842432e-4a66-9f23-bbab-34787aea9d1c"
total_cost: 0.0570544
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Alibaba"
Evaluation details
Matches word count
Result: 0.00%
Details: n/a
Meta Data: neededClean: false; words: 360
Dialogue to Total Word Ratio
Result: 82.78%
Details: Ratio: 31.94%, Deviation: 1.94%
Meta Data: neededClean: false; wordsTotal: 360; wordsDialogue: 115
Overall: 41.3875%
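The reported figures can be cross-checked with a little arithmetic. The sketch below is not NC Bench's evaluator. It only recomputes the dialogue ratio and deviation from wordsDialogue and wordsTotal, checks that the overall score is consistent with a plain mean of the two evaluator scores, and derives a few token statistics from the run metadata. Several things here are assumptions: that latency and generation_time are in milliseconds, that native_tokens_reasoning is a subset of native_tokens_completion, that the overall score is an unweighted mean, and that dialogue is whatever sits between curly quotes (the helper count_dialogue_words is hypothetical). How the 82.78% score is derived from the 1.94% deviation is not stated on this page, so it is not reproduced.

```python
import re

# Figures copied from the run metadata and evaluation details above.
NATIVE_TOKENS_COMPLETION = 17820
NATIVE_TOKENS_REASONING = 17334
GENERATION_TIME_MS = 139575  # assumption: unit is milliseconds
LATENCY_MS = 442             # assumption: unit is milliseconds
WORDS_TOTAL = 360
WORDS_DIALOGUE = 115
TARGET_RATIO = 0.30          # from the scenario: "30% dialogue"
DISPLAYED_SCORES = [0.00, 82.78]
REPORTED_OVERALL = 41.3875


def count_words(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())


def count_dialogue_words(text: str) -> int:
    """Hypothetical helper: count words inside curly double quotes."""
    quoted_spans = re.findall(r"“([^”]*)”", text)
    return sum(len(span.split()) for span in quoted_spans)


# Dialogue ratio and its distance from the 30% target.
ratio = WORDS_DIALOGUE / WORDS_TOTAL
deviation = abs(ratio - TARGET_RATIO)

# Assumption: the overall score is an unweighted mean of evaluator scores.
mean_of_displayed = sum(DISPLAYED_SCORES) / len(DISPLAYED_SCORES)

# Token accounting. Assumption: reasoning tokens are counted inside
# native_tokens_completion, so the remainder is the visible response.
visible_tokens = NATIVE_TOKENS_COMPLETION - NATIVE_TOKENS_REASONING
reasoning_share = NATIVE_TOKENS_REASONING / NATIVE_TOKENS_COMPLETION
wall_clock_s = (GENERATION_TIME_MS + LATENCY_MS) / 1000
tokens_per_second = NATIVE_TOKENS_COMPLETION / (GENERATION_TIME_MS / 1000)

print(f"dialogue ratio:    {ratio:.2%}")
print(f"deviation:         {deviation:.2%}")
print(f"mean of scores:    {mean_of_displayed:.2f}% (reported {REPORTED_OVERALL}%)")
print(f"visible tokens:    {visible_tokens}")
print(f"reasoning share:   {reasoning_share:.2%}")
print(f"wall clock:        {wall_clock_s:.3f} s")
print(f"throughput:        {tokens_per_second:.1f} tokens/s")

# Tiny usage example for the word counters on a two-line excerpt.
sample = "“This is it, Bill.” Bill checked his watch. “It’s late.”"
print(count_dialogue_words(sample), "of", count_words(sample), "words are dialogue")
```

Under these assumptions the recomputed ratio and deviation agree with the reported 31.94% and 1.94%, and the wall-clock figure agrees with the reported duration of 2m 20s. The mean of the displayed scores differs from 41.3875% only by the rounding of 82.78%.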