NC Bench
Run: 2026-02-25-01KJBGHGMV5DC9B82SHMYTR2YX
Run Details
Model: Qwen 3.5 122B
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 1m 17s
Cost: $0.021275
Provider: Alibaba
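The Duration shown here can be cross-checked against the response metadata further down the page. The following is a minimal sketch, assuming that `generation_time` is reported in milliseconds and that `native_tokens_completion` includes the reasoning tokens; neither assumption is stated on the page, and `format_duration` is a hypothetical helper introduced only for illustration.

```python
def format_duration(ms: int) -> str:
    """Format a millisecond count as 'Xm Ys', rounded to the nearest second."""
    total_seconds = round(ms / 1000)
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes}m {seconds}s"


# Values copied from the response metadata of this run.
generation_time_ms = 76523
native_tokens_completion = 6639
native_tokens_reasoning = 5938

# Assumption: completion tokens include reasoning tokens, so the
# difference approximates the tokens spent on the visible scene.
visible_tokens = native_tokens_completion - native_tokens_reasoning

# Average generation speed over the whole completion.
tokens_per_second = native_tokens_completion / (generation_time_ms / 1000)

print(format_duration(generation_time_ms))
print(visible_tokens)
print(round(tokens_per_second, 1))
```

Under these assumptions, most of the completion budget went to reasoning rather than to the scene itself.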
Evaluation summary
Score     Evaluator
99.98%    Matches word count
0.00%     Dialogue to Total Word Ratio
49.9920%  Overall
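The Overall figure is consistent with an unweighted mean of the two evaluator scores. The sketch below assumes equal weighting and an unrounded word-count score of about 99.984%; both are inferences from the displayed numbers, not documented NC Bench behavior, and `overall_score` is a hypothetical name.

```python
def overall_score(scores: list[float]) -> float:
    """Return the unweighted mean of evaluator scores, in percent."""
    if not scores:
        raise ValueError("at least one evaluator score is required")
    return sum(scores) / len(scores)


# Assumption: 99.984 is the unrounded value behind the displayed 99.98%.
word_count_score = 99.984
dialogue_ratio_score = 0.0

overall = overall_score([word_count_score, dialogue_ratio_score])
print(f"{overall:.4f}%")
```

With only two evaluators, a zero on either one caps the overall score at roughly 50%, which matches what this run shows.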
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The white van smelled faintly of stale coffee and deep regret. Molly adjusted the tactical vest over her floral dress, checking the straps for the third time while staring at the suburban house across the street. Outside, the neighborhood buzzed with the chaotic, high-pitched energy of a six-year-old’s birthday party. Colorful balloons bobbed against the breeze like colorful hostages waiting for rescue. Bill sat in the passenger seat, nervously spinning a bag of stolen party favors between his fingers. He looked like a man walking into a lion’s den wearing a tutu, sweating profusely despite the cool evening air. We go in at three, Molly said, her voice tight with professional focus. When the parents are distracted by the piñata. The piñata is a trap, Bill whispered, eyes wide with genuine fear. It’s a sensory overload device designed to disorient professionals. It’s a cardboard donkey, Bill. Stop overthinking the mechanics. What about the cake? Is the cake a security measure? Molly sighed, rubbing her temples where a headache was blooming. The cake is the distraction. We hit the table. Grab the wrapped boxes. Leave the plastic toys. Focus on the electronics. And if the birthday boy cries? Then you distract him with a magic trick. You’re the face. I can’t do magic. I can barely tie a tie without looking like a noose. You’ll learn. Just smile. Molly grabbed the heavy duffel bag, checking the contents one last time. Remember, no running. We walk. We blend. We are just uncles who arrived late. I look like an uncle who sells insurance, Bill muttered, adjusting his collar. Perfect. Boring is safe. Molly killed the engine. The silence in the van was heavy, broken only by the distant sound of children screaming in joy. She looked at Bill, her expression softening just a fraction. You ready? Do I have to? Yes. Bill took a deep breath, gripping the door handle until his knuckles turned white. Okay. For the loot. For the loot, Molly confirmed. 
They stepped out into the sunlight, adjusting their sunglasses to hide the panic in their eyes. The party was in full swing. Confetti littered the grass like colorful confetti. Molly signaled the move. Bill followed, looking like a deer in headlights. They approached the house, the scent of sugar and sunscreen filling the air. Molly kicked the plan into gear. Remember, she whispered. No witnesses. What if they ask for a high five? Give them one. Then steal the remote. They reached the porch. Molly checked the door. Locked. Bill pulled a lockpick from his pocket. It was a plastic toy key. This won’t work, Molly hissed. It’s a master key, Bill lied. It’s from a toy store. Shh. I’m improvising. He jiggled it. The door clicked open. Molly grinned. Good job. I know. They slipped inside. The living room was a minefield of wrapping paper. Molly pointed to the pile. Bill nodded. They moved in sync. It was working. Until the dog barked. Plan B, Molly said. Run? Run.
created_at
"2026-02-25T23:00:59.681Z"
2/25/2026, 11:00:59 PM
model
"qwen/qwen3.5-122b-a10b-20260224"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
507
moderation_latency
(null)
generation_time
76523
tokens_prompt
77
tokens_completion
6724
native_tokens_prompt
76
native_tokens_completion
6639
native_tokens_completion_images
(null)
native_tokens_reasoning
5938
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"stop"
usage
0.0212752
router
(null)
provider_responses
0
endpoint_id
"33d466d1-d2f7-45ed-8ff3-fda7d2e32351"
id
"chatcmpl-64d5ed9d-007f-95d9-9bae-b2ff2173e101"
is_byok
false
latency
507
model_permaslug
"qwen/qwen3.5-122b-a10b-20260224"
provider_name
"Alibaba"
status
200
api_type
"completions"
id
"gen-1772060459-eaBpAFHD3GHBy33g5hFl"
upstream_id
"chatcmpl-64d5ed9d-007f-95d9-9bae-b2ff2173e101"
total_cost
0.0212752
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Alibaba"
Evaluation details
Result    Evaluator                     Details                          Meta Data
99.98%    Matches word count            n/a                              neededClean: false, words: 502
0.00%     Dialogue to Total Word Ratio  Ratio: 0.00%, Deviation: 30.00%  neededClean: false, wordsTotal: 505, wordsDialogue: 0
49.9920%  Overall
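The zero for wordsDialogue is what a quote-based counter would report for a response that contains no quotation marks, as is the case here. The following is a rough sketch of such a counter, assuming dialogue is identified purely by straight or curly double quotes; the actual evaluator's rules are not shown on this page, and the function names are hypothetical.

```python
import re

# Assumption: dialogue is any text enclosed in straight or curly double quotes.
DIALOGUE_PATTERN = re.compile(r'"([^"]*)"|“([^”]*)”')


def count_words(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())


def dialogue_ratio(text: str) -> float:
    """Return the fraction of words that fall inside quotation marks."""
    total = count_words(text)
    if total == 0:
        return 0.0
    spoken = sum(
        count_words(straight or curly)
        for straight, curly in DIALOGUE_PATTERN.findall(text)
    )
    return spoken / total


def deviation(text: str, target: float = 0.30) -> float:
    """Return the absolute distance between the measured ratio and the target."""
    return abs(dialogue_ratio(text) - target)


unquoted = "We go in at three, Molly said."
quoted = '"We go in at three," Molly said.'

print(dialogue_ratio(unquoted))
print(dialogue_ratio(quoted))
print(deviation(unquoted))
```

A counter of this kind gives the unquoted sentence a ratio of zero and a deviation equal to the full 30% target, which mirrors the reported "Ratio: 0.00%, Deviation: 30.00%".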