Qwen
Comparing 10 models from Qwen.
| Model | Total | Released | Context | CoT | Tooling | Creative Writing | Language | Utility | Reasoning | Text Editing | Rule Following | Hallucination |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Qwen 3.5 397B A17B | 91.73% | Feb 15, 26 | 128k | ✓ | 99.77% | 86.93% | 95.01% | 97.50% | 95.06% | 98.05% | 79.39% | 82.10% |
| Qwen 3.5 122B | 91.53% | Feb 25, 26 | 262k | ✓ | 100.00% | 83.02% | 95.01% | 96.36% | 94.93% | 96.31% | 80.00% | 86.58% |
| Qwen 3.5 27B | 90.85% | Feb 25, 26 | 262k | ✓ | 99.00% | 82.54% | 95.52% | 95.67% | 92.73% | 98.69% | 76.04% | 86.58% |
| Qwen 3.5 35B | 88.00% | Feb 25, 26 | 262k | ✓ | 93.98% | 83.51% | 91.95% | 96.42% | 94.88% | 94.95% | 67.42% | 80.87% |
| Qwen 3.5 Flash | 86.38% | Feb 25, 26 | 1m | ✓ | 87.87% | 83.81% | 91.94% | 96.11% | 94.66% | 92.80% | 63.19% | 80.63% |
| Qwen 3.5 9B | 86.05% | Mar 10, 26 | 262k | ✓ | 97.00% | 84.35% | 88.18% | 94.02% | 92.93% | 85.35% | 60.98% | 85.58% |
| Qwen 3.5 Plus (2026-02-15) | 85.96% | Feb 15, 26 | 1m | – | 99.74% | 77.07% | 95.10% | 86.65% | 93.45% | 98.10% | 64.21% | 73.35% |
| Qwen 3 32B | 82.21% | Apr 28, 25 | 41k | – | 97.43% | 81.30% | 84.61% | 81.66% | 86.35% | 89.95% | 46.83% | 89.56% |
| Qwen3 235B A22B Instruct 2507 | 80.10% | Jul 21, 25 | 262.1k | – | 99.23% | 84.81% | 60.83% | 83.15% | 85.82% | 91.75% | 65.42% | 69.82% |
| Qwen 2.5 72B | 75.46% | Sep 19, 24 | 131.1k | – | 99.38% | 75.16% | 68.95% | 76.43% | 83.43% | 89.18% | 31.55% | 79.56% |
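
The page does not say how Total is derived. For every row in the table, though, it matches the unweighted mean of the eight category scores (Tooling through Hallucination) to within rounding. The sketch below checks that observation against values copied from the table. Treat the averaging rule as an inference from the numbers, not a documented formula; the names `ROWS` and `mean_matches_total` are illustrative only.

```python
from statistics import mean

# (Total, [Tooling, Creative Writing, Language, Utility, Reasoning,
#          Text Editing, Rule Following, Hallucination]), all in percent,
# copied from the table above.
ROWS = {
    "Qwen 3.5 397B A17B": (91.73, [99.77, 86.93, 95.01, 97.50, 95.06, 98.05, 79.39, 82.10]),
    "Qwen 3.5 122B": (91.53, [100.00, 83.02, 95.01, 96.36, 94.93, 96.31, 80.00, 86.58]),
    "Qwen 3.5 27B": (90.85, [99.00, 82.54, 95.52, 95.67, 92.73, 98.69, 76.04, 86.58]),
    "Qwen 3.5 35B": (88.00, [93.98, 83.51, 91.95, 96.42, 94.88, 94.95, 67.42, 80.87]),
    "Qwen 3.5 Flash": (86.38, [87.87, 83.81, 91.94, 96.11, 94.66, 92.80, 63.19, 80.63]),
    "Qwen 3.5 9B": (86.05, [97.00, 84.35, 88.18, 94.02, 92.93, 85.35, 60.98, 85.58]),
    "Qwen 3.5 Plus (2026-02-15)": (85.96, [99.74, 77.07, 95.10, 86.65, 93.45, 98.10, 64.21, 73.35]),
    "Qwen 3 32B": (82.21, [97.43, 81.30, 84.61, 81.66, 86.35, 89.95, 46.83, 89.56]),
    "Qwen3 235B A22B Instruct 2507": (80.10, [99.23, 84.81, 60.83, 83.15, 85.82, 91.75, 65.42, 69.82]),
    "Qwen 2.5 72B": (75.46, [99.38, 75.16, 68.95, 76.43, 83.43, 89.18, 31.55, 79.56]),
}


def mean_matches_total(total, scores, tolerance=0.01):
    """Return True if the unweighted mean of scores is within tolerance of total."""
    return abs(mean(scores) - total) <= tolerance


if __name__ == "__main__":
    for model, (total, scores) in ROWS.items():
        status = "ok" if mean_matches_total(total, scores) else "MISMATCH"
        print(f"{model}: listed {total:.2f}, mean {mean(scores):.3f} ({status})")
```

The tolerance of 0.01 percentage points allows for the two-decimal rounding used on the page.
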
Model Performance
Cost vs Performance
Compares total benchmark cost against overall score for Qwen models. Quadrant lines are drawn at the median values.
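As a rough illustration of how median quadrant lines split such a chart, the sketch below takes the median of the Total scores from the table and labels a point by which side of each median it falls on. The per-model costs are not included in this text, so the caller has to supply them. The function names and the tie-handling rule (a point on a line counts as low cost or high score) are assumptions, not taken from the page.

```python
from statistics import median

# Total scores (percent) copied from the table, in table order.
TOTAL_SCORES = [91.73, 91.53, 90.85, 88.00, 86.38, 86.05, 85.96, 82.21, 80.10, 75.46]

# Horizontal quadrant line: median of the overall scores.
SCORE_MEDIAN = median(TOTAL_SCORES)


def quadrant(cost, score, cost_median, score_median=SCORE_MEDIAN):
    """Label a (cost, score) point relative to the two median lines.

    Assumption: points lying exactly on a line count as low cost / high score.
    """
    cost_side = "low cost" if cost <= cost_median else "high cost"
    score_side = "high score" if score >= score_median else "low score"
    return f"{cost_side}, {score_side}"


def label_models(points):
    """points maps model name -> (cost, score); cost values are caller-supplied."""
    cost_median = median(cost for cost, _ in points.values())
    return {
        name: quadrant(cost, score, cost_median)
        for name, (cost, score) in points.items()
    }
```

With ten models, the median falls midway between the fifth and sixth scores (86.38% and 86.05%), so five models sit above the score line and five below it.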
Cost Breakdown
Total benchmark cost per model, broken down by input, reasoning, and output tokens.