Qwen
Comparing 15 models from Qwen.
| Model | Total | Released | Context | CoT | Tooling | Creative Writing | Language | Utility | Reasoning | Text Editing | Rule Following | Hallucination |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Qwen3.6 Max Preview | 94.54% | Apr 27, 26 | 262.1k | ✓ | 100.00% | 88.42% | 100.00% | 98.34% | 96.51% | 98.58% | 82.79% | 91.72% |
| Qwen 3.5 397B A17B | 91.73% | Feb 15, 26 | 128k | ✓ | 99.77% | 86.93% | 95.01% | 97.50% | 95.06% | 98.05% | 79.39% | 82.10% |
| Qwen 3.5 122B | 91.53% | Feb 25, 26 | 262k | ✓ | 100.00% | 83.02% | 95.01% | 96.36% | 94.93% | 96.31% | 80.00% | 86.58% |
| Qwen 3.5 Plus (2026-04-20) | 91.51% | Apr 20, 26 | 1m | – | 96.00% | 85.18% | 97.14% | 96.42% | 94.99% | 97.70% | 67.53% | 97.13% |
| Qwen 3.5 27B | 90.85% | Feb 25, 26 | 262k | ✓ | 99.00% | 82.54% | 95.52% | 95.67% | 92.73% | 98.69% | 76.04% | 86.58% |
| Qwen 3.6 Flash | 90.65% | Apr 27, 26 | 1m | ✓ | 99.73% | 86.02% | 89.33% | 96.09% | 94.20% | 96.09% | 71.50% | 92.26% |
| Qwen 3.6 27B | 89.72% | Apr 27, 26 | 262.1k | ✓ | 97.49% | 82.81% | 89.01% | 94.32% | 93.08% | 93.97% | 71.37% | 95.73% |
| Qwen 3.6 35B | 89.05% | Apr 27, 26 | 262.1k | ✓ | 80.00% | 85.97% | 93.56% | 96.20% | 94.08% | 95.10% | 77.34% | 90.19% |
| Qwen 3.5 35B | 88.00% | Feb 25, 26 | 262k | ✓ | 93.98% | 83.51% | 91.95% | 96.42% | 94.88% | 94.95% | 67.42% | 80.87% |
| Qwen 3.5 Flash | 86.38% | Feb 25, 26 | 1m | ✓ | 87.87% | 83.81% | 91.94% | 96.11% | 94.66% | 92.80% | 63.19% | 80.63% |
| Qwen 3.5 9B | 86.05% | Mar 10, 26 | 262k | ✓ | 97.00% | 84.35% | 88.18% | 94.02% | 92.93% | 85.35% | 60.98% | 85.58% |
| Qwen 3.5 Plus (2026-02-15) | 85.96% | Feb 15, 26 | 1m | – | 99.74% | 77.07% | 95.10% | 86.65% | 93.45% | 98.10% | 64.21% | 73.35% |
| Qwen 3 32B | 82.21% | Apr 28, 25 | 41k | – | 97.43% | 81.30% | 84.61% | 81.66% | 86.35% | 89.95% | 46.83% | 89.56% |
| Qwen3 235B A22B Instruct 2507 | 80.10% | Jul 21, 25 | 262.1k | – | 99.23% | 84.81% | 60.83% | 83.15% | 85.82% | 91.75% | 65.42% | 69.82% |
| Qwen 2.5 72B | 75.46% | Sep 19, 24 | 131.1k | – | 99.38% | 75.16% | 68.95% | 76.43% | 83.43% | 89.18% | 31.55% | 79.56% |
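
The page does not say how the Total column is computed. The listed values are, however, consistent with an unweighted mean of the eight category scores. The sketch below checks that inference on three rows copied from the table; the names `ROWS`, `category_mean`, and `max_gap` are illustrative and not part of the leaderboard.

```python
from statistics import mean

# Category scores copied from the table, in column order:
# Tooling, Creative Writing, Language, Utility, Reasoning,
# Text Editing, Rule Following, Hallucination.
# Each entry maps a model name to (listed Total, category scores).
ROWS = {
    "Qwen3.6 Max Preview": (
        94.54,
        [100.00, 88.42, 100.00, 98.34, 96.51, 98.58, 82.79, 91.72],
    ),
    "Qwen 3 32B": (
        82.21,
        [97.43, 81.30, 84.61, 81.66, 86.35, 89.95, 46.83, 89.56],
    ),
    "Qwen 2.5 72B": (
        75.46,
        [99.38, 75.16, 68.95, 76.43, 83.43, 89.18, 31.55, 79.56],
    ),
}


def category_mean(scores):
    """Unweighted mean of the category scores (assumed formula)."""
    return mean(scores)


def max_gap(rows):
    """Largest absolute difference between listed Total and category mean."""
    return max(abs(category_mean(scores) - total) for total, scores in rows.values())


for name, (total, scores) in ROWS.items():
    print(f"{name}: listed {total:.2f}, mean of categories {category_mean(scores):.3f}")
```

Small differences in the last digit are expected, since the table shows category scores rounded to two decimals.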
Model Performance
Cost vs Performance
Plots total benchmark cost against overall score for each Qwen model. Quadrant lines are drawn at the median cost and the median score.
One low-scoring outlier is hidden from the chart: Qwen 2.5 72B (75.5%).
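
As a rough illustration of a median-based quadrant split, the sketch below classifies (cost, score) points relative to the median of each axis. The page does not list per-model costs, so the points use made-up placeholder names and values. The function name `quadrants`, the tie-handling rule, and the choice to compute medians over all supplied points are assumptions rather than the chart's documented behavior.

```python
from statistics import median


def quadrants(points):
    """Label each point by its side of the median cost and median score.

    points: dict mapping a name to a (cost, score) tuple.
    Ties are assigned to the "low cost" / "high score" side (an assumption).
    """
    cost_mid = median(cost for cost, _ in points.values())
    score_mid = median(score for _, score in points.values())
    labels = {}
    for name, (cost, score) in points.items():
        cost_side = "low cost" if cost <= cost_mid else "high cost"
        score_side = "high score" if score >= score_mid else "low score"
        labels[name] = f"{cost_side}, {score_side}"
    return labels


# Placeholder data only: these are NOT real model costs or scores.
example = {
    "model_a": (1.0, 90.0),
    "model_b": (4.0, 95.0),
    "model_c": (2.0, 80.0),
    "model_d": (8.0, 85.0),
}

for name, label in quadrants(example).items():
    print(name, "->", label)
```

Whether the hidden outlier contributes to the medians is not stated on the page; in this sketch, every point passed in contributes.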
Cost Breakdown
Total benchmark cost per model, broken down by input, reasoning, and output tokens.