Qwen

Comparing 16 models from Qwen.

| Model | Total | Released | Context | Tooling | Creative Writing | Language | Utility | Reasoning | Text Editing | Rule Following | Hallucination |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Qwen3.7 Max | 94.55% | May 21, 26 | 1m | 100.00% | 85.39% | 97.05% | 99.54% | 83.47% | 98.08% | 95.76% | 97.15% |
| Qwen3.6 Max Preview | 93.72% | Apr 27, 26 | 262.1k | 100.00% | 88.42% | 100.00% | 98.34% | 85.79% | 98.58% | 82.79% | 95.86% |
| Qwen 3.5 397B A17B | 91.09% | Feb 15, 26 | 128k | 99.81% | 86.93% | 95.01% | 97.50% | 81.97% | 98.05% | 79.39% | 90.04% |
| Qwen 3.5 122B | 90.32% | Feb 25, 26 | 262k | 99.33% | 83.02% | 95.01% | 96.36% | 79.24% | 96.31% | 80.00% | 93.29% |
| Qwen 3.5 27B | 90.05% | Feb 25, 26 | 262k | 99.17% | 82.54% | 95.52% | 95.67% | 79.44% | 98.69% | 76.04% | 93.29% |
| Qwen 3.5 Plus (2026-04-20) | 89.79% | Apr 20, 26 | 1m | 95.33% | 85.18% | 97.14% | 96.42% | 80.60% | 97.70% | 67.53% | 98.38% |
| Qwen 3.6 Flash | 89.31% | Apr 27, 26 | 1m | 99.78% | 86.02% | 89.33% | 96.09% | 79.51% | 96.09% | 71.50% | 96.13% |
| Qwen 3.6 27B | 88.33% | Apr 27, 26 | 262.1k | 97.91% | 82.81% | 89.01% | 94.32% | 79.84% | 93.97% | 71.37% | 97.42% |
| Qwen 3.6 35B | 87.66% | Apr 27, 26 | 262.1k | 82.67% | 85.97% | 93.56% | 96.20% | 76.37% | 95.10% | 77.34% | 94.07% |
| Qwen 3.5 35B | 87.01% | Feb 25, 26 | 262k | 94.74% | 83.51% | 91.95% | 96.42% | 77.89% | 94.95% | 67.42% | 89.24% |
| Qwen 3.5 Plus (2026-02-15) | 86.17% | Feb 15, 26 | 1m | 99.78% | 77.07% | 95.10% | 86.65% | 81.81% | 98.10% | 64.21% | 86.62% |
| Qwen 3.5 Flash | 85.66% | Feb 25, 26 | 1m | 89.39% | 83.81% | 91.94% | 96.11% | 79.34% | 92.80% | 63.19% | 88.70% |
| Qwen 3.5 9B | 84.05% | Mar 10, 26 | 262k | 96.84% | 84.35% | 88.18% | 94.02% | 70.90% | 85.35% | 60.98% | 91.81% |
| Qwen 3 32B | 79.37% | Apr 28, 25 | 41k | 95.19% | 81.30% | 84.61% | 81.66% | 66.35% | 89.95% | 46.83% | 89.06% |
| Qwen3 235B A22B Instruct 2507 | 78.07% | Jul 21, 25 | 262.1k | 93.86% | 84.81% | 60.83% | 83.15% | 68.43% | 91.75% | 65.42% | 76.34% |
| Qwen 2.5 72B | 73.17% | Sep 19, 24 | 131.1k | 96.82% | 75.16% | 68.95% | 76.43% | 61.71% | 89.18% | 31.55% | 85.54% |
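
The page does not say how Total is derived, but for the rows checked it appears consistent with an unweighted mean of the eight category scores. The sketch below tests that observation on three rows copied from the table; the names `ROWS`, `mean_score`, and `matches_total` are illustrative, and the 0.01 tolerance is an assumption that allows for display rounding.

```python
# Rows copied from the table: (model, total %, eight category scores %).
ROWS = [
    ("Qwen3.7 Max", 94.55,
     [100.00, 85.39, 97.05, 99.54, 83.47, 98.08, 95.76, 97.15]),
    ("Qwen3.6 Max Preview", 93.72,
     [100.00, 88.42, 100.00, 98.34, 85.79, 98.58, 82.79, 95.86]),
    ("Qwen 2.5 72B", 73.17,
     [96.82, 75.16, 68.95, 76.43, 61.71, 89.18, 31.55, 85.54]),
]


def mean_score(scores):
    """Unweighted mean of the category scores."""
    return sum(scores) / len(scores)


def matches_total(total, scores, tolerance=0.01):
    """True if the listed total is within `tolerance` of the mean.

    The tolerance is an assumption covering two-decimal display rounding.
    """
    return abs(mean_score(scores) - total) <= tolerance


for model, total, scores in ROWS:
    print(f"{model}: mean={mean_score(scores):.4f} "
          f"listed={total:.2f} match={matches_total(total, scores)}")
```

If a row failed this check, that would suggest a weighted total or a different rounding rule rather than a plain average.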

Model Performance

Cost vs Performance

Compares total benchmark cost against overall score for Qwen models. Quadrant lines are drawn at the median values.
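
As a rough illustration of median-based quadrant lines, the sketch below computes the two medians and labels each point by the side it falls on. The page does not list per-model costs, so the points here are made-up placeholders with generic labels; `quadrant_lines`, `quadrant`, and the tie-handling rule (points on a line count as low cost or high score) are assumptions, not the site's implementation.

```python
from statistics import median

# Hypothetical (label, total cost in USD, overall score in %) points.
# These values are placeholders for illustration, not data from the page.
POINTS = [
    ("model-a", 2.0, 94.0),
    ("model-b", 8.0, 91.0),
    ("model-c", 1.0, 86.0),
    ("model-d", 5.0, 88.0),
    ("model-e", 12.0, 90.0),
]


def quadrant_lines(points):
    """Return (median cost, median score), the positions of the two lines."""
    costs = [cost for _, cost, _ in points]
    scores = [score for _, _, score in points]
    return median(costs), median(scores)


def quadrant(cost, score, cost_line, score_line):
    """Label a point by its side of each line (ties: low cost, high score)."""
    cost_side = "low cost" if cost <= cost_line else "high cost"
    score_side = "high score" if score >= score_line else "low score"
    return f"{cost_side}, {score_side}"


cost_line, score_line = quadrant_lines(POINTS)
for label, cost, score in POINTS:
    print(label, "->", quadrant(cost, score, cost_line, score_line))
```

Using medians rather than means keeps the lines from shifting much when a single very expensive or very low-scoring model is added.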

Two low-scoring outliers are hidden from the chart: Qwen 2.5 72B (73.2%) and Qwen3 235B A22B Instruct 2507 (78.1%).

Cost Breakdown

Total benchmark cost per model, broken down by input, reasoning, and output tokens. Toggle between USD and token views.
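
A minimal sketch of how such a breakdown could be computed from token counts follows. The page gives no prices or token counts, so `Usage`, `Pricing`, and all numbers are hypothetical, and billing reasoning tokens at the output rate is an assumption that may not hold for every provider.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    """Token counts for one benchmark run (hypothetical structure)."""
    input_tokens: int
    reasoning_tokens: int
    output_tokens: int


@dataclass
class Pricing:
    """Prices in USD per million tokens (hypothetical structure)."""
    input_per_million: float
    output_per_million: float


def cost_breakdown(usage, pricing):
    """Return USD cost per token category plus the total.

    Assumption: reasoning tokens are billed at the output rate.
    """
    parts = {
        "input": usage.input_tokens * pricing.input_per_million / 1_000_000,
        "reasoning": usage.reasoning_tokens * pricing.output_per_million / 1_000_000,
        "output": usage.output_tokens * pricing.output_per_million / 1_000_000,
    }
    parts["total"] = parts["input"] + parts["reasoning"] + parts["output"]
    return parts


# Placeholder numbers, not taken from the page.
example = cost_breakdown(
    Usage(input_tokens=2_000_000, reasoning_tokens=500_000, output_tokens=1_000_000),
    Pricing(input_per_million=1.0, output_per_million=4.0),
)
print(example)
```

The token view corresponds to the raw `Usage` counts, and the USD view to the dictionary returned by `cost_breakdown`.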