Mistral AI
Comparing 14 models from Mistral AI.
| Model | Total | Released | Context | CoT | Tooling | Creative Writing | Language | Utility | Reasoning | Text Editing | Rule Following | Hallucination |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Mistral Large 3 | 85.43% | Dec 1, 25 | 262k | – | 99.66% | 81.21% | 92.02% | 84.91% | 88.95% | 94.09% | 64.41% | 78.17% |
| Mistral Large 2 | 82.41% | Jul 24, 24 | 128k | – | 99.78% | 81.86% | 85.22% | 69.19% | 88.20% | 94.16% | 63.05% | 77.87% |
| Mistral Small 4 (Reasoning) | 82.39% | Mar 16, 26 | 265k | ✓ | 99.73% | 81.67% | 60.53% | 85.61% | 87.78% | 90.58% | 60.28% | 92.98% |
| Mistral Large | 80.15% | Feb 26, 24 | 32k | – | 98.67% | 82.02% | 88.64% | 73.04% | 76.31% | 95.14% | 49.87% | 77.50% |
| Mistral Small 3.2 24B | 78.60% | Jun 20, 25 | 131k | – | 99.89% | 71.87% | 72.77% | 73.17% | 81.71% | 89.48% | 64.08% | 75.83% |
| Mistral Medium 3.1 | 77.83% | Aug 13, 25 | 131k | – | 97.50% | 81.70% | 49.50% | 80.13% | 89.32% | 93.77% | 48.60% | 82.09% |
| Mistral Small 4 | 76.46% | Mar 16, 26 | 265k | – | 95.02% | 81.12% | 51.96% | 78.28% | 78.72% | 91.00% | 62.17% | 73.41% |
| Mistral Small Creative | 73.27% | Dec 16, 25 | 32k | – | 86.85% | 80.29% | 41.85% | 76.28% | 87.99% | 90.31% | 48.15% | 74.46% |
| Ministral 3 14B | 72.54% | Dec 2, 25 | 262k | – | 91.91% | 79.11% | 30.00% | 79.03% | 83.24% | 86.20% | 50.83% | 79.99% |
| Ministral 3 8B | 71.76% | Dec 2, 25 | 262k | – | 99.42% | 77.26% | 48.96% | 74.43% | 71.64% | 78.52% | 31.34% | 92.52% |
| Ministral 3 3B | 67.22% | Dec 2, 25 | 131k | – | 93.79% | 75.45% | 68.10% | 72.38% | 71.88% | 69.80% | 15.87% | 70.45% |
| Mistral NeMo | 65.04% | Jul 18, 24 | 128k | – | 83.21% | 76.72% | 80.80% | 51.55% | 57.59% | 73.69% | 34.11% | 62.63% |
| Ministral 8B | 64.87% | Oct 16, 24 | 128k | – | 85.58% | 76.87% | 53.91% | 46.82% | 73.78% | 77.52% | 15.27% | 89.19% |
| Ministral 3B | 61.29% | Oct 16, 24 | 128k | – | 87.64% | 75.49% | 42.25% | 49.17% | 69.70% | 70.91% | 24.45% | 70.75% |
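
The Total column appears to be the unweighted mean of the eight category scores (Tooling through Hallucination). The page does not state this; it is an inference from the published figures. A minimal Python sketch that checks the assumption against two rows copied from the table, allowing a small tolerance because the displayed values are rounded:

```python
from statistics import mean

# Category scores copied from the table, in column order:
# Tooling, Creative Writing, Language, Utility, Reasoning,
# Text Editing, Rule Following, Hallucination.
# Each entry is (listed Total, [category scores]).
ROWS = {
    "Mistral Large 3": (85.43, [99.66, 81.21, 92.02, 84.91, 88.95, 94.09, 64.41, 78.17]),
    "Ministral 3B": (61.29, [87.64, 75.49, 42.25, 49.17, 69.70, 70.91, 24.45, 70.75]),
}

def category_mean(scores):
    """Unweighted mean of the category scores (assumed aggregation)."""
    return mean(scores)

for name, (total, scores) in ROWS.items():
    computed = category_mean(scores)
    diff = abs(computed - total)
    print(f"{name}: mean={computed:.3f} listed={total:.2f} diff={diff:.3f}")
```

If the assumption holds, a single weak category (for example Rule Following) pulls the Total down by one eighth of its shortfall, which is worth keeping in mind when comparing models with similar totals.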
Model Performance
Cost vs Performance
Compares total benchmark cost against overall score for Mistral AI models. Quadrant lines are drawn at the median values.
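
As a rough illustration of how such quadrant lines can be placed, the sketch below computes the median of the 14 Total scores from the table and classifies a point against a pair of medians. The cost values are not given in this text, so `median_cost` and the sample costs are hypothetical placeholders, and the `quadrant` helper and its labels are illustrative names rather than anything taken from the chart itself.

```python
from statistics import median

# Total scores (%) for the 14 models, copied from the table.
TOTALS = [85.43, 82.41, 82.39, 80.15, 78.60, 77.83, 76.46,
          73.27, 72.54, 71.76, 67.22, 65.04, 64.87, 61.29]

def quadrant(cost, score, median_cost, median_score):
    """Label a point relative to the two median lines (hypothetical helper)."""
    cost_side = "low cost" if cost <= median_cost else "high cost"
    score_side = "high score" if score >= median_score else "low score"
    return f"{cost_side}, {score_side}"

# With 14 values, the median is the mean of the 7th and 8th sorted scores.
median_score = median(TOTALS)
print(f"median score: {median_score:.3f}%")

# Hypothetical costs in USD, used only to exercise the helper.
print(quadrant(cost=1.0, score=85.43, median_cost=2.0, median_score=median_score))
```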
Cost Breakdown
Total benchmark cost per model, broken down by input, reasoning, and output tokens.