Comparing 7 models from Google.
| Model | Total | Released | Context (tokens) | Size (parameters) | Creative writing | Rule following | Utility | Mathematics | Tooling | Language | Logic |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Gemini 3 Pro (Preview) | 85.69% | Nov 18, 25 | 1m | – | 43.43% | 82.40% | 91.33% | 100.00% | 100.00% | 84.25% | 87.50% |
| Gemini 2.5 Pro | 85.11% | Jun 17, 25 | 1m | – | 50.41% | 81.23% | 87.33% | 100.00% | 99.23% | 85.98% | 92.50% |
| Gemini 3 Flash (Preview) | 75.22% | Dec 17, 25 | 1m | – | 48.67% | 83.76% | 74.39% | 100.00% | 56.28% | 90.51% | 87.50% |
| Gemini 2.5 Flash Lite | 69.32% | Jul 22, 25 | 1m | – | 39.17% | 63.24% | 75.00% | 100.00% | 63.08% | 78.79% | 87.50% |
| Gemini 2.5 Flash | 62.34% | Jun 17, 25 | 1m | – | 28.34% | 49.25% | 70.50% | 100.00% | 51.92% | 83.58% | 87.50% |
| Gemma 2 27B | 53.89% | Jun 27, 24 | 4k | 27B | 23.28% | 44.79% | 52.72% | 95.00% | 51.28% | 80.31% | 80.63% |
| Gemma 2 9B | 51.00% | Jun 27, 24 | 8k | 9B | 22.73% | 49.77% | 53.39% | 25.00% | 45.90% | 64.90% | 66.25% |
[Chart: Model Performance]
[Chart: Cost vs Performance] Compares total benchmark cost against overall score for Google models. Quadrant lines are drawn at the median values.
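
As a rough illustration of how median quadrant lines could be derived for the Cost vs Performance chart, the sketch below computes the median of the Total scores listed in the table and labels a point relative to two median lines. The per-model cost figures are not included on this page, so cost is left as a parameter. The `quadrant` helper and its treatment of points lying exactly on a line are assumptions made for illustration, not a description of how the chart is actually rendered.

```python
from statistics import median

# Total scores (%) copied from the table above.
total_scores = {
    "Gemini 3 Pro (Preview)": 85.69,
    "Gemini 2.5 Pro": 85.11,
    "Gemini 3 Flash (Preview)": 75.22,
    "Gemini 2.5 Flash Lite": 69.32,
    "Gemini 2.5 Flash": 62.34,
    "Gemma 2 27B": 53.89,
    "Gemma 2 9B": 51.00,
}

# Horizontal quadrant line: the median of the overall scores.
score_median = median(total_scores.values())


def quadrant(cost, score, cost_median, score_median):
    """Label a point relative to the two median lines.

    Hypothetical helper. Assumption: a point exactly on a line counts
    as "low cost" or "high score" respectively.
    """
    cost_side = "low cost" if cost <= cost_median else "high cost"
    score_side = "high score" if score >= score_median else "low score"
    return f"{cost_side}, {score_side}"


print(f"median total score: {score_median:.2f}%")
```

With seven models the median is simply the fourth-ranked score, so the horizontal line passes through Gemini 2.5 Flash Lite. The vertical line would be computed the same way from the cost values once they are available.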