Anthropic

Comparing 10 models from Anthropic.

| Model | Total | Released | Context Size | Creative writing | Rule following | Utility | Mathematics | Tooling | Language | Logic |
|---|---|---|---|---|---|---|---|---|---|---|
| Claude Opus 4.6 | 88.14% | Feb 4, 26 | 1m | 71.27% | 84.05% | 92.83% | 100.00% | 95.38% | 91.00% | 80.63% |
| Claude Opus 4.5 | 84.41% | Nov 24, 25 | 200k | 68.57% | 84.22% | 84.50% | 100.00% | 87.31% | 93.75% | 81.25% |
| Claude Sonnet 4 | 83.20% | May 22, 25 | 200k | 43.05% | 68.89% | 95.00% | 100.00% | 92.31% | 87.30% | 93.75% |
| Claude 3.5 Sonnet (new) | 80.29% | Oct 22, 24 | 200k | 38.88% | 66.81% | 91.83% | 100.00% | 92.69% | 84.14% | 85.00% |
| Claude Opus 4 | 80.23% | May 22, 25 | 200k | 54.27% | 74.82% | 85.67% | 100.00% | 80.00% | 87.81% | 87.50% |
| Claude Sonnet 4.5 | 78.78% | Sep 29, 25 | 1m | 45.47% | 73.70% | 85.00% | 100.00% | 83.08% | 86.13% | 81.25% |
| Claude 3.5 Haiku | 73.73% | Oct 22, 24 | 200k | 38.63% | 60.94% | 82.33% | 100.00% | 82.31% | 77.49% | 87.50% |
| Claude Haiku 4.5 | 72.19% | Oct 15, 25 | 200k | 49.44% | 59.16% | 80.00% | 100.00% | 72.31% | 85.84% | 81.25% |
| Claude 3.7 Sonnet | 71.47% | Feb 19, 25 | 200k | 45.49% | 58.49% | 80.67% | 100.00% | 66.92% | 88.92% | 81.25% |
| Claude 3 Haiku | 59.75% | Mar 13, 24 | 200k | 36.00% | 50.47% | 64.67% | 95.00% | 53.46% | 72.18% | 80.63% |

Model Performance

Cost vs Performance

Compares total benchmark cost against overall score for Anthropic models. Quadrant lines are drawn at the median values.
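
As a rough illustration of how median-based quadrant lines split such a chart, the sketch below assigns each point to a quadrant. The function name `quadrants`, the labels, and the sample points are all hypothetical; the sample values are placeholders and not the costs or scores of any model listed above. How the page treats points that fall exactly on a median line is not stated, so the tie handling here is an assumption.

```python
from statistics import median


def quadrants(points):
    """Label each point by its side of the median cost and median score.

    points: dict mapping a name to a (cost, score) tuple.
    Returns a dict mapping each name to a quadrant label.
    """
    # The two quadrant lines sit at the median of each axis.
    cost_mid = median(cost for cost, _ in points.values())
    score_mid = median(score for _, score in points.values())

    labels = {}
    for name, (cost, score) in points.items():
        # Assumption: a point exactly on a line counts as low cost / high score.
        cost_side = "low cost" if cost <= cost_mid else "high cost"
        score_side = "high score" if score >= score_mid else "low score"
        labels[name] = f"{cost_side}, {score_side}"
    return labels


# Placeholder points, not real benchmark data.
sample = {
    "A": (1.0, 60.0),
    "B": (2.0, 90.0),
    "C": (8.0, 85.0),
    "D": (9.0, 55.0),
}
result = quadrants(sample)
for name, label in result.items():
    print(name, "->", label)
```

Because the lines are medians rather than fixed thresholds, each quadrant boundary moves whenever a model is added to or removed from the comparison.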