Vendors

Model creators (vendors) and how their models compare across the benchmark, sorted by each vendor's best score.

| Vendor | Models | Avg Score | Best Score ▼ | Best Model |
|---|---|---|---|---|
| Anthropic | 13 | 87.23% | 95.02% | Claude Opus 4.6 (Reasoning) |
| Google | 13 | 84.01% | 94.37% | Gemini 3.1 Pro (Preview) |
| OpenAI | 19 | 85.95% | 93.24% | GPT-5.4 (Reasoning) |
| Qwen | 8 | 86.99% | 91.73% | Qwen 3.5 397B A17B |
| Z.AI | 5 | 88.02% | 91.23% | Z.AI GLM 5 |
| MoonshotAI | 1 | 91.04% | 91.04% | MoonshotAI: Kimi K2.5 |
| bytedance-seed | 4 | 83.92% | 90.70% | ByteDance Seed 1.6 |
| xAI | 3 | 87.94% | 89.55% | Grok 4.1 Fast |
| aion-labs | 1 | 89.21% | 89.21% | Aion 2.0 |
| minimax | 1 | 88.71% | 88.71% | Minimax M2.5 |
| openrouter | 3 | 85.68% | 87.33% | Stealth: Hunter Alpha |
| Mistral AI | 12 | 73.37% | 85.43% | Mistral Large 3 |
| DeepSeek | 5 | 83.03% | 84.83% | DeepSeek-V2 Chat |
| NVIDIA | 3 | 79.02% | 84.56% | Nemotron 3 Super |
| inception | 2 | 81.67% | 83.85% | Inception Mercury 2 |
| Nous Research | 2 | 77.71% | 82.86% | Hermes 3 405B |
| Writer | 1 | 79.57% | 79.57% | Writer: Palmyra X5 |
| Meta | 2 | 70.88% | 78.40% | Llama 3.1 70B |
| arcee-ai | 2 | 72.12% | 73.33% | Arcee AI: Trinity Large (Preview) |
| Microsoft | 1 | 71.07% | 71.07% | WizardLM 2 8x22b |
| Cohere | 1 | 69.03% | 69.03% | Cohere Command R+ (Aug. 2024) |
| Liquid AI | 1 | 58.77% | 58.77% | LFM2 24B |
| TheDrummer | 1 | 54.55% | 54.55% | Rocinante 12B |