Vendors

Model creators and vendors, compared by how their models perform across the benchmark.

| Vendor | Models | Avg Score | Best Score ▼ | Best Model |
|---|---:|---:|---:|---|
| Anthropic | 13 | 87.23% | 95.02% | Claude Opus 4.6 (Reasoning) |
| Google | 13 | 84.01% | 94.37% | Gemini 3.1 Pro (Preview) |
| Z.AI | 6 | 89.06% | 94.27% | Z.AI GLM 5 Turbo |
| OpenAI | 25 | 85.08% | 93.24% | GPT-5.4 (Reasoning) |
| Qwen | 10 | 85.83% | 91.73% | Qwen 3.5 397B A17B |
| xAI | 5 | 87.83% | 91.49% | Grok 4.20 (Beta, Reasoning) |
| MoonshotAI | 1 | 91.04% | 91.04% | MoonshotAI: Kimi K2.5 |
| bytedance-seed | 4 | 83.92% | 90.70% | ByteDance Seed 1.6 |
| aion-labs | 1 | 89.21% | 89.21% | Aion 2.0 |
| minimax | 2 | 88.90% | 89.10% | MiniMax M2.7 |
| openrouter | 3 | 85.69% | 87.34% | Stealth: Hunter Alpha |
| Mistral AI | 14 | 74.23% | 85.43% | Mistral Large 3 |
| DeepSeek | 5 | 83.03% | 84.83% | DeepSeek-V2 Chat |
| NVIDIA | 3 | 79.00% | 84.56% | Nemotron 3 Super |
| inception | 2 | 81.67% | 83.85% | Inception Mercury 2 |
| Nous Research | 2 | 77.71% | 82.86% | Hermes 3 405B |
| Writer | 1 | 79.57% | 79.57% | Writer: Palmyra X5 |
| Meta | 2 | 70.88% | 78.40% | Llama 3.1 70B |
| arcee-ai | 2 | 72.12% | 73.33% | Arcee AI: Trinity Large (Preview) |
| Microsoft | 1 | 71.07% | 71.07% | WizardLM 2 8x22b |
| Cohere | 1 | 69.03% | 69.03% | Cohere Command R+ (Aug. 2024) |
| Liquid AI | 1 | 58.77% | 58.77% | LFM2 24B |
| TheDrummer | 1 | 54.55% | 54.55% | Rocinante 12B |
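The per-vendor columns above (model count, average score, best score, best model) can be derived from per-model results. A minimal sketch of that aggregation, using hypothetical vendor/model names and scores rather than the real leaderboard data:

```python
from collections import defaultdict

# Hypothetical per-model benchmark results: (vendor, model, score in %).
# These rows are illustrative only, not taken from the table above.
scores = [
    ("ExampleAI", "model-a", 90.0),
    ("ExampleAI", "model-b", 80.0),
    ("OtherLab", "model-c", 70.0),
]

def vendor_summary(rows):
    """Aggregate per-model scores into (model count, avg, best, best model) per vendor."""
    by_vendor = defaultdict(list)
    for vendor, model, score in rows:
        by_vendor[vendor].append((model, score))
    summary = {}
    for vendor, models in by_vendor.items():
        best_model, best = max(models, key=lambda m: m[1])
        avg = sum(s for _, s in models) / len(models)
        summary[vendor] = (len(models), avg, best, best_model)
    # Sort vendors by best score, descending, matching the table's sort order.
    return dict(sorted(summary.items(), key=lambda kv: kv[1][2], reverse=True))
```

Note that a vendor with a single model (e.g. MoonshotAI above) has identical average and best scores, which matches the table.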