Vendors

Model creators (vendors) and how their models compare across the benchmark. Rows are sorted by best score, highest first.

| Vendor | Models | Avg Score | Best Score ▼ | Best Model |
| --- | --- | --- | --- | --- |
| MoonshotAI | 1 | 88.77% | 88.77% | MoonshotAI: Kimi K2.5 |
| Anthropic | 10 | 77.22% | 88.14% | Claude Opus 4.6 |
| OpenAI | 11 | 74.24% | 86.94% | o4 Mini High |
| Google | 7 | 68.94% | 85.69% | Gemini 3 Pro (Preview) |
| Z.AI | 5 | 80.58% | 85.57% | Z.AI GLM 4.7 |
| Meta | 8 | 62.10% | 75.12% | Llama 3.1 405B |
| DeepSeek | 1 | 71.45% | 71.45% | DeepSeek-V2 Chat |
| NVIDIA | 1 | 69.72% | 69.72% | Llama 3.1 Nemotron 70B |
| Envoid | 1 | 68.29% | 68.29% | Llama 3 TenyxChat-DaybreakStorywriter 70B |
| Writer | 1 | 66.98% | 66.98% | Writer: Palmyra X5 |
| Mistral AI | 6 | 54.91% | 65.49% | Mistral Large 2 |
| Nous Research | 2 | 62.21% | 64.82% | Hermes 3 405B |
| Qwen | 1 | 62.93% | 62.93% | Qwen 2.5 72B |
| Sao10K | 3 | 55.69% | 60.77% | Sao10K L3.1 70B Hanami x1 |
| Microsoft | 4 | 50.55% | 55.49% | Phi-3.5 Mini 128k |
| Inflection AI | 2 | 51.62% | 54.05% | Inflection 3 (Productivity) |
| Cohere | 1 | 50.95% | 50.95% | Cohere Command R+ (Aug. 2024) |
| Alpindale | 1 | 49.62% | 49.62% | Goliath 120B |
| TheDrummer | 1 | 47.32% | 47.32% | Rocinante 12B |
| NeverSleep | 1 | 44.80% | 44.80% | Lumimaid v0.2 8B |
| DavidAU | 1 | 38.49% | 38.49% | MN GRAND Gutenberg Lyra4 12B Madness |
| Gryphe | 1 | 35.63% | 35.63% | MythoMax 13B |
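
The table is ordered by best score, so a vendor with one strong model can sit above a vendor whose models are stronger on average. As a rough illustration, the sketch below re-ranks a handful of rows by average score instead. The `VendorRow` type, the `rank_by_average` helper, and the `min_models` filter are hypothetical names introduced for this example and are not part of the benchmark site; the numbers are copied from the first five rows of the table.

```python
from typing import NamedTuple


class VendorRow(NamedTuple):
    vendor: str
    models: int        # number of models evaluated for this vendor
    avg_score: float   # average score, in percent
    best_score: float  # best single-model score, in percent


# First five rows of the table above (not the full table).
ROWS = [
    VendorRow("MoonshotAI", 1, 88.77, 88.77),
    VendorRow("Anthropic", 10, 77.22, 88.14),
    VendorRow("OpenAI", 11, 74.24, 86.94),
    VendorRow("Google", 7, 68.94, 85.69),
    VendorRow("Z.AI", 5, 80.58, 85.57),
]


def rank_by_average(rows, min_models=1):
    """Return vendor names ordered by average score, highest first.

    min_models is a hypothetical filter that drops vendors with fewer
    evaluated models, since a single-model average equals its best score.
    """
    eligible = [r for r in rows if r.models >= min_models]
    ranked = sorted(eligible, key=lambda r: r.avg_score, reverse=True)
    return [r.vendor for r in ranked]


print(rank_by_average(ROWS))
print(rank_by_average(ROWS, min_models=5))
```

On these five rows, Z.AI moves from fifth place (by best score) to second place (by average), and to first place once single-model vendors are filtered out with `min_models=5`.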