Vendors

Model vendors and how their models compare on the benchmark, sorted by each vendor's best score (highest first).

Vendor	Models	Avg Score	Best Score	Best Model
Anthropic	14	88.10%	95.02%	Claude Opus 4.6 (Reasoning)
Qwen	15	87.58%	94.54%	Qwen3.6 Max Preview
Google	19	85.28%	94.37%	Gemini 3.1 Pro (Preview)
Z.AI	8	88.98%	94.37%	Z.AI GLM 5.1
xAI	9	87.17%	93.60%	Grok 4.3 (Reasoning)
OpenAI	29	85.80%	93.24%	GPT-5.4 (Reasoning)
MoonshotAI	2	91.67%	92.31%	MoonshotAI: Kimi K2.6
ByteDance Seed	4	83.92%	90.70%	ByteDance Seed 1.6
DeepSeek	9	84.32%	90.10%	DeepSeek V4 Pro (Reasoning)
Aion Labs	1	89.21%	89.21%	Aion 2.0
MiniMax	2	88.90%	89.10%	MiniMax M2.7
Xiaomi	2	86.20%	87.36%	Xiaomi MIMO v2.5 Pro
OpenRouter	3	85.69%	87.34%	Stealth: Hunter Alpha
Mistral AI	14	74.23%	85.43%	Mistral Large 3
NVIDIA	3	79.00%	84.56%	Nemotron 3 Super
Inception	2	81.67%	83.85%	Inception Mercury 2
Nous Research	2	77.71%	82.86%	Hermes 3 405B
Writer	1	79.57%	79.57%	Writer: Palmyra X5
Meta	2	70.88%	78.40%	Llama 3.1 70B
Arcee AI	2	72.12%	73.33%	Arcee AI: Trinity Large (Preview)
Microsoft	1	71.06%	71.06%	WizardLM 2 8x22b
Cohere	1	69.03%	69.03%	Cohere Command R+ (Aug. 2024)
Liquid AI	1	58.77%	58.77%	LFM2 24B
TheDrummer	1	54.54%	54.54%	Rocinante 12B
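
The table is ordered by best score, but ordering by average score can give a different picture, since a vendor with only a few strong models may average higher than one with a large catalogue. The sketch below parses five rows copied from the table and ranks them both ways. The helper names `parse_rows` and `rank_by` are illustrative and not part of the benchmark's tooling.

```python
import csv
import io

# Five rows copied from the table above, tab-separated.
TABLE = (
    "Vendor\tModels\tAvg Score\tBest Score\tBest Model\n"
    "Anthropic\t14\t88.10%\t95.02%\tClaude Opus 4.6 (Reasoning)\n"
    "Qwen\t15\t87.58%\t94.54%\tQwen3.6 Max Preview\n"
    "Z.AI\t8\t88.98%\t94.37%\tZ.AI GLM 5.1\n"
    "OpenAI\t29\t85.80%\t93.24%\tGPT-5.4 (Reasoning)\n"
    "MoonshotAI\t2\t91.67%\t92.31%\tMoonshotAI: Kimi K2.6\n"
)


def parse_rows(text):
    """Parse the tab-separated table into dicts with numeric scores."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    rows = []
    for r in reader:
        rows.append({
            "vendor": r["Vendor"],
            "models": int(r["Models"]),
            # Scores are printed as percentages; strip the sign to get a float.
            "avg": float(r["Avg Score"].rstrip("%")),
            "best": float(r["Best Score"].rstrip("%")),
            "best_model": r["Best Model"],
        })
    return rows


def rank_by(rows, key):
    """Return vendor names ordered by the given score column, highest first."""
    ordered = sorted(rows, key=lambda r: r[key], reverse=True)
    return [r["vendor"] for r in ordered]


rows = parse_rows(TABLE)
print("By best score:", rank_by(rows, "best"))
print("By avg score: ", rank_by(rows, "avg"))
```

Among these five rows, Anthropic leads on best score while MoonshotAI, with only two models listed, leads on average score.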