Language Writing

Can the model generate text in different languages?

Price-Performance Score Distribution (Top 20)

Model                                Score   Cost      Time
Stealth: Aurora Alpha                100%    n/a       2.0s
Inception Mercury                    96%     $0.0002   1.5s
GPT-4.1 Nano                         93%     $0.0001   4.0s
GPT-4o Mini (temp=0)                 100%    $0.0003   4.8s
Mistral NeMO                         67%     $0.0001   4.3s
Inception Mercury 2                  100%    $0.0006   1.4s
GPT-4.1 Mini                         99%     $0.0004   3.4s
GPT-4o Mini (temp=1)                 100%    $0.0003   5.6s
Arcee AI: Trinity Mini               81%     $0.0002   15.9s
Claude 3 Haiku                       81%     $0.0007   3.8s
Grok 4.3                             94%     $0.0009   3.8s
Gemini 3.1 Flash Lite (Preview)      95%     $0.0011   3.7s
Gemini 3.1 Flash Lite                97%     $0.0011   6.2s
Gemini 3.1 Flash Lite (Reasoning)    98%     $0.0011   5.3s
Nemotron 3 Nano                      95%     $0.0002   10.8s
DeepSeek V4 Flash (Reasoning)        90%     $0.0002   20.8s
Nemotron 3 Super                     98%     $0.0000   21.7s
DeepSeek V4 Flash                    87%     $0.0002   12.4s
Mistral Small 3.2 24B                71%     $0.0003   11.0s
DeepSeek-V2 Chat                     100%    $0.0001   16.1s

Cost vs Performance

Compares total cost for this test against the test score. Quadrant lines are drawn at the median values. Only models with available cost data are shown.

14 low-scoring outliers hidden: LFM2 24B (64.3%), WizardLM 2 8x22b (61.1%), Ministral 3B (59.5%), Ministral 8B (52.8%), Rocinante 12B (51.9%), Mistral Medium 3.1 (49.0%), Mistral Small 4 (48.9%), Ministral 3 8B (47.9%), Qwen3 235B A22B Instruct 2507 (46.7%), Writer: Palmyra X5 (43.2%), Ministral 3 3B (36.2%), Mistral Small Creative (33.7%), Llama 3.1 Nemotron 70B (33.6%), Ministral 3 14B (10.0%).
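
The quadrant split described above can be sketched in a few lines. This is a minimal illustration, not the site's implementation: the function name `quadrants`, the label strings, and the tie handling (a value exactly on the median counts as "low cost" or "high score") are assumptions. The sample rows are taken from the Price-Performance table, so the medians here are computed over that small sample rather than over all models on the chart.

```python
from statistics import median

def quadrants(models):
    """Label each (name, cost_usd, score) row by its side of the median lines.

    Rows whose cost is None are skipped, mirroring "only models with
    available cost data are shown". Tie handling (<= / >=) is an assumption.
    """
    priced = [m for m in models if m[1] is not None]
    cost_mid = median(c for _, c, _ in priced)
    score_mid = median(s for _, _, s in priced)
    labels = {}
    for name, cost, score in priced:
        cost_side = "low cost" if cost <= cost_mid else "high cost"
        score_side = "high score" if score >= score_mid else "low score"
        labels[name] = f"{cost_side}, {score_side}"
    return labels

# Sample rows from the Price-Performance table (scores as fractions).
sample = [
    ("Stealth: Aurora Alpha", None, 1.00),   # no cost data, so it is skipped
    ("GPT-4.1 Nano", 0.0001, 0.93),
    ("Mistral NeMO", 0.0001, 0.67),
    ("Inception Mercury 2", 0.0006, 1.00),
    ("Claude 3 Haiku", 0.0007, 0.81),
    ("Gemini 3.1 Flash Lite", 0.0011, 0.97),
]
result = quadrants(sample)
for name, label in result.items():
    print(f"{name}: {label}")
```

Because the lines sit at the medians, roughly half of the plotted models fall on each side of each line regardless of how cheap or accurate the field is overall; the quadrants are relative, not absolute, judgments.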

Most Stable Models (Top 20)

Ranked by stability (median score × consistency).

Model                            Score   Consistency   Stability
Qwen3.6 Max Preview              100%    100%          100%
Grok 4.3 (Reasoning)             100%    100%          100%
Claude Sonnet 4.6                100%    100%          100%
o4 Mini                          100%    100%          100%
Gemma 4 31B                      100%    100%          100%
Gemini 3 Flash (Preview)         100%    100%          100%
DeepSeek-V2 Chat                 100%    100%          100%
Stealth: Aurora Alpha            100%    100%          100%
GPT-4o, Aug. 6th (temp=0)        100%    100%          100%
GPT-4o Mini (temp=1)             100%    100%          100%
GPT-4o Mini (temp=0)             100%    100%          100%
GPT-5.4 Mini (Reasoning, Low)    100%    99%           99%
Z.AI GLM 5 Turbo                 100%    98%           98%
GPT-5.5 (Reasoning)              99%     98%           98%
Claude Opus 4.7                  100%    98%           98%
Z.AI GLM 4.5                     100%    97%           97%
o4 Mini High                     100%    97%           97%
Inception Mercury 2              100%    96%           96%
Claude Opus 4.5                  99%     96%           96%
GPT-5.5                          98%     97%           96%
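
As a rough sketch of the stability ranking (median × consistency), the product can be computed and sorted as follows. The function name `stability` and the tuple layout are illustrative assumptions; how the site derives consistency itself is not stated in the source, so it is treated here as a given input.

```python
def stability(median_score: float, consistency: float) -> float:
    """Stability as stated in the ranking note: median score times consistency.

    Both inputs are fractions in [0, 1]; how consistency is measured is not
    specified in the source and is simply taken as an input here.
    """
    return median_score * consistency

# (model, median score, consistency), values taken from the stability table.
rows = [
    ("Z.AI GLM 4.5", 1.00, 0.97),
    ("Qwen3.6 Max Preview", 1.00, 1.00),
    ("GPT-5.4 Mini (Reasoning, Low)", 1.00, 0.99),
]

# Highest stability first, matching the table's ordering.
ranked = sorted(rows, key=lambda r: stability(r[1], r[2]), reverse=True)
for name, med, cons in ranked:
    print(f"{name}: {stability(med, cons):.0%}")
```

The published percentages are rounded, so multiplying the displayed values does not always reproduce the displayed stability exactly: GPT-5.5 shows 98% × 97%, which is about 95%, against a listed stability of 96%.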

Top Overall Models (Top 20)

Ranked by composite score (performance, cost, speed, and stability).

Model                                Score   Cost      Speed    Stability
Stealth: Aurora Alpha                100%    n/a       2.0s     100%
GPT-4o Mini (temp=0)                 100%    $0.0003   4.8s     100%
GPT-4o Mini (temp=1)                 100%    $0.0003   5.6s     100%
Inception Mercury 2                  100%    $0.0006   1.4s     96%
Gemini 3 Flash (Preview)             100%    $0.0020   5.6s     100%
GPT-5.4 Mini (Reasoning, Low)        100%    $0.0022   3.5s     99%
DeepSeek-V2 Chat                     100%    $0.0001   16.1s    100%
GPT-4.1 Mini                         99%     $0.0004   3.4s     93%
GPT-4o, Aug. 6th (temp=0)            100%    $0.0052   6.1s     100%
Z.AI GLM 4.5                         100%    $0.0013   14.5s    97%
Z.AI GLM 5 Turbo                     100%    $0.0037   14.7s    98%
Hermes 3 405B                        99%     $0.0000   21.0s    94%
Gemma 4 31B                          100%    $0.0003   32.1s    100%
GPT-5.4 Mini                         97%     $0.0020   2.5s     87%
GPT-4o, Aug. 6th (temp=1)            99%     $0.0056   6.5s     94%
GPT-OSS 120B                         99%     $0.0003   28.0s    96%
Gemini 3.1 Flash Lite (Reasoning)    98%     $0.0011   5.3s     84%
Nemotron 3 Super                     98%     $0.0000   21.7s    91%
o4 Mini                              100%    $0.0071   16.7s    100%
GPT-5.4 Nano (Reasoning)             98%     $0.0016   6.1s     83%

Per-test scores for "Character dialogue (<language>) in a story", sorted by total (descending):

Model                            Total   Spanish   French   German   Italian   Hindi
Qwen3.6 Max Preview              100%    100%      100%     100%     100%      100%
Grok 4.3 (Reasoning)             100%    100%      100%     100%     100%      100%
Claude Sonnet 4.6                100%    100%      100%     100%     100%      100%
o4 Mini                          100%    100%      100%     100%     100%      100%
Gemma 4 31B                      100%    100%      100%     100%     100%      100%
Gemini 3 Flash (Preview)         100%    100%      100%     100%     100%      100%
DeepSeek-V2 Chat                 100%    100%      100%     100%     100%      100%
Stealth: Aurora Alpha            100%    100%      100%     100%     100%      100%
GPT-4o, Aug. 6th (temp=0)        100%    100%      100%     100%     100%      100%
GPT-4o Mini (temp=1)             100%    100%      100%     100%     100%      100%
GPT-4o Mini (temp=0)             100%    100%      100%     100%     100%      100%
GPT-5.4 Mini (Reasoning, Low)    100%    100%      100%     99%      100%      100%
Z.AI GLM 5 Turbo                 100%    100%      100%     100%     99%       100%
Z.AI GLM 4.5                     100%    100%      100%     100%     98%       100%
Claude Opus 4.7                  100%    99%       100%     99%      100%      100%
Showing models 1–15 of 147 (page 1 of 10).

Character dialogue (Spanish) in a story

Character dialogue (French) in a story

Character dialogue (German) in a story

Character dialogue (Italian) in a story

Character dialogue (Hindi) in a story