Language Writing

Can the model generate text in different languages?

Character dialogue (Spanish) in a story

0-shot Language
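| Model | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Total |
|---|---|---|---|---|---|---|
| Gemini 3 Flash (Preview) | 100% | 100% | 100% | 100% | 100% | 100% |
| Z.AI GLM 4.5 | 100% | 100% | 100% | 100% | 100% | 100% |
| Z.AI GLM 4.7 | 100% | 100% | 100% | 100% | 90% | 98% |
| GPT-4.1 Nano | 100% | 100% | 100% | 100% | 86% | 97% |
| Llama 3.2 11B (Vision) | 100% | 100% | 100% | 93% | 90% | 97% |
| GPT-4.1 Mini | 100% | 100% | 100% | 92% | 90% | 96% |
| GPT-4o Mini (temp=1) | 100% | 100% | 94% | 94% | 93% | 96% |
| Hermes 3 405B | 100% | 100% | 100% | 92% | 88% | 96% |
| GPT-4o Mini (temp=0) | 100% | 100% | 93% | 92% | 92% | 95% |
| o4 Mini | 100% | 100% | 93% | 92% | 90% | 95% |
| Mistral Large | 100% | 100% | 100% | 88% | 86% | 95% |
| GPT-4 Turbo | 100% | 100% | 93% | 92% | 85% | 94% |
| Claude 3 Haiku | 100% | 100% | 90% | 89% | 88% | 93% |
| MoonshotAI: Kimi K2.5 | 100% | 94% | 93% | 91% | 88% | 93% |
| Claude 2.0 | 100% | 100% | 100% | 86% | 80% | 93% |
| DeepSeek-V2 Chat | 100% | 100% | 100% | 83% | 80% | 93% |
| Qwen 2.5 72B | 100% | 94% | 92% | 89% | 87% | 92% |
| Qwen 2 72B | 100% | 100% | 100% | 88% | 71% | 92% |
| Llama 3.1 Euryale 70B v2.2 | 100% | 100% | 100% | 100% | 58% | 92% |
| Claude Sonnet 4 | 100% | 93% | 93% | 86% | 85% | 91% |
| Cohere Command R+ (Apr. 2024) | 100% | 100% | 100% | 86% | 70% | 91% |
| Claude 2.1 | 100% | 100% | 100% | 86% | 67% | 90% |
| Llama 3.2 3B | 100% | 100% | 100% | 86% | 67% | 90% |
| Phi-3 Mini 128k | 100% | 100% | 100% | 100% | 50% | 90% |
| GPT-4.1 | 100% | 100% | 100% | 93% | 57% | 90% |
| Claude Opus 4.5 | 93% | 93% | 88% | 87% | 87% | 90% |
| Gemini 2.5 Flash | 100% | 100% | 92% | 91% | 64% | 90% |
| lzlv 70B | 100% | 90% | 88% | 86% | 83% | 89% |
| Claude Opus 4.6 | 91% | 91% | 91% | 91% | 82% | 89% |
| Hermes 3 70B | 100% | 100% | 91% | 80% | 75% | 89% |
| GPT-4o, Aug. 6th (temp=1) | 100% | 90% | 89% | 86% | 80% | 89% |
| Llama 3.1 405B | 100% | 90% | 88% | 88% | 78% | 89% |
| Mistral Medium | 100% | 94% | 85% | 80% | 80% | 88% |
| Inflection 3 (Productivity) | 100% | 88% | 85% | 84% | 82% | 88% |
| Claude Opus 4 | 93% | 93% | 86% | 85% | 79% | 87% |
| Phi-3 Medium 128k | 100% | 100% | 100% | 67% | 67% | 87% |
| Llama 3 TenyxChat-DaybreakStorywriter 70B | 100% | 100% | 100% | 70% | 63% | 87% |
| Claude Sonnet 4.5 | 88% | 88% | 88% | 86% | 82% | 86% |
| Claude 3.5 Sonnet | 89% | 88% | 88% | 84% | 82% | 86% |
| Claude 3.7 Sonnet | 95% | 89% | 88% | 81% | 77% | 86% |
| Llama 3 70B | 100% | 87% | 82% | 80% | 80% | 86% |
| o4 Mini High | 100% | 91% | 82% | 77% | 76% | 85% |
| Inflection 3 (PI) | 91% | 87% | 87% | 85% | 78% | 85% |
| Claude 3.0 Sonnet | 92% | 91% | 89% | 83% | 71% | 85% |
| Llama 3.2 90B (Vision) | 100% | 90% | 86% | 83% | 67% | 85% |
| Gemini 3 Pro (Preview) | 94% | 93% | 89% | 78% | 69% | 85% |
| AI21 Jamba | 100% | 100% | 86% | 80% | 55% | 84% |
| Z.AI GLM 4.6 | 95% | 92% | 83% | 79% | 67% | 83% |
| Gemini 2.5 Pro | 100% | 86% | 83% | 72% | 71% | 82% |
| GPT-4o, May 13th (temp=1) | 100% | 93% | 80% | 69% | 65% | 81% |
| GPT-4o, Aug. 6th (temp=0) | 89% | 88% | 88% | 78% | 63% | 81% |
| Claude 3.5 Sonnet (new) | 90% | 83% | 83% | 80% | 67% | 81% |
| Magnum 72B | 100% | 77% | 75% | 73% | 67% | 78% |
| GPT-4o, May 13th (temp=0) | 86% | 83% | 82% | 80% | 58% | 78% |
| Toppy M 7B | 86% | 83% | 82% | 77% | 58% | 77% |
| Llama 3.2 1B | 100% | 100% | 92% | 56% | 38% | 77% |
| AI21 Jamba 1.5 Large | 95% | 93% | 86% | 60% | 50% | 77% |
| Cohere Command R+ (Aug. 2024) | 100% | 90% | 89% | 53% | 45% | 76% |
| Llama 3.1 70B | 100% | 75% | 75% | 71% | 55% | 75% |
| Magnum v2 72B | 100% | 100% | 91% | 44% | 38% | 75% |
| Liquid: LFM 40B MoE | 100% | 82% | 75% | 57% | 50% | 73% |
| Phi-3.5 Mini 128k | 100% | 100% | 60% | 50% | 50% | 72% |
| AI21 Jamba 1.5 Mini | 100% | 91% | 67% | 56% | 46% | 72% |
| Claude Haiku 4.5 | 89% | 82% | 71% | 63% | 53% | 71% |
| Hermes 2 Theta 8B | 89% | 80% | 78% | 56% | 50% | 70% |
| Llama 3.1 8B | 100% | 89% | 83% | 79% | 0% | 70% |
| Gemma 2 27B | 83% | 82% | 64% | 62% | 54% | 69% |
| Qwen 2 7B | 100% | 78% | 70% | 50% | 43% | 68% |
| Llama 3 Euryale 70B v2.1 | 100% | 78% | 70% | 50% | 40% | 68% |
| Goliath 120B | 100% | 73% | 71% | 50% | 38% | 66% |
| Sao10K L3.1 70B Hanami x1 | 100% | 71% | 60% | 50% | 50% | 66% |
| Z.AI GLM 4.7 Flash | 100% | 100% | 67% | 64% | 0% | 66% |
| MythoMax 13B | 100% | 67% | 60% | 50% | 50% | 65% |
| Mistral Large 2 | 100% | 90% | 85% | 50% | 0% | 65% |
| Gemini Pro 1.5 | 88% | 73% | 67% | 67% | 29% | 64% |
| Mistral NeMO | 100% | 92% | 86% | 40% | 0% | 63% |
| WizardLM 2 8x22b | 92% | 80% | 79% | 67% | 0% | 63% |
| Claude 3.5 Haiku | 80% | 64% | 58% | 55% | 50% | 61% |
| Gemini 2.5 Flash Lite | 91% | 89% | 73% | 50% | 0% | 61% |
| Writer: Palmyra X5 | 100% | 100% | 100% | 0% | 0% | 60% |
| Lumimaid v0.2 8B | 100% | 100% | 58% | 33% | 0% | 58% |
| Gemma 2 9B | 79% | 69% | 44% | 43% | 43% | 55% |
| Mistral Nemo 12B Celeste | 91% | 91% | 50% | 33% | 0% | 53% |
| Ministral 3B | 86% | 55% | 50% | 43% | 23% | 51% |
| MN GRAND Gutenberg Lyra4 12B Madness | 100% | 79% | 64% | 0% | 0% | 48% |
| Rocinante 12B | 64% | 50% | 45% | 38% | 33% | 46% |
| Fimbulvetr 11B v2 | 93% | 64% | 58% | 0% | 0% | 43% |
| Ministral 8B | 86% | 73% | 48% | 0% | 0% | 41% |
| MythoMist 7B | 76% | 60% | 50% | 20% | 0% | 41% |
| Gemini Flash 1.5 | 88% | 60% | 57% | 0% | 0% | 41% |
| EVA Qwen 2.5 14B | 100% | 50% | 0% | 0% | 0% | 30% |
| Mistral Small Creative | 50% | 50% | 0% | 0% | 0% | 20% |
| Llama 3.1 Nemotron 70B | 50% | 0% | 0% | 0% | 0% | 10% |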

Average Total across all models: 76.99%
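
The page does not say how the Total column is derived. It looks consistent with the mean of the five run scores, and the sketch below checks that assumption on three rows copied from the table. Because the published run scores are already rounded, the recomputed mean can differ from the listed Total by about a point. The helper name `mean_of_runs` is illustrative and not part of the benchmark.

```python
# Assumption (not stated on the page): Total is the mean of the five runs.
# Each entry maps a model to (its five run scores, its listed Total),
# all copied from the table as whole percentages.
rows = {
    "Z.AI GLM 4.7": ([100, 100, 100, 100, 90], 98),
    "Claude Opus 4.5": ([93, 93, 88, 87, 87], 90),
    "Gemini Pro 1.5": ([88, 73, 67, 67, 29], 64),
}


def mean_of_runs(runs):
    """Return the arithmetic mean of a list of run scores."""
    return sum(runs) / len(runs)


for model, (runs, listed_total) in rows.items():
    recomputed = mean_of_runs(runs)
    print(f"{model}: recomputed {recomputed:.1f}%, listed {listed_total}%")
```

If the assumption holds, each recomputed value should land within roughly one point of the listed Total.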