Dialogue tags

Various tasks related to dialogue tags in text.

Write 200 words with 50% dialogue

0-shot, Creative writing, Rule following
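| Model | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Run 6 | Run 7 | Run 8 | Run 9 | Run 10 | Total |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| o4 Mini | 100% | 100% | 100% | 100% | 100% | 99% | 94% | 91% | 50% | 50% | 88% |
| o4 Mini High | 100% | 100% | 100% | 100% | 99% | 99% | 98% | 50% | 50% | 50% | 85% |
| MoonshotAI: Kimi K2.5 | 100% | 100% | 93% | 93% | 92% | 87% | 76% | 72% | 64% | 50% | 83% |
| GPT-4o, Aug. 6th (temp=0) | 100% | 100% | 99% | 90% | 88% | 86% | 71% | 71% | 64% | 51% | 82% |
| Claude Opus 4.6 | 100% | 100% | 100% | 100% | 93% | 93% | 51% | 51% | 50% | 49% | 79% |
| Gemini 2.5 Pro | 98% | 98% | 97% | 94% | 93% | 84% | 68% | 67% | 45% | 36% | 78% |
| Gemini 3 Pro (Preview) | 100% | 100% | 98% | 95% | 91% | 83% | 50% | 48% | 48% | 38% | 75% |
| Z.AI GLM 4.7 | 100% | 100% | 100% | 99% | 95% | 94% | 57% | 50% | 48% | 5% | 75% |
| Z.AI GLM 4.6 | 100% | 99% | 98% | 93% | 80% | 68% | 59% | 50% | 39% | 1% | 69% |
| Claude Opus 4.5 | 100% | 96% | 94% | 79% | 69% | 50% | 50% | 49% | 49% | 47% | 68% |
| GPT-4o, May 13th (temp=0) | 100% | 97% | 83% | 82% | 81% | 72% | 64% | 50% | 41% | 1% | 67% |
| GPT-4 Turbo | 95% | 95% | 91% | 88% | 71% | 63% | 55% | 50% | 41% | 19% | 67% |
| GPT-4o, Aug. 6th (temp=1) | 100% | 95% | 83% | 50% | 50% | 50% | 50% | 50% | 49% | 48% | 62% |
| Mistral Large | 100% | 99% | 88% | 86% | 80% | 59% | 57% | 51% | 0% | 0% | 62% |
| Claude Haiku 4.5 | 96% | 88% | 76% | 70% | 68% | 48% | 41% | 30% | 16% | 0% | 53% |
| Hermes 3 405B | 96% | 85% | 82% | 58% | 55% | 55% | 41% | 26% | 26% | 2% | 53% |
| Llama 3.1 8B | 100% | 99% | 75% | 62% | 49% | 49% | 48% | 35% | 3% | 0% | 52% |
| GPT-4o Mini (temp=1) | 50% | 50% | 50% | 50% | 50% | 50% | 50% | 50% | 49% | 48% | 50% |
| Goliath 120B | 100% | 99% | 55% | 53% | 48% | 47% | 44% | 43% | 1% | 1% | 49% |
| Cohere Command R+ (Apr. 2024) | 85% | 84% | 83% | 50% | 50% | 44% | 37% | 36% | 7% | 0% | 47% |
| GPT-4.1 | 50% | 50% | 50% | 50% | 50% | 50% | 50% | 50% | 49% | 14% | 46% |
| Claude 3 Haiku | 74% | 71% | 61% | 51% | 49% | 41% | 41% | 38% | 22% | 14% | 46% |
| Hermes 3 70B | 100% | 64% | 62% | 54% | 52% | 47% | 42% | 25% | 10% | 2% | 46% |
| Llama 3.1 405B | 89% | 89% | 50% | 50% | 49% | 43% | 41% | 34% | 5% | 2% | 45% |
| GPT-4o Mini (temp=0) | 50% | 50% | 50% | 50% | 49% | 48% | 43% | 43% | 34% | 30% | 45% |
| Phi-3 Mini 128k | 100% | 48% | 48% | 47% | 46% | 44% | 36% | 34% | 27% | 13% | 44% |
| MythoMax 13B | 80% | 73% | 52% | 50% | 50% | 48% | 43% | 40% | 4% | 0% | 44% |
| DeepSeek-V2 Chat | 50% | 50% | 50% | 50% | 50% | 50% | 48% | 48% | 43% | 0% | 44% |
| AI21 Jamba 1.5 Large | 99% | 94% | 57% | 51% | 50% | 48% | 16% | 15% | 7% | 0% | 44% |
| Llama 3.1 70B | 93% | 66% | 51% | 49% | 47% | 43% | 35% | 22% | 18% | 14% | 44% |
| Gemini 3 Flash (Preview) | 50% | 50% | 50% | 50% | 49% | 48% | 47% | 34% | 30% | 25% | 43% |
| GPT-4.1 Mini | 83% | 50% | 49% | 49% | 48% | 47% | 45% | 43% | 18% | 0% | 43% |
| GPT-4o, May 13th (temp=1) | 88% | 50% | 50% | 50% | 49% | 49% | 45% | 44% | 0% | 0% | 43% |
| GPT-4.1 Nano | 50% | 50% | 50% | 50% | 49% | 49% | 47% | 45% | 30% | 1% | 42% |
| MythoMist 7B | 92% | 78% | 51% | 50% | 50% | 50% | 41% | 0% | 0% | 0% | 41% |
| Mistral Medium | 79% | 59% | 55% | 51% | 50% | 50% | 28% | 23% | 13% | 0% | 41% |
| Llama 3.2 3B | 95% | 52% | 51% | 48% | 47% | 43% | 24% | 22% | 22% | 0% | 40% |
| Llama 3.2 90B (Vision) | 67% | 57% | 50% | 50% | 49% | 38% | 34% | 34% | 18% | 0% | 40% |
| Liquid: LFM 40B MoE | 99% | 77% | 50% | 50% | 44% | 35% | 22% | 4% | 0% | 0% | 38% |
| Hermes 2 Theta 8B | 94% | 83% | 50% | 49% | 47% | 44% | 10% | 1% | 0% | 0% | 38% |
| Claude 3.5 Haiku | 87% | 50% | 50% | 48% | 40% | 38% | 19% | 8% | 7% | 3% | 35% |
| Claude Opus 4 | 50% | 49% | 49% | 48% | 47% | 45% | 34% | 18% | 3% | 1% | 34% |
| Cohere Command R+ (Aug. 2024) | 71% | 50% | 50% | 49% | 42% | 40% | 33% | 2% | 0% | 0% | 34% |
| Z.AI GLM 4.7 Flash | 82% | 64% | 53% | 50% | 47% | 28% | 0% | 0% | 0% | 0% | 32% |
| Ministral 3B | 74% | 50% | 50% | 48% | 46% | 26% | 18% | 0% | 0% | 0% | 31% |
| Llama 3 70B | 50% | 50% | 50% | 45% | 43% | 30% | 22% | 10% | 7% | 0% | 31% |
| Llama 3 Euryale 70B v2.1 | 50% | 50% | 49% | 46% | 44% | 32% | 27% | 9% | 0% | 0% | 31% |
| Sao10K L3.1 70B Hanami x1 | 50% | 50% | 50% | 49% | 48% | 41% | 14% | 2% | 1% | 0% | 30% |
| Inflection 3 (Productivity) | 79% | 66% | 50% | 49% | 47% | 12% | 0% | 0% | 0% | 0% | 30% |
| Claude 3.5 Sonnet (new) | 84% | 48% | 45% | 41% | 39% | 18% | 14% | 7% | 4% | 3% | 30% |
| Claude 2.0 | 52% | 50% | 49% | 41% | 40% | 33% | 18% | 14% | 4% | 0% | 30% |
| Magnum 72B | 71% | 51% | 50% | 49% | 47% | 30% | 0% | 0% | 0% | 0% | 30% |
| Llama 3.2 11B (Vision) | 50% | 49% | 47% | 45% | 42% | 34% | 22% | 6% | 2% | 0% | 30% |
| Llama 3.2 1B | 86% | 47% | 46% | 45% | 42% | 25% | 1% | 0% | 0% | 0% | 29% |
| AI21 Jamba | 65% | 50% | 49% | 37% | 26% | 22% | 21% | 13% | 6% | 0% | 29% |
| Phi-3 Medium 128k | 54% | 50% | 49% | 49% | 36% | 24% | 18% | 8% | 0% | 0% | 29% |
| Claude Sonnet 4 | 84% | 57% | 53% | 37% | 33% | 14% | 6% | 2% | 0% | 0% | 29% |
| Mistral Small Creative | 79% | 52% | 50% | 49% | 46% | 3% | 1% | 0% | 0% | 0% | 28% |
| Gemini 2.5 Flash Lite | 75% | 50% | 49% | 47% | 41% | 4% | 0% | 0% | 0% | 0% | 27% |
| Gemma 2 27B | 50% | 50% | 48% | 44% | 35% | 26% | 7% | 5% | 1% | 0% | 27% |
| Qwen 2.5 72B | 50% | 50% | 49% | 49% | 26% | 13% | 10% | 7% | 5% | 0% | 26% |
| Claude 3.0 Sonnet | 50% | 45% | 44% | 41% | 26% | 19% | 18% | 3% | 2% | 0% | 25% |
| MN GRAND Gutenberg Lyra4 12B Madness | 66% | 50% | 50% | 29% | 27% | 15% | 4% | 4% | 0% | 0% | 25% |
| AI21 Jamba 1.5 Mini | 49% | 49% | 45% | 45% | 34% | 18% | 1% | 1% | 0% | 0% | 24% |
| Claude Sonnet 4.5 | 69% | 51% | 46% | 22% | 18% | 10% | 8% | 6% | 5% | 3% | 24% |
| Qwen 2 72B | 59% | 57% | 50% | 42% | 11% | 9% | 1% | 0% | 0% | 0% | 23% |
| Toppy M 7B | 50% | 50% | 50% | 49% | 28% | 1% | 1% | 0% | 0% | 0% | 23% |
| Claude 3.5 Sonnet | 46% | 43% | 40% | 34% | 27% | 18% | 15% | 2% | 1% | 0% | 23% |
| Phi-3.5 Mini 128k | 93% | 50% | 47% | 33% | 2% | 0% | 0% | 0% | 0% | 0% | 22% |
| Llama 3 TenyxChat-DaybreakStorywriter 70B | 50% | 47% | 41% | 29% | 26% | 14% | 10% | 5% | 2% | 0% | 22% |
| Qwen 2 7B | 50% | 50% | 49% | 49% | 10% | 2% | 0% | 0% | 0% | 0% | 21% |
| Mistral Nemo 12B Celeste | 50% | 48% | 42% | 41% | 8% | 7% | 6% | 3% | 0% | 0% | 21% |
| Inflection 3 (PI) | 98% | 59% | 17% | 14% | 13% | 1% | 0% | 0% | 0% | 0% | 20% |
| Rocinante 12B | 47% | 44% | 40% | 25% | 23% | 17% | 0% | 0% | 0% | 0% | 20% |
| Lumimaid v0.2 8B | 91% | 50% | 50% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 19% |
| EVA Qwen 2.5 14B | 50% | 50% | 43% | 41% | 0% | 0% | 0% | 0% | 0% | 0% | 18% |
| Claude 3.7 Sonnet | 45% | 41% | 34% | 18% | 18% | 14% | 5% | 3% | 3% | 0% | 18% |
| lzlv 70B | 50% | 50% | 48% | 21% | 5% | 3% | 2% | 0% | 0% | 0% | 18% |
| Mistral NeMO | 53% | 50% | 40% | 19% | 1% | 0% | 0% | 0% | 0% | 0% | 16% |
| Llama 3.1 Nemotron 70B | 50% | 39% | 36% | 17% | 15% | 1% | 0% | 0% | 0% | 0% | 16% |
| Writer: Palmyra X5 | 50% | 47% | 38% | 16% | 1% | 1% | 0% | 0% | 0% | 0% | 15% |
| Gemini Pro 1.5 | 50% | 48% | 43% | 3% | 0% | 0% | 0% | 0% | 0% | 0% | 14% |
| Magnum v2 72B | 86% | 50% | 4% | 3% | 0% | 0% | 0% | 0% | 0% | 0% | 14% |
| Gemini 2.5 Flash | 50% | 50% | 22% | 18% | 1% | 1% | 0% | 0% | 0% | 0% | 14% |
| Gemma 2 9B | 50% | 45% | 30% | 7% | 0% | 0% | 0% | 0% | 0% | 0% | 13% |
| Ministral 8B | 49% | 48% | 16% | 8% | 3% | 2% | 0% | 0% | 0% | 0% | 13% |
| WizardLM 2 8x22b | 40% | 28% | 21% | 20% | 8% | 3% | 1% | 0% | 0% | 0% | 12% |
| Llama 3.1 Euryale 70B v2.2 | 62% | 33% | 18% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 11% |
| Gemini Flash 1.5 | 49% | 34% | 18% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 10% |
| Z.AI GLM 4.5 | 34% | 18% | 12% | 10% | 7% | 3% | 2% | 0% | 0% | 0% | 9% |
| Mistral Large 2 | 49% | 14% | 3% | 1% | 0% | 0% | 0% | 0% | 0% | 0% | 7% |
| Fimbulvetr 11B v2 | 50% | 9% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 6% |
| Claude 2.1 | 50% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 5% |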
Average total across all models: 36.25%
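
The page does not say how the Total column is derived, but the figures are consistent with it being the mean of the ten run scores. The sketch below checks that reading against three rows copied from the table. The helper name `total_from_runs` is hypothetical, and since the displayed run scores are themselves rounded, a recomputed total can differ from the listed one by a point on some rows.

```python
# Run scores (percent) copied from three rows of the results table.
rows = {
    "o4 Mini": [100, 100, 100, 100, 100, 99, 94, 91, 50, 50],
    "Claude Opus 4": [50, 49, 49, 48, 47, 45, 34, 18, 3, 1],
    "Claude 2.1": [50, 0, 0, 0, 0, 0, 0, 0, 0, 0],
}


def total_from_runs(runs):
    """Arithmetic mean of the run scores (assumed aggregation, not confirmed by the page)."""
    return sum(runs) / len(runs)


for model, runs in rows.items():
    # Print the unrounded mean so it can be compared with the listed Total.
    print(f"{model}: {total_from_runs(runs):.1f}")
```

For these three rows the means round to the listed totals of 88%, 34%, and 5%.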