Dialogue tags

Various tasks related to dialogue tags in text.

Price-Performance Score Distribution (Top 20)

| Model | Score | Cost | Time |
|---|---|---|---|
| GPT-5 Mini | 83% | $0.0096 | 52.2s |
| Z.AI GLM 5 Turbo | 87% | $0.030 | 1.3m |
| Inception Mercury 2 | 71% | $0.0037 | 6.1s |
| Claude Opus 4.6 | 73% | $0.013 | 14.7s |
| GPT-5 | 84% | $0.049 | 1.4m |
| Qwen 3.5 27B | 66% | $0.023 | 1.8m |
| GPT-5.4 (Reasoning) | 62% | $0.027 | 36.1s |
| Gemini 3 Flash (Preview, Reasoning) | 61% | $0.017 | 30.3s |
| Claude Opus 4.6 (Reasoning) | 80% | $0.070 | 37.7s |
| Nemotron 3 Super | 74% | $0.0000 | 2.5m |
| Gemini 3.1 Pro (Preview) | 99% | $0.135 | 1.9m |
| GPT-5.1 | 63% | $0.027 | 47.2s |
| o4 Mini High | 79% | $0.045 | 1.8m |
| Claude Sonnet 4.6 | 67% | $0.0074 | 12.1s |
| GPT-5.2 | 60% | $0.024 | 34.2s |
| MiniMax M2.7 | 81% | $0.021 | 4.3m |
| MiniMax M2.5 | 74% | $0.015 | 2.9m |
| o4 Mini | 68% | $0.023 | 58.0s |
| GPT-5.4 (Reasoning, Low) | 55% | $0.016 | 21.6s |
| Claude Sonnet 4.6 (Reasoning) | 75% | $0.101 | 1.2m |

Cost vs Performance

Compares total cost for this test against the test score. Quadrant lines are drawn at the median values. Only models with available cost data are shown.
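The quadrant split described above can be sketched in a few lines. This is only an illustration: the rows are a handful of score and cost values copied from the tables on this page, so the medians are those of this sample rather than of the full chart, and the function name `quadrants`, the label wording, and the tie handling (a value equal to the median counts as "high score" or "low cost") are assumptions.

```python
from statistics import median

# (model, score as a fraction, cost in USD): a few rows copied from this page.
rows = [
    ("GPT-5 Mini", 0.83, 0.0096),
    ("Z.AI GLM 5 Turbo", 0.87, 0.030),
    ("Inception Mercury 2", 0.71, 0.0037),
    ("Claude Opus 4.6", 0.73, 0.013),
    ("GPT-5", 0.84, 0.049),
    ("Gemini 3.1 Pro (Preview)", 0.99, 0.135),
]

def quadrants(rows):
    """Label each model relative to the median score and median cost of `rows`."""
    score_mid = median(score for _, score, _ in rows)
    cost_mid = median(cost for _, _, cost in rows)
    labels = {}
    for model, score, cost in rows:
        # Tie handling is an assumption: the page does not say which side ties fall on.
        perf = "high score" if score >= score_mid else "low score"
        price = "low cost" if cost <= cost_mid else "high cost"
        labels[model] = f"{perf}, {price}"
    return labels

labels = quadrants(rows)
for model, label in labels.items():
    print(f"{model}: {label}")
```

Because the lines sit at the medians of whichever models are plotted, a model's quadrant can change when models are added to or removed from the chart.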

Most Stable Models (Top 20)

Ranked by stability (median × consistency).

| Model | Score | Consistency | Stability |
|---|---|---|---|
| Gemini 3.1 Pro (Preview) | 99% | 90% | 90% |
| GPT-5 Mini | 83% | 48% | 48% |
| Z.AI GLM 5 Turbo | 87% | 48% | 48% |
| o4 Mini High | 79% | 47% | 45% |
| MiniMax M2.7 | 81% | 45% | 45% |
| Claude Opus 4.6 (Reasoning) | 80% | 45% | 44% |
| GPT-5 | 84% | 43% | 43% |
| Claude Sonnet 4.6 (Reasoning) | 75% | 40% | 38% |
| Claude Opus 4.6 | 73% | 42% | 35% |
| Claude Sonnet 4.6 | 67% | 50% | 35% |
| Nemotron 3 Super | 74% | 36% | 34% |
| MiniMax M2.5 | 74% | 43% | 33% |
| Inception Mercury 2 | 71% | 38% | 31% |
| o4 Mini | 68% | 32% | 23% |
| Gemini 3.1 Flash Lite (Preview) | 51% | 45% | 23% |
| Claude Opus 4.5 | 64% | 34% | 19% |
| GPT-4o Mini (temp=0) | 55% | 37% | 19% |
| Grok 4 | 49% | 37% | 18% |
| GPT-4o, Aug. 6th (temp=0) | 57% | 37% | 18% |
| MoonshotAI: Kimi K2.5 | 64% | 22% | 18% |
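The stated ranking key, median × consistency, can be written out as a small function. This is a sketch under assumptions: the page does not say how the per-run scores are collected or how consistency is computed, so both are plain inputs here, the name `stability` is hypothetical, and the example values are made up rather than taken from the table. The Score column shown on this page may not be the per-run median, so multiplying it by Consistency need not reproduce the Stability column.

```python
from statistics import median

def stability(run_scores, consistency):
    """Return median(run_scores) * consistency, with both on a 0-1 scale."""
    if not 0.0 <= consistency <= 1.0:
        raise ValueError("consistency must be between 0 and 1")
    return median(run_scores) * consistency

# Hypothetical per-run scores for one model; not taken from this page.
example = stability([0.80, 0.90, 1.00], consistency=0.50)
print(f"{example:.2f}")
```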

Top Overall Models (Top 20)

Ranked by composite score (performance, cost, speed & stability).

| Model | Score | Cost | Speed | Stability |
|---|---|---|---|---|
| GPT-5 Mini | 83% | $0.0096 | 52.2s | 48% |
| Inception Mercury 2 | 71% | $0.0037 | 6.1s | 31% |
| Claude Opus 4.6 | 73% | $0.013 | 14.7s | 35% |
| Claude Sonnet 4.6 | 67% | $0.0074 | 12.1s | 35% |
| Z.AI GLM 5 Turbo | 87% | $0.030 | 1.3m | 48% |
| Gemini 3.1 Flash Lite (Preview) | 51% | $0.0006 | 2.7s | 23% |
| GPT-4o Mini (temp=0) | 55% | $0.0003 | 7.8s | 19% |
| Claude Opus 4.5 | 64% | $0.013 | 13.5s | 19% |
| GPT-4o, Aug. 6th (temp=0) | 57% | $0.0049 | 6.0s | 18% |
| GPT-5 | 84% | $0.049 | 1.4m | 43% |
| Claude Opus 4.6 (Reasoning) | 80% | $0.070 | 37.7s | 44% |
| Gemini 3.1 Pro (Preview) | 99% | $0.135 | 1.9m | 90% |
| GPT-4o, Aug. 6th (temp=1) | 51% | $0.0050 | 6.0s | 17% |
| Grok 4 Fast | 46% | $0.0004 | 6.4s | 18% |
| o4 Mini High | 79% | $0.045 | 1.8m | 45% |
| Gemini 3 Flash (Preview) | 47% | $0.0014 | 4.8s | 16% |
| Nemotron 3 Super | 74% | $0.0000 | 2.5m | 34% |
| Inception Mercury | 49% | $0.0005 | 8.1s | 11% |
| o4 Mini | 68% | $0.023 | 58.0s | 23% |
| GPT-4o Mini (temp=1) | 44% | $0.0003 | 9.4s | 16% |
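Results by test

Test groups: Ungrouped (Write unattributed dialogue), dialogue-200 (the three 200-word tests), dialogue-500 (the three 500-word tests). Rows are sorted by Total, highest first.

| Model | Total | Write unattributed dialogue | Write 200 words with 10% dialogue | Write 200 words with 50% dialogue | Write 200 words with 90% dialogue | Write 500 words with 30% dialogue | Write 500 words with 50% dialogue | Write 500 words with 70% dialogue |
|---|---|---|---|---|---|---|---|---|
| Gemini 3.1 Pro (Preview) | 99% | 100% | 100% | 100% | 99% | 96% | 100% | 100% |
| Z.AI GLM 5 Turbo | 87% | 100% | 100% | 98% | 69% | 97% | 95% | 50% |
| GPT-5 | 84% | 100% | 100% | 95% | 63% | 84% | 95% | 54% |
| GPT-5 Mini | 83% | 90% | 100% | 90% | 62% | 87% | 79% | 70% |
| MiniMax M2.7 | 81% | 86% | 97% | 96% | 89% | 71% | 72% | 54% |
| Claude Opus 4.6 (Reasoning) | 80% | 100% | 100% | 90% | 94% | 68% | 53% | 57% |
| o4 Mini High | 79% | 96% | 100% | 85% | 74% | 80% | 57% | 62% |
| Claude Sonnet 4.6 (Reasoning) | 75% | 96% | 99% | 99% | 72% | 82% | 50% | 26% |
| Nemotron 3 Super | 74% | 56% | 90% | 90% | 82% | 71% | 57% | 74% |
| MiniMax M2.5 | 74% | 86% | 76% | 98% | 62% | 74% | 63% | 55% |
| Claude Opus 4.6 | 73% | 100% | 82% | 79% | 94% | 43% | 71% | 41% |
| Inception Mercury 2 | 71% | 80% | 90% | 97% | 83% | 58% | 56% | 34% |
| o4 Mini | 68% | 72% | 97% | 89% | 55% | 73% | 67% | 22% |
| Claude Sonnet 4.6 | 67% | 92% | 87% | 77% | 77% | 51% | 49% | 37% |
| Qwen 3.5 27B | 66% | 100% | 96% | 98% | 93% | 46% | 16% | 10% |

Showing models 1–15 of 118 (page 1 of 8).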

dialogue-200

Write 200 words with 10% dialogue

Write 200 words with 50% dialogue

Write 200 words with 90% dialogue

dialogue-500

Write 500 words with 30% dialogue

Write 500 words with 50% dialogue

Write 500 words with 70% dialogue

Ungrouped

Write unattributed dialogue