Literary fiction: old friends reunite

Bad Writing Habits

Detects common prose-quality anti-patterns in AI-generated creative writing, including passive voice, past progressive overuse, weak dialogue tags, filter words, purple prose, clichés, and AI-isms (words, adverbs, and names), among others.
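
As a rough illustration of what two of these checks might look like, here is a minimal Python sketch of a filter-word density measure and a past progressive counter. The benchmark's actual detectors, word lists, and scoring thresholds are not given on this page, so the `FILTER_WORDS` list, the regular expression, and both function names are assumptions made for the example.

```python
import re

# Hypothetical word list for illustration; the benchmark's real list is not published here.
FILTER_WORDS = {"saw", "heard", "felt", "noticed", "realized", "seemed", "watched", "wondered"}

# "was" or "were" followed by a word ending in -ing, e.g. "was walking".
# This is a crude heuristic and will also match phrases such as "was nothing".
PAST_PROGRESSIVE = re.compile(r"\b(?:was|were)\s+\w+ing\b", re.IGNORECASE)


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def filter_word_density(text: str) -> float:
    """Return the number of filter words per 100 words (0.0 for empty text)."""
    words = tokenize(text)
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in FILTER_WORDS)
    return 100.0 * hits / len(words)


def past_progressive_count(text: str) -> int:
    """Count was/were + -ing constructions in the text."""
    return len(PAST_PROGRESSIVE.findall(text))


sample = "She was walking home when she noticed the old cafe. He felt tired."
print(past_progressive_count(sample))
print(round(filter_word_density(sample), 2))
```

A real evaluator would presumably turn raw counts like these into a 0 to 100 score; how that mapping works is not stated in the source.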

Creative Writing Hallucination

Performance Score Distribution (Top 20)

Model	Score
Claude Opus 4	92%
GPT-5.4	91%
Writer: Palmyra X5	90%
Qwen3 235B A22B Instruct 2507	89%
Grok 4.20 (Reasoning)	88%
GPT-5.5 (Reasoning, Low)	88%
GPT-5.5	88%
DeepSeek V4 Pro	88%
Claude Sonnet 4	88%
Claude Opus 4.7	88%
GPT-5.4 (Reasoning, Low)	88%
Grok 4.20	88%
Claude Opus 4.8 (Reasoning, Low)	88%
GPT-5.4 Mini	88%
Z.AI GLM 5	88%
Mistral Medium 3.1	87%
GPT-5.5 (Reasoning)	87%
Claude Opus 4.8 (Reasoning)	87%
DeepSeek V4 Pro (Reasoning)	87%
Qwen 3.5 397B A17B	87%

Price-Performance Score Distribution (Top 20)

Model	Score	Cost	Time
Qwen3 235B A22B Instruct 2507	89%	$0.0013	1.2m
Writer: Palmyra X5	90%	$0.014	23.9s
Mistral Medium 3.1	87%	$0.0052	38.3s
Grok 4.20	88%	$0.011	48.0s
Mistral Small 4 (Reasoning)	85%	$0.0025	29.1s
Qwen 3 32B	86%	$0.0019	1.7m
DeepSeek V3 (2025-03-24)	85%	$0.0018	38.5s
GPT-5.4 Mini	88%	$0.015	16.1s
DeepSeek V4 Pro	88%	$0.0043	49.6s
Z.AI GLM 5	88%	$0.011	1.1m
Hermes 3 405B	85%	$0.0054	35.9s
MiniMax M2.7	86%	$0.0040	1.3m
ByteDance Seed 1.6 Flash	84%	$0.0014	27.7s
Mistral Small 4	84%	$0.0022	26.2s
GPT-5.4 Nano (Reasoning, Low)	84%	$0.0044	18.2s
GPT-5.4 Nano	84%	$0.0051	18.4s
GPT-5.4 Mini (Reasoning)	87%	$0.030	32.6s
Z.AI GLM 5.2 (Reasoning, High)	86%	$0.018	1.4m
GPT-5.4 Mini (Reasoning, Low)	86%	$0.015	15.7s
Mistral Large 2	86%	$0.018	32.8s

Most Stable Models (Top 20)

Model	Score	Consistency	Stability
Claude Opus 4	92%	96%	89%
GPT-5.5	88%	99%	87%
Qwen3 235B A22B Instruct 2507	89%	96%	86%
Grok 4.20 (Reasoning)	88%	98%	86%
Claude Opus 4.8 (Reasoning, Low)	88%	98%	86%
Writer: Palmyra X5	90%	96%	86%
GPT-5.4	91%	94%	86%
GPT-5.4 (Reasoning, Low)	88%	97%	85%
GPT-5.5 (Reasoning, Low)	88%	97%	85%
GPT-5.4 Mini (Reasoning)	87%	97%	85%
Z.AI GLM 5	88%	97%	85%
DeepSeek V4 Pro (Reasoning)	87%	96%	84%
GPT-5.4 (Reasoning)	87%	96%	84%
Grok 4.20	88%	94%	84%
Qwen 3.5 397B A17B	87%	95%	84%
Claude Sonnet 4.6 (Reasoning)	85%	97%	83%
GPT-5.5 (Reasoning)	87%	96%	83%
GPT-5.4 Mini	88%	96%	83%
Claude Opus 4.7	88%	94%	83%
o4 Mini	84%	99%	83%

Top Overall Models (Top 20)

Model	Score	Cost	Speed	Stability
Writer: Palmyra X5	90%	$0.014	23.9s	86%
Qwen3 235B A22B Instruct 2507	89%	$0.0013	1.2m	86%
GPT-5.4 Mini	88%	$0.015	16.1s	83%
Grok 4.20	88%	$0.011	48.0s	84%
GPT-5.4	91%	$0.049	1.4m	86%
Z.AI GLM 5	88%	$0.011	1.1m	85%
Grok 4.20 (Reasoning)	88%	$0.021	1.8m	86%
GPT-5.4 Mini (Reasoning)	87%	$0.030	32.6s	85%
DeepSeek V4 Pro	88%	$0.0043	49.6s	81%
Mistral Medium 3.1	87%	$0.0052	38.3s	81%
Mistral Small 4 (Reasoning)	85%	$0.0025	29.1s	82%
Mistral Large 2	86%	$0.018	32.8s	82%
GPT-5.4 Mini (Reasoning, Low)	86%	$0.015	15.7s	81%
MiniMax M2.7	86%	$0.0040	1.3m	83%
DeepSeek V3 (2025-03-24)	85%	$0.0018	38.5s	81%
GPT-5.4 Nano	84%	$0.0051	18.4s	81%
o4 Mini	84%	$0.014	25.2s	83%
Claude Opus 4.8 (Reasoning, Low)	88%	$0.086	41.5s	86%
GPT-5.4 (Reasoning, Low)	88%	$0.060	1.4m	85%
Qwen 3 32B	86%	$0.0019	1.7m	82%

Rank	Model	Avg. Cost	Avg. Time	Stability	# 1	# 2	# 3	# 4	# 5	Total
71	Claude Opus 4	$0.250	1.5m	89%	94	94	92	90	89	92%
5	GPT-5.4	$0.049	1.4m	86%	94	93	91	89	86	91%
1	Writer: Palmyra X5	$0.014	23.9s	86%	93	90	89	89	88	90%
2	Qwen3 235B A22B Instruct 2507	$0.0013	1.2m	86%	92	91	90	89	85	89%
7	Grok 4.20 (Reasoning)	$0.021	1.8m	86%	90	88	88	88	87	88%
56	GPT-5.5 (Reasoning, Low)	$0.129	1.6m	85%	90	89	88	88	86	88%
51	GPT-5.5	$0.132	1.8m	87%	89	88	88	88	87	88%
9	DeepSeek V4 Pro	$0.0043	49.6s	81%	92	92	86	86	84	88%
22	Claude Sonnet 4	$0.043	51.6s	81%	93	89	86	86	86	88%
31	Claude Opus 4.7	$0.089	31.6s	83%	93	88	88	86	85	88%
19	GPT-5.4 (Reasoning, Low)	$0.060	1.4m	85%	90	89	88	87	85	88%
4	Grok 4.20	$0.011	48.0s	84%	91	90	89	85	83	88%
18	Claude Opus 4.8 (Reasoning, Low)	$0.086	41.5s	86%	89	88	88	87	86	88%
3	GPT-5.4 Mini	$0.015	16.1s	83%	91	88	87	86	86	88%
6	Z.AI GLM 5	$0.011	1.1m	85%	90	89	87	87	85	88%
10	Mistral Medium 3.1	$0.0052	38.3s	81%	92	88	88	87	81	87%
86	GPT-5.5 (Reasoning)	$0.147	1.9m	83%	91	88	87	86	85	87%
43	Claude Opus 4.8 (Reasoning)	$0.086	40.6s	82%	92	88	88	85	83	87%
61	DeepSeek V4 Pro (Reasoning)	$0.026	4.9m	84%	89	88	88	85	84	87%
44	Qwen 3.5 397B A17B	$0.012	3.7m	84%	89	89	88	84	84	87%
89	GPT-5.4 (Reasoning)	$0.119	3.4m	84%	89	89	88	85	84	87%
50	Claude Opus 4.5	$0.082	51.4s	81%	90	90	86	85	83	87%
8	GPT-5.4 Mini (Reasoning)	$0.030	32.6s	85%	88	87	87	87	84	87%
24	Z.AI GLM 5.2 (Reasoning, High)	$0.018	1.4m	82%	90	88	86	86	82	86%
20	Qwen 3 32B	$0.0019	1.7m	82%	90	87	87	86	81	86%
28	Mistral Large 3	$0.0051	40.8s	78%	94	86	85	84	82	86%
34	Claude Sonnet 4.5	$0.048	47.9s	80%	90	90	86	83	82	86%
132	MoonshotAI: Kimi K2.6	$0.083	8.1m	79%	90	90	85	84	81	86%
70	Z.AI GLM 5.1	$0.024	3.4m	79%	91	88	85	84	81	86%
13	GPT-5.4 Mini (Reasoning, Low)	$0.015	15.7s	81%	91	85	85	85	84	86%
12	Mistral Large 2	$0.018	32.8s	82%	89	86	86	86	82	86%
40	Claude Sonnet 4.6	$0.041	39.3s	79%	90	88	85	83	82	86%
64	GPT-5	$0.062	2.5m	82%	90	86	85	84	84	86%
14	MiniMax M2.7	$0.0040	1.3m	83%	87	87	86	86	82	86%
32	MiniMax M2.5	$0.0044	2.2m	82%	89	87	87	83	81	85%
75	Qwen3.6 Max Preview	$0.049	3.1m	80%	90	88	87	81	81	85%
26	Xiaomi MIMO v2.5 Pro	$0.011	1.1m	81%	89	87	86	84	81	85%
47	Aion 3.0	$0.041	1.7m	82%	88	87	87	84	80	85%
11	Mistral Small 4 (Reasoning)	$0.0025	29.1s	82%	88	87	86	85	80	85%
33	Qwen 3.5 Plus (2026-04-20)	$0.018	1.7m	82%	87	87	85	83	82	85%
63	Claude Opus 4.6	$0.088	1.1m	81%	88	87	86	82	81	85%
15	DeepSeek V3 (2025-03-24)	$0.0018	38.5s	81%	87	87	85	85	80	85%
99	Claude Opus 4.6 (Reasoning)	$0.114	1.5m	78%	90	87	84	83	81	85%
80	MoonshotAI: Kimi K2.5	$0.024	4.3m	81%	87	87	85	84	80	85%
62	GPT-5.1	$0.055	2.1m	81%	88	86	84	83	83	85%
36	Hermes 3 405B	$0.0054	35.9s	77%	93	86	86	84	76	85%
41	Aion 3.0 Mini	$0.0068	1.1m	78%	89	88	84	84	78	85%
131	MiniMax M3	$0.013	8.7m	76%	93	85	84	83	79	85%
88	Claude Sonnet 4.6 (Reasoning)	$0.118	2.0m	83%	86	86	85	84	83	85%
38	DeepSeek V4 Flash	$0.0007	22.5s	75%	93	85	83	81	81	85%
27	Claude Sonnet 5	$0.035	35.4s	83%	86	86	86	83	82	85%
35	Claude Sonnet 5 (Reasoning, Low)	$0.049	53.2s	83%	86	86	85	83	83	84%
49	Qwen 3.6 Flash	$0.012	44.6s	77%	92	85	84	82	78	84%
16	GPT-5.4 Nano	$0.0051	18.4s	81%	87	85	84	83	82	84%
48	Grok 4.3 (Reasoning)	$0.020	2.1m	82%	86	85	84	83	83	84%
21	Grok 4.3	$0.0088	27.4s	81%	87	85	84	84	82	84%
72	Claude Opus 4.7 (Reasoning)	$0.102	35.9s	80%	88	87	86	81	79	84%
58	GPT-5.2	$0.057	1.5m	82%	85	85	84	83	83	84%
23	GPT-5.4 Nano (Reasoning, Low)	$0.0044	18.2s	80%	87	86	84	82	81	84%
55	ByteDance Seed 2.0 Lite	$0.012	2.0m	80%	87	86	85	81	81	84%
54	Z.AI GLM 5 Turbo	$0.011	30.3s	76%	89	87	82	82	79	84%
25	GPT-5.4 Nano (Reasoning)	$0.0059	25.8s	81%	86	86	84	82	81	84%
60	Claude Sonnet 5 (Reasoning)	$0.037	55.9s	79%	87	86	83	81	81	84%
17	o4 Mini	$0.014	25.2s	83%	84	84	84	83	83	84%
39	WizardLM 2 8x22b	$0.0039	2.1m	82%	85	85	84	82	82	84%
29	ByteDance Seed 1.6 Flash	$0.0014	27.7s	80%	87	85	84	83	79	84%
45	DeepSeek V4 Flash (Reasoning)	$0.0007	32.5s	77%	88	87	82	81	79	84%
30	Claude Haiku 4.5	$0.017	27.9s	81%	86	84	84	84	80	84%
37	Mistral Small 4	$0.0022	26.2s	77%	90	85	84	81	77	84%
42	GPT-4.1	$0.017	54.4s	80%	86	85	84	83	79	83%
53	Qwen 3.6 35B	$0.0085	1.0m	78%	88	84	83	82	79	83%
92	Qwen3.7 Max	$0.078	2.4m	81%	85	83	83	83	82	83%
46	Qwen 3.5 9B	$0.0013	1.2m	79%	87	84	83	83	79	83%
116	Gemini 3.1 Pro (Preview)	$0.148	2.3m	81%	84	84	83	82	80	83%
74	Ministral 3 14B	$0.0012	16.5s	71%	91	87	79	78	77	83%
65	DeepSeek V3.2	$0.0021	1.7m	77%	86	85	82	82	77	82%
113	Qwen 3.5 27B	$0.051	4.1m	78%	86	84	82	81	79	82%
52	Qwen 3.5 Flash	$0.0029	52.9s	79%	85	83	83	79	78	82%
105	ByteDance Seed 2.0 Mini	$0.0041	4.0m	76%	86	83	81	80	79	82%
57	Z.AI GLM 4.6	$0.0050	30.4s	77%	86	83	81	80	78	82%
82	Aion 2.0	$0.0083	1.3m	75%	86	86	81	78	76	82%
68	Qwen 3.5 35B	$0.019	1.0m	77%	84	84	83	82	74	81%
59	GPT-5 Mini	$0.010	1.0m	79%	83	82	82	81	78	81%
107	Qwen 3.5 122B	$0.078	2.8m	80%	83	82	82	80	79	81%
73	DeepSeek V3 (2024-12-26)	$0.0027	44.5s	74%	87	85	80	78	76	81%
83	o4 Mini High	$0.042	1.3m	78%	83	81	81	80	79	81%
67	ByteDance Seed 1.6	$0.0093	1.7m	79%	82	82	81	80	79	81%
76	Gemini 2.5 Flash (Reasoning)	$0.011	21.0s	74%	87	82	80	79	76	81%
78	Z.AI GLM 4.5	$0.0062	46.2s	75%	85	83	79	79	77	81%
84	DeepSeek-V2 Chat	$0.0027	50.8s	74%	87	81	79	78	77	80%
110	Z.AI GLM 4.5 Air	$0.0052	1.2m	68%	89	86	76	76	76	80%
109	Gemini 3.5 Flash (Reasoning)	$0.078	40.3s	74%	87	80	80	78	77	80%
66	Xiaomi MIMO v2.5	$0.0055	31.1s	76%	84	82	80	78	77	80%
77	Z.AI GLM 4.7	$0.012	1.8m	79%	81	81	80	80	78	80%
96	Gemma 4 31B (Reasoning)	$0.0017	2.5m	76%	83	82	80	79	76	80%
90	Ministral 3 8B	$0.0010	11.5s	71%	88	84	82	77	68	80%
79	Qwen 3.5 Plus (2026-02-15)	$0.0070	28.0s	75%	83	83	79	79	76	80%
95	Z.AI GLM 4.7 Flash	$0.0018	1.8m	75%	84	81	80	79	75	80%
97	Gemini 2.5 Pro	$0.039	36.9s	75%	83	82	81	81	71	80%
69	Ministral 8B	$0.0006	11.8s	75%	82	81	80	79	73	79%
87	Gemma 4 31B	$0.0012	1.1m	75%	82	81	81	79	73	79%
93	Gemini 3.5 Flash (Reasoning, Minimal)	$0.018	12.0s	73%	85	79	78	77	75	79%
91	Gemma 3 27B	$0.0010	55.7s	75%	82	82	80	75	74	79%
81	Ministral 3B	$0.0002	5.4s	74%	83	79	79	77	75	79%
117	Qwen 3.6 27B	$0.032	3.1m	76%	80	80	78	77	76	78%
102	Gemini 2.5 Flash	$0.0046	9.1s	70%	82	82	76	75	75	78%
85	Qwen 2.5 72B	$0.0016	38.6s	76%	80	79	78	78	75	78%
127	Hermes 3 70B	$0.0017	2.3m	68%	86	84	80	72	67	78%
112	Gemma 4 26B (Reasoning)	$0.0012	2.1m	73%	81	80	77	77	74	78%
100	Gemma 3 12B	$0.0004	40.3s	73%	81	80	77	76	74	78%
94	Gemini 3 Flash (Preview, Reasoning)	$0.012	29.5s	75%	80	79	77	76	76	78%
101	GPT-4.1 Mini	$0.0026	16.2s	72%	82	79	76	75	75	77%
114	Ministral 3 3B	$0.0019	58.4s	70%	84	79	76	75	71	77%
103	Cydonia 24B V4.1	$0.0021	53.4s	73%	81	77	77	77	73	77%
98	Gemma 4 26B	$0.0012	34.1s	74%	81	78	77	76	73	77%
108	GPT-4o, Aug. 6th (temp=1)	$0.021	23.6s	72%	80	80	77	76	71	77%
115	DeepSeek V3.1	$0.0023	1.9m	73%	80	79	78	75	72	77%
111	Mistral NeMO	$0.0009	9.6s	69%	81	81	75	74	70	76%
106	Nemotron 3 Super	$0.0000	43.1s	72%	81	78	77	73	71	76%
126	Cohere Command R+ (Aug. 2024)	$0.023	46.6s	69%	82	75	74	74	73	76%
104	Arcee AI: Trinity Mini	$0.0004	9.5s	72%	78	78	76	74	71	76%
130	GPT-4o, Aug. 6th (temp=0)	$0.050	34.3s	68%	80	77	73	72	71	75%
124	Llama 3.1 70B	$0.0025	35.8s	69%	81	76	76	71	68	75%
119	Gemini 3 Flash (Preview)	$0.0084	17.9s	69%	79	76	74	73	70	75%
118	Gemini 3.1 Flash Lite	$0.0035	15.3s	70%	79	75	74	71	71	74%
123	Gemini 2.5 Flash Lite (Reasoning)	$0.0031	33.0s	70%	76	76	75	73	68	74%
121	Gemini 3.1 Flash Lite (Preview)	$0.0037	8.9s	69%	78	74	73	72	72	74%
120	Gemini 2.5 Flash Lite	$0.0011	11.3s	70%	77	76	74	70	70	73%
125	Inception Mercury 2	$0.0031	6.3s	69%	77	76	75	72	66	73%
129	Gemini 3.1 Flash Lite (Reasoning)	$0.0036	9.5s	66%	82	76	74	67	67	73%
122	GPT-4o Mini (temp=0)	$0.0015	46.9s	72%	74	73	73	72	72	73%
133	GPT-OSS 120B	$0.0011	2.1m	69%	77	74	73	71	69	73%
134	GPT-4o Mini (temp=1)	$0.0016	42.6s	63%	81	76	70	69	67	73%
138	Mistral Small 3.2 24B	$0.010	7.0m	66%	78	75	74	73	62	72%
128	GPT-4.1 Nano	$0.0008	16.6s	67%	77	73	71	70	69	72%
136	GPT-5 Nano	$0.0042	1.5m	65%	77	71	70	69	67	71%
135	Gemma 3 4B	$0.0003	22.0s	64%	78	73	71	69	64	71%
137	Nemotron 3 Nano	$0.0013	1.8m	64%	78	76	72	65	64	71%
81.94%
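
The Total column in the table above looks consistent with the rounded mean of the five per-run columns (#1 to #5). That is an observation from the displayed numbers, not a documented formula, and since the per-run values are themselves rounded, a row can differ from its displayed Total by up to about a point. A small Python sketch that checks this on four rows copied from the table (`total_from_runs` is a hypothetical helper name):

```python
from statistics import mean

# (five run scores, displayed Total) copied from the table above.
rows = {
    "Claude Opus 4": ([94, 94, 92, 90, 89], 92),
    "GPT-5.4": ([94, 93, 91, 89, 86], 91),
    "Writer: Palmyra X5": ([93, 90, 89, 89, 88], 90),
    "o4 Mini": ([84, 84, 84, 83, 83], 84),
}


def total_from_runs(runs: list[int]) -> float:
    """Assumed aggregation: plain arithmetic mean of the run scores."""
    return mean(runs)


for model, (runs, displayed_total) in rows.items():
    estimate = total_from_runs(runs)
    # Allow one point of slack because the run scores are already rounded.
    status = "ok" if abs(estimate - displayed_total) <= 1.0 else "mismatch"
    print(f"{model}: mean={estimate:.1f} displayed={displayed_total} {status}")
```

How the Stability and Consistency columns are derived cannot be recovered from the displayed values in the same way.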

Median	Evaluator	Top 3	Flop 3
100.0%	"Not X but Y" pattern overuse	100 Qwen 3.5 35B; 100 Nemotron 3 Nano; 100 Mistral Small 3.2 24B	0 GPT-5 Nano; 76 Cydonia 24B V4.1; 76 GPT-5.4 (Reasoning, Low)
49.4%	Adverb-first sentence starts	100 Writer: Palmyra X5; 100 Qwen3 235B A22B Instruct 2507; 100 Ministral 3 8B	0 Grok 4.3 (Reasoning); 0 Gemini 3.5 Flash (Reasoning, Minimal); 0 Gemini 3.1 Flash Lite (Reasoning)
100.0%	Adverbs in dialogue tags	100 Qwen 3 32B; 100 Claude Opus 4.7 (Reasoning); 100 Claude Sonnet 5	13 GPT-4.1 Nano; 30 Cydonia 24B V4.1; 43 Claude Haiku 4.5
90.1%	AI-ism adverb frequency	99 ByteDance Seed 2.0 Lite; 99 MoonshotAI: Kimi K2.6; 98 Claude Opus 4.6	63 GPT-4.1 Nano; 66 Z.AI GLM 4.5; 66 Gemma 3 4B
100.0%	AI-ism character names	100 Nemotron 3 Super; 100 Z.AI GLM 4.7; 100 GPT-5.5 (Reasoning, Low)	76 Claude Opus 4; 80 Claude Opus 4.5; 80 Z.AI GLM 5.1
100.0%	AI-ism location names	100 Qwen3.7 Max; 100 Gemini 3.5 Flash (Reasoning, Minimal); 100 Gemma 4 31B (Reasoning)	-
55.4%	AI-ism word frequency	93 Claude Opus 4.7; 88 Claude Opus 4.7 (Reasoning); 86 GPT-5	0 GPT-4o Mini (temp=0); 2 GPT-4o Mini (temp=1); 2 GPT-4.1 Nano
100.0%	Cliché density	100 Hermes 3 405B; 100 GPT-5.4 Mini (Reasoning); 100 Grok 4.3	13 GPT-4o Mini (temp=0); 33 Inception Mercury 2; 40 Llama 3.1 70B
100.0%	Dialogue tag variety (said vs. fancy)	100 Qwen 3.5 35B; 100 GPT-5.4 Mini; 100 GPT-5.4 Nano	9 Hermes 3 70B; 19 GPT-4o Mini (temp=1); 33 GPT-4o, Aug. 6th (temp=1)
100.0%	Em-dash & semicolon overuse	100 Z.AI GLM 5; 100 Qwen 3.5 Plus (2026-02-15); 100 Z.AI GLM 4.7	0 ByteDance Seed 1.6; 0 GPT-4.1 Nano; 0 GPT-4o, Aug. 6th (temp=1)
100.0%	Emotion telling (show vs. tell)	100 GPT-5.4 Mini; 100 GPT-4.1; 100 Qwen 3.5 27B	76 Mistral Small 3.2 24B; 85 Llama 3.1 70B; 91 GPT-4o, Aug. 6th (temp=0)
100.0%	Filter word density	100 GPT-5.4 Mini; 100 Gemma 4 26B; 100 Grok 4.20 (Reasoning)	64 Nemotron 3 Nano; 67 Llama 3.1 70B; 80 Cohere Command R+ (Aug. 2024)
100.0%	Gibberish response detection	100 Gemini 2.5 Flash Lite; 100 Gemma 4 26B; 100 Qwen 2.5 72B	80 Qwen 3.5 9B; 80 Llama 3.1 70B; 80 Z.AI GLM 4.5 Air
100.0%	Markdown formatting overuse	100 Z.AI GLM 5 Turbo; 100 GPT-5; 100 Claude Opus 4.6	80 Qwen 3.5 35B; 80 Ministral 3 8B; 80 Ministral 3B
100.0%	Missing dialogue indicators (quotation marks)	100 Hermes 3 405B; 100 Aion 2.0; 100 Ministral 3 3B	60 Qwen3.6 Max Preview; 60 GPT-5; 78 Qwen 3.5 Flash
46.7%	Name drop frequency	100 Gemma 3 12B; 97 GPT-4.1 Nano; 97 GPT-5 Nano	0 GPT-5.4 (Reasoning); 0 GPT-5.2; 0 Qwen 3.5 35B
87.6%	Narrator intent-glossing	100 MiniMax M2.7; 100 DeepSeek V4 Pro; 100 Gemini 3.1 Pro (Preview)	0 GPT-5 Nano; 31 Gemini 2.5 Flash Lite; 32 Cydonia 24B V4.1
100.0%	Overuse of "that" (subordinate clause padding)	100 GPT-OSS 120B; 100 Claude Sonnet 5 (Reasoning); 100 Qwen 3.5 397B A17B	86 Claude Sonnet 5 (Reasoning, Low); 86 ByteDance Seed 1.6; 88 Claude Haiku 4.5
100.0%	Paragraph length variance	100 Gemini 3 Flash (Preview, Reasoning); 100 Qwen3 235B A22B Instruct 2507; 100 Gemma 4 31B	58 Mistral Small 3.2 24B; 70 Grok 4.3 (Reasoning); 71 GPT-OSS 120B
98.9%	Passive voice overuse	100 GPT-5.4 Mini (Reasoning, Low); 100 Claude Opus 4.6 (Reasoning); 100 o4 Mini High	84 Llama 3.1 70B; 92 Claude Sonnet 4.6; 92 Gemini 3 Flash (Preview)
100.0%	Past progressive (was/were + -ing) overuse	100 Qwen 3.5 397B A17B; 100 Qwen 3.6 27B; 100 GPT-5.1	35 Claude Opus 4.7 (Reasoning); 44 Z.AI GLM 4.7 Flash; 49 Z.AI GLM 4.7
73.7%	Pronoun-first sentence starts	100 Claude Sonnet 4.6; 100 GPT-5.4 (Reasoning); 100 Claude Opus 4.8 (Reasoning)	0 Gemma 3 4B; 1 Gemini 3.1 Flash Lite (Reasoning); 1 GPT-4.1 Nano
97.6%	Purple prose (modifier overload)	100 Grok 4.3 (Reasoning); 100 Claude Opus 4.7; 100 Z.AI GLM 5 Turbo	69 Cydonia 24B V4.1; 74 Mistral Small 3.2 24B; 77 Gemma 3 4B
100.0%	Repeated phrase echo	100 GPT-4o Mini (temp=0); 100 GPT-5.4 Nano (Reasoning, Low); 100 Claude Opus 4.5	-
100.0%	Sentence length variance	100 Nemotron 3 Nano; 100 Hermes 3 405B; 100 DeepSeek V3.2	84 Mistral Small 3.2 24B; 98 GPT-4o, Aug. 6th (temp=0); 98 Inception Mercury 2
54.3%	Sentence opener variety	94 GPT-4o, Aug. 6th (temp=1); 89 GPT-4o Mini (temp=1); 86 Claude Sonnet 5 (Reasoning)	35 Qwen 3.5 35B; 36 Gemma 4 26B; 36 Qwen 3.5 Flash
15.4%	Subject-first sentence starts	96 Writer: Palmyra X5; 90 Qwen3 235B A22B Instruct 2507; 82 Claude Opus 4	0 Inception Mercury 2; 0 Qwen 3.5 122B; 0 GPT-OSS 120B
29.0%	Subordinate conjunction sentence starts	72 GPT-4.1 Nano; 71 Claude Haiku 4.5; 71 Claude Opus 4	0 Gemma 4 26B (Reasoning); 0 Claude Opus 4.6 (Reasoning); 0 GPT-4o Mini (temp=0)
81.0%	Technical jargon density	100 DeepSeek V3 (2025-03-24); 100 GPT-4.1; 100 Qwen 3.5 9B	8 GPT-5 Nano; 18 Cydonia 24B V4.1; 25 Claude Sonnet 5 (Reasoning, Low)
77.0%	Useless dialogue additions	100 Aion 3.0 Mini; 100 GPT-5.4 (Reasoning, Low); 100 Qwen 3.5 397B A17B	0 Nemotron 3 Nano; 0 Mistral Small 3.2 24B; 0 Ministral 3 3B
