Thriller: chase through city streets

Bad Writing Habits

Detects common prose quality anti-patterns in AI-generated creative writing, including passive voice, past progressive overuse, weak dialogue tags, filter words, purple prose, cliches, AI-ism words/adverbs/names, and more.
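
As an illustration only, one of the checks listed above (filter word density) could be approximated as below. This is a minimal sketch under stated assumptions, not the benchmark's actual evaluator: the word list `FILTER_WORDS`, the function name `filter_word_density`, and the per-100-words scale are all hypothetical and chosen for the example.

```python
import re

# Hypothetical word list; the benchmark's real list is not given on this page.
FILTER_WORDS = {"saw", "heard", "felt", "noticed", "realized", "seemed", "watched", "wondered"}

def filter_word_density(text: str) -> float:
    """Return the number of filter words per 100 words (0.0 for empty input)."""
    # Lowercase the text and split it into simple word tokens.
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in FILTER_WORDS)
    return 100.0 * hits / len(words)
```

A lower density would presumably map to a higher evaluator score, but how the site converts raw counts into the 0 to 100 scale is not stated here.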

Performance Score Distribution (Top 20)

Model	Score
GPT-5.4	93%
GPT-5.4 (Reasoning, Low)	91%
Qwen3 235B A22B Instruct 2507	91%
Qwen 3.5 397B A17B	91%
Z.AI GLM 5	91%
GPT-5.4 (Reasoning)	91%
Z.AI GLM 5 Turbo	90%
Claude Opus 4.8 (Reasoning, Low)	90%
Claude Sonnet 4.6 (Reasoning)	90%
Claude Opus 4.6 (Reasoning)	90%
GPT-5.5 (Reasoning)	90%
Writer: Palmyra X5	90%
GPT-5.4 Mini (Reasoning)	90%
GPT-5.5 (Reasoning, Low)	90%
GPT-5.5	90%
Claude Sonnet 4.6	90%
Claude Sonnet 4.5	90%
GPT-5.1	90%
DeepSeek V4 Flash (Reasoning)	89%
Z.AI GLM 5.1	89%

Price-Performance Score Distribution (Top 20)

Model	Score	Cost	Time
Z.AI GLM 5 Turbo	90%	$0.0078	26.6s
DeepSeek V4 Pro	86%	$0.0050	50.1s
Writer: Palmyra X5	90%	$0.011	18.7s
Qwen3 235B A22B Instruct 2507	91%	$0.0014	59.9s
DeepSeek V4 Flash	88%	$0.0007	26.2s
DeepSeek V4 Flash (Reasoning)	89%	$0.0005	21.4s
Xiaomi MIMO v2.5 Pro	89%	$0.0081	44.9s
Z.AI GLM 5	91%	$0.0075	44.3s
GPT-5.4 Mini (Reasoning)	90%	$0.023	28.1s
Z.AI GLM 5.2 (Reasoning, High)	89%	$0.011	57.5s
GPT-5.4 Mini (Reasoning, Low)	89%	$0.014	16.8s
GPT-5.4 (Reasoning, Low)	91%	$0.050	1.2m
GPT-5.4	93%	$0.046	1.4m
GPT-5.4 Mini	88%	$0.015	16.7s
Qwen 3.6 Flash	87%	$0.012	47.6s
Z.AI GLM 5.1	89%	$0.020	1.8m
Claude Sonnet 5	89%	$0.033	33.7s
Claude Sonnet 4.6	90%	$0.036	40.3s
Qwen 3.5 Plus (2026-04-20)	87%	$0.017	1.6m
Grok 4.20 (Reasoning)	86%	$0.016	1.1m

Most Stable Models (Top 20)

Model	Score	Consistency	Stability
GPT-5.4	93%	99%	91%
Qwen3 235B A22B Instruct 2507	91%	98%	90%
GPT-5.5 (Reasoning)	90%	99%	90%
GPT-5.4 (Reasoning, Low)	91%	97%	89%
GPT-5.4 (Reasoning)	91%	98%	89%
Claude Opus 4.8 (Reasoning, Low)	90%	97%	88%
GPT-5.5	90%	96%	87%
Qwen 3.5 397B A17B	91%	96%	87%
Writer: Palmyra X5	90%	96%	87%
Z.AI GLM 5.1	89%	98%	87%
GPT-5.4 Mini	88%	99%	87%
GPT-5.4 Mini (Reasoning)	90%	97%	87%
GPT-5.5 (Reasoning, Low)	90%	97%	87%
Xiaomi MIMO v2.5 Pro	89%	96%	87%
Aion 3.0	88%	98%	86%
Claude Sonnet 4.6 (Reasoning)	90%	96%	86%
MiniMax M3	88%	97%	86%
Z.AI GLM 5 Turbo	90%	93%	86%
Z.AI GLM 5.2 (Reasoning, High)	89%	96%	86%
GPT-5.1	90%	96%	86%

Top Overall Models (Top 20)

Model	Score	Cost	Time	Stability
Qwen3 235B A22B Instruct 2507	91%	$0.0014	59.9s	90%
GPT-5.4	93%	$0.046	1.4m	91%
Writer: Palmyra X5	90%	$0.011	18.7s	87%
Z.AI GLM 5 Turbo	90%	$0.0078	26.6s	86%
Z.AI GLM 5	91%	$0.0075	44.3s	86%
GPT-5.4 Mini (Reasoning)	90%	$0.023	28.1s	87%
GPT-5.4 (Reasoning, Low)	91%	$0.050	1.2m	89%
DeepSeek V4 Flash (Reasoning)	89%	$0.0005	21.4s	84%
Xiaomi MIMO v2.5 Pro	89%	$0.0081	44.9s	87%
GPT-5.4 Mini	88%	$0.015	16.7s	87%
GPT-5.4 Mini (Reasoning, Low)	89%	$0.014	16.8s	86%
Z.AI GLM 5.2 (Reasoning, High)	89%	$0.011	57.5s	86%
DeepSeek V4 Flash	88%	$0.0007	26.2s	84%
Z.AI GLM 5.1	89%	$0.020	1.8m	87%
Qwen 3.5 397B A17B	91%	$0.0049	3.3m	87%
Claude Sonnet 4.5	90%	$0.041	37.3s	85%
Aion 3.0	88%	$0.032	1.1m	86%
Claude Opus 4.8 (Reasoning, Low)	90%	$0.085	40.3s	88%
MiniMax M3	88%	$0.0049	2.2m	86%
Claude Sonnet 4.6 (Reasoning)	90%	$0.065	1.1m	86%

Rank	Model	Avg. Cost	Avg. Time	Stability	# 1	# 2	# 3	# 4	# 5	Total
2	GPT-5.4	$0.046	1.4m	91%	94	93	93	92	92	93%
7	GPT-5.4 (Reasoning, Low)	$0.050	1.2m	89%	93	93	92	90	89	91%
1	Qwen3 235B A22B Instruct 2507	$0.0014	59.9s	90%	92	92	91	91	90	91%
15	Qwen 3.5 397B A17B	$0.0049	3.3m	87%	93	93	91	90	87	91%
5	Z.AI GLM 5	$0.0075	44.3s	86%	94	92	90	89	88	91%
45	GPT-5.4 (Reasoning)	$0.109	3.0m	89%	92	91	90	90	89	91%
4	Z.AI GLM 5 Turbo	$0.0078	26.6s	86%	94	93	92	88	85	90%
18	Claude Opus 4.8 (Reasoning, Low)	$0.085	40.3s	88%	93	91	91	89	88	90%
20	Claude Sonnet 4.6 (Reasoning)	$0.065	1.1m	86%	93	92	91	90	86	90%
33	Claude Opus 4.6 (Reasoning)	$0.091	1.2m	85%	93	93	89	88	88	90%
44	GPT-5.5 (Reasoning)	$0.141	1.8m	90%	91	91	90	90	89	90%
3	Writer: Palmyra X5	$0.011	18.7s	87%	93	91	91	89	87	90%
6	GPT-5.4 Mini (Reasoning)	$0.023	28.1s	87%	93	91	90	89	88	90%
42	GPT-5.5 (Reasoning, Low)	$0.118	1.6m	87%	92	92	89	89	88	90%
43	GPT-5.5	$0.121	1.7m	87%	92	91	91	90	86	90%
21	Claude Sonnet 4.6	$0.036	40.3s	82%	95	94	89	87	84	90%
16	Claude Sonnet 4.5	$0.041	37.3s	85%	94	90	89	89	87	90%
27	GPT-5.1	$0.052	2.2m	86%	92	91	89	89	86	90%
8	DeepSeek V4 Flash (Reasoning)	$0.0005	21.4s	84%	93	92	89	88	85	89%
14	Z.AI GLM 5.1	$0.020	1.8m	87%	91	90	89	89	88	89%
40	Claude Opus 4.7	$0.085	31.4s	82%	94	93	90	89	81	89%
37	Claude Opus 4.6	$0.087	1.2m	84%	94	90	89	88	86	89%
11	GPT-5.4 Mini (Reasoning, Low)	$0.014	16.8s	86%	91	90	89	88	86	89%
22	Claude Sonnet 5	$0.033	33.7s	82%	94	92	89	88	82	89%
9	Xiaomi MIMO v2.5 Pro	$0.0081	44.9s	87%	91	91	90	87	87	89%
65	DeepSeek V4 Pro (Reasoning)	$0.026	5.5m	85%	92	91	90	87	84	89%
12	Z.AI GLM 5.2 (Reasoning, High)	$0.011	57.5s	86%	91	90	89	89	85	89%
50	Qwen3.6 Max Preview	$0.048	3.1m	84%	92	91	88	87	86	89%
17	Aion 3.0	$0.032	1.1m	86%	90	89	88	87	87	88%
10	GPT-5.4 Mini	$0.015	16.7s	87%	90	88	88	88	88	88%
19	MiniMax M3	$0.0049	2.2m	86%	90	89	88	88	86	88%
97	MoonshotAI: Kimi K2.6	$0.066	6.3m	85%	91	88	88	87	86	88%
13	DeepSeek V4 Flash	$0.0007	26.2s	84%	91	90	89	86	83	88%
41	Claude Opus 4.8 (Reasoning)	$0.086	41.8s	84%	90	89	88	87	84	88%
23	Aion 3.0 Mini	$0.0070	1.4m	83%	91	89	88	88	82	88%
32	Qwen 3.5 Plus (2026-04-20)	$0.017	1.6m	82%	92	91	90	82	81	87%
24	Qwen 3.6 Flash	$0.012	47.6s	81%	92	89	88	87	80	87%
34	Claude Opus 4.5	$0.069	43.9s	85%	89	88	88	86	84	87%
29	MiniMax M2.7	$0.0035	1.1m	80%	92	90	87	84	82	87%
39	Claude Sonnet 5 (Reasoning)	$0.040	42.1s	81%	91	90	88	85	80	87%
73	MoonshotAI: Kimi K2.5	$0.026	3.4m	79%	91	90	86	85	81	87%
132	Claude Opus 4	$0.292	1.9m	80%	91	89	88	85	78	86%
46	Claude Sonnet 5 (Reasoning, Low)	$0.039	39.4s	80%	93	87	87	85	80	86%
62	Claude Opus 4.7 (Reasoning)	$0.091	33.5s	82%	89	88	85	85	85	86%
25	Grok 4.20 (Reasoning)	$0.016	1.1m	83%	89	89	88	83	82	86%
49	Claude Sonnet 4	$0.038	44.7s	80%	91	90	89	84	77	86%
28	MiniMax M2.5	$0.0034	1.6m	83%	89	88	87	86	82	86%
84	GPT-5	$0.068	4.0m	83%	89	86	86	85	85	86%
30	Qwen 3.6 35B	$0.012	1.0m	81%	89	89	86	84	82	86%
31	Grok 4.20	$0.011	38.2s	80%	91	87	86	84	81	86%
61	DeepSeek V4 Pro	$0.0050	50.1s	75%	93	93	92	82	69	86%
26	Mistral Medium 3.1	$0.0059	40.6s	82%	88	87	85	84	84	86%
38	Grok 4.3	$0.0090	24.3s	78%	91	87	84	83	81	85%
92	Qwen3.7 Max	$0.085	2.5m	82%	87	84	84	83	83	84%
55	Hermes 3 405B	$0.0054	37.5s	77%	90	89	85	80	78	84%
35	GPT-5.4 Nano	$0.0048	16.7s	80%	87	87	84	82	81	84%
36	DeepSeek V3 (2025-03-24)	$0.0016	14.5s	80%	88	86	85	82	79	84%
109	Gemini 3.1 Pro (Preview)	$0.136	2.2m	81%	86	84	84	84	81	84%
67	Aion 2.0	$0.0076	1.4m	78%	88	85	84	82	78	84%
53	Z.AI GLM 4.5	$0.0039	28.5s	78%	88	86	84	84	76	84%
48	ByteDance Seed 1.6 Flash	$0.0012	24.5s	78%	87	85	85	85	76	84%
51	Z.AI GLM 4.7 Flash	$0.0017	1.2m	80%	87	84	83	83	80	84%
85	WizardLM 2 8x22b	$0.0040	2.8m	76%	90	83	82	82	79	83%
63	o4 Mini High	$0.021	44.5s	79%	87	86	84	80	78	83%
68	Claude Haiku 4.5	$0.014	22.0s	76%	87	87	83	82	76	83%
64	Qwen 3 32B	$0.0017	32.0s	76%	89	83	82	81	80	83%
52	Mistral Small 4	$0.0017	21.1s	78%	87	85	84	80	77	83%
69	Gemini 3.5 Flash (Reasoning, Minimal)	$0.018	11.8s	76%	89	83	82	81	78	83%
56	DeepSeek V3.2	$0.0020	1.3m	80%	84	84	82	82	80	83%
71	Grok 4.3 (Reasoning)	$0.016	1.3m	78%	87	86	85	78	77	83%
47	Qwen 3.5 Flash	$0.0022	51.7s	81%	85	83	83	82	81	83%
57	Qwen 3.5 122B	$0.017	37.1s	80%	84	84	82	82	81	83%
54	GPT-5.4 Nano (Reasoning)	$0.0056	21.4s	79%	87	83	83	80	79	82%
91	GPT-5.2	$0.059	1.8m	79%	85	84	83	81	79	82%
66	GPT-5 Mini	$0.010	1.0m	79%	85	83	82	82	80	82%
58	Mistral Small 4 (Reasoning)	$0.0028	30.6s	78%	86	84	83	82	77	82%
96	Qwen 3.5 9B	$0.0016	1.7m	73%	91	82	81	79	76	82%
60	Hermes 3 70B	$0.0015	21.9s	78%	85	85	84	78	76	82%
86	Gemini 2.5 Pro	$0.036	36.4s	75%	87	85	83	79	74	82%
88	Gemini 3.5 Flash (Reasoning)	$0.073	38.1s	79%	84	82	82	81	79	82%
79	o4 Mini	$0.017	29.4s	75%	86	85	81	79	77	82%
70	GPT-5.4 Nano (Reasoning, Low)	$0.0049	18.3s	76%	85	84	80	80	79	81%
74	Gemma 3 27B	$0.0008	51.3s	76%	86	85	83	79	74	81%
72	Cydonia 24B V4.1	$0.0018	37.9s	76%	86	84	81	79	77	81%
78	Xiaomi MIMO v2.5	$0.0052	29.9s	75%	85	85	79	79	78	81%
82	Z.AI GLM 4.7	$0.011	1.3m	77%	85	83	82	78	78	81%
107	Z.AI GLM 4.6	$0.0066	1.0m	69%	89	86	77	77	75	81%
90	Gemma 4 26B (Reasoning)	$0.0014	1.5m	74%	87	84	83	78	72	81%
59	Gemini 2.5 Flash	$0.0057	10.8s	79%	82	81	81	81	79	81%
77	Gemini 2.5 Flash (Reasoning)	$0.0099	17.3s	75%	85	83	80	78	78	81%
87	Z.AI GLM 4.5 Air	$0.0028	41.6s	73%	86	85	80	78	75	81%
76	Gemini 2.5 Flash Lite	$0.0009	8.1s	74%	86	84	81	81	73	81%
95	DeepSeek V3.1	$0.0028	2.4m	76%	84	81	80	79	78	81%
89	GPT-4.1	$0.019	55.8s	75%	85	83	80	79	77	81%
80	Cohere Command R+ (Aug. 2024)	$0.025	31.5s	77%	84	81	81	80	77	80%
75	Mistral Large 3	$0.0043	29.5s	76%	84	84	82	79	74	80%
81	Mistral Large 2	$0.017	30.1s	77%	83	81	80	80	76	80%
117	Qwen 3.5 27B	$0.044	2.9m	77%	84	81	81	78	76	80%
133	Qwen 3.6 27B	$0.036	3.4m	65%	91	83	82	82	60	80%
83	Qwen 3.5 35B	$0.0095	32.9s	76%	82	82	80	78	75	80%
100	ByteDance Seed 2.0 Lite	$0.012	2.1m	76%	83	80	79	78	77	79%
94	Qwen 3.5 Plus (2026-02-15)	$0.0068	28.0s	74%	84	79	78	77	76	79%
106	GPT-4o Mini (temp=1)	$0.0013	41.8s	70%	87	80	77	76	74	79%
102	DeepSeek V3 (2024-12-26)	$0.0027	1.2m	73%	83	82	77	76	76	79%
123	DeepSeek-V2 Chat	$0.0026	58.1s	66%	86	85	77	76	67	78%
93	GPT-4.1 Nano	$0.0008	14.4s	73%	82	81	78	77	73	78%
112	ByteDance Seed 1.6	$0.0098	1.7m	73%	83	78	77	77	74	78%
98	Arcee AI: Trinity Mini	$0.0004	7.3s	72%	83	80	78	77	72	78%
99	GPT-4.1 Mini	$0.0027	17.2s	72%	83	78	77	77	72	78%
108	Ministral 3 8B	$0.0028	1.5m	73%	82	80	79	74	72	77%
105	Ministral 3 14B	$0.0012	11.9s	70%	83	82	77	73	72	77%
116	Gemma 4 31B (Reasoning)	$0.0018	2.3m	74%	79	79	78	78	72	77%
129	ByteDance Seed 2.0 Mini	$0.0041	4.2m	71%	83	77	76	75	74	77%
110	GPT-4o, Aug. 6th (temp=1)	$0.020	16.4s	72%	82	77	76	75	74	77%
101	Ministral 3B	$0.0002	5.0s	72%	82	80	78	74	70	77%
103	GPT-4o Mini (temp=0)	$0.0014	54.5s	74%	79	78	76	76	74	77%
121	Qwen 2.5 72B	$0.0014	42.7s	70%	82	79	75	73	72	76%
104	Ministral 8B	$0.0006	8.3s	72%	81	77	76	75	72	76%
113	Gemma 3 4B	$0.0003	21.4s	70%	81	80	77	71	70	76%
122	Gemini 3 Flash (Preview, Reasoning)	$0.011	26.8s	69%	83	78	76	73	69	76%
115	Gemini 2.5 Flash Lite (Reasoning)	$0.0030	31.4s	71%	79	78	76	74	70	76%
126	Gemma 4 31B	$0.0015	1.8m	68%	83	75	74	73	72	75%
119	Gemma 4 26B	$0.0012	35.8s	71%	80	75	75	73	73	75%
114	Ministral 3 3B	$0.0005	3.0s	70%	79	77	76	75	67	75%
120	Gemma 3 12B	$0.0004	35.6s	71%	80	78	77	70	70	75%
127	Mistral NeMO	$0.0009	9.5s	64%	87	74	73	70	70	75%
118	GPT-4o, Aug. 6th (temp=0)	$0.022	15.4s	73%	77	77	76	73	72	75%
111	Llama 3.1 70B	$0.0020	22.2s	72%	77	76	75	75	72	75%
125	Gemini 3 Flash (Preview)	$0.0075	17.5s	69%	79	75	73	71	71	74%
124	Gemini 3.1 Flash Lite (Reasoning)	$0.0035	9.4s	69%	77	76	73	71	71	74%
131	Gemini 3.1 Flash Lite	$0.0034	13.9s	63%	80	75	69	69	68	72%
130	Gemini 3.1 Flash Lite (Preview)	$0.0039	9.2s	66%	77	71	71	70	67	71%
128	Nemotron 3 Super	$0.0000	46.2s	70%	72	72	72	69	69	71%
134	GPT-5 Nano	$0.0045	1.7m	68%	70	70	70	68	67	69%
135	GPT-OSS 120B	$0.0018	1.2m	64%	74	68	68	67	65	68%
137	Nemotron 3 Nano	$0.0010	42.3s	61%	74	68	66	64	62	67%
138	Mistral Small 3.2 24B	$0.012	11.5m	62%	72	68	67	67	60	67%
136	Inception Mercury 2	$0.0032	6.6s	61%	72	68	66	64	63	66%
82.56%
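
For several rows checked by hand, the Total column is consistent with the rounded arithmetic mean of the five columns # 1 to # 5 (for example, GPT-5.4: 94, 93, 93, 92, 92 gives 92.8, shown as 93%). The sketch below assumes that relationship; the page does not state it, and the function name `total_from_scores` is hypothetical. Because the displayed column values are themselves rounded, a row can differ by a point (Inception Mercury 2 averages 66.6 from the displayed values but is listed at 66%).

```python
def total_from_scores(scores: list[int]) -> int:
    """Round the arithmetic mean of the displayed column scores to a whole percent."""
    # Assumption: Total is the plain mean of the # 1 to # 5 columns.
    return round(sum(scores) / len(scores))

# Rows taken from the table above.
print(total_from_scores([94, 93, 93, 92, 92]))  # GPT-5.4
print(total_from_scores([93, 93, 92, 82, 69]))  # DeepSeek V4 Pro
```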

Median	Evaluator	Top 3	Flop 3
100.0%	"Not X but Y" pattern overuse	100 o4 Mini High; 100 Gemma 4 31B; 100 GPT-5.4 Mini	43 GPT-5 Nano; 44 Gemini 2.5 Flash Lite (Reasoning); 49 Gemini 2.5 Flash
58.3%	Adverb-first sentence starts	100 Claude Sonnet 5; 100 Writer: Palmyra X5; 100 Z.AI GLM 5	0 ByteDance Seed 2.0 Lite; 0 GPT-OSS 120B; 0 Inception Mercury 2
100.0%	Adverbs in dialogue tags	100 Ministral 3 8B; 100 GPT-4o, Aug. 6th (temp=0); 100 Aion 3.0	44 GPT-4o, Aug. 6th (temp=1); 59 Aion 2.0; 67 GPT-4.1 Nano
94.8%	AI-ism adverb frequency	100 GPT-5.4 (Reasoning); 100 GPT-5.4; 100 Ministral 3 3B	75 Qwen 3.6 27B; 79 GPT-4.1 Nano; 80 Cydonia 24B V4.1
100.0%	AI-ism character names	100 Claude Sonnet 4.6 (Reasoning); 100 o4 Mini; 100 Mistral Large 2	88 Gemma 3 4B; 92 MiniMax M2.7; 96 Qwen 3.5 9B
100.0%	AI-ism location names	100 Gemini 3 Flash (Preview); 100 Gemini 3 Flash (Preview, Reasoning); 100 Claude Sonnet 4	—
50.3%	AI-ism word frequency	85 Claude Opus 4.7 (Reasoning); 84 Claude Opus 4.8 (Reasoning); 83 Claude Sonnet 5	0 GPT-4o, Aug. 6th (temp=0); 0 GPT-4o, Aug. 6th (temp=1); 0 GPT-4.1 Mini
100.0%	Cliché density	100 GPT-4.1; 100 MiniMax M2.7; 100 Claude Sonnet 4.6 (Reasoning)	27 Mistral NeMO; 33 Qwen 2.5 72B; 47 Llama 3.1 70B
74.3%	Dialogue tag variety (said vs. fancy)	100 Claude Opus 4.6; 100 MiniMax M2.5; 100 GPT-5.5 (Reasoning)	0 GPT-OSS 120B; 0 Gemini 3 Flash (Preview, Reasoning); 0 GPT-4o Mini (temp=0)
96.0%	Em-dash & semicolon overuse	100 Grok 4.20; 100 Claude Sonnet 4.6 (Reasoning); 100 MiniMax M2.7	0 Mistral Small 4; 0 Gemma 3 4B; 0 Mistral Small 4 (Reasoning)
100.0%	Emotion telling (show vs. tell)	100 Claude Opus 4.5; 100 Gemma 3 27B; 100 Gemini 3.5 Flash (Reasoning, Minimal)	85 Mistral Small 3.2 24B; 97 Llama 3.1 70B; 97 Nemotron 3 Nano
97.4%	Filter word density	100 Claude Sonnet 4.6 (Reasoning); 100 Z.AI GLM 4.7; 100 o4 Mini High	35 Nemotron 3 Nano; 37 Inception Mercury 2; 40 GPT-OSS 120B
100.0%	Gibberish response detection	100 Aion 3.0; 100 Claude Opus 4.7; 100 Gemini 3 Flash (Preview)	80 Nemotron 3 Nano; 96 Hermes 3 70B; 97 GPT-4o, Aug. 6th (temp=1)
100.0%	Markdown formatting overuse	100 Gemini 2.5 Pro; 100 GPT-5.4 Mini; 100 DeepSeek V3.1	49 Ministral 3B; 80 Ministral 3 3B; 80 Qwen 3 32B
100.0%	Missing dialogue indicators (quotation marks)	100 Mistral Large 3; 100 Gemma 3 12B; 100 Gemini 3 Flash (Preview)	48 GPT-5; 69 Gemini 2.5 Flash Lite (Reasoning); 87 Xiaomi MIMO v2.5 Pro
77.8%	Name drop frequency	100 Claude Sonnet 4.6; 100 Gemini 3.1 Flash Lite; 100 Gemini 3.1 Flash Lite (Reasoning)	7 GPT-5.2; 16 Qwen 3.5 35B; 17 Qwen 3.5 27B
83.2%	Narrator intent-glossing	100 o4 Mini High; 100 GPT-5.5; 100 ByteDance Seed 1.6	15 GPT-5 Nano; 24 Gemini 2.5 Flash Lite (Reasoning); 31 Gemma 4 26B
100.0%	Overuse of "that" (subordinate clause padding)	100 GPT-5.4 Nano (Reasoning, Low); 100 GPT-5.5 (Reasoning, Low); 100 GPT-4o Mini (temp=1)	47 Mistral Small 3.2 24B; 82 DeepSeek V4 Pro (Reasoning); 90 Claude Sonnet 5 (Reasoning)
100.0%	Paragraph length variance	100 Qwen 3.5 Plus (2026-04-20); 100 Gemma 3 27B; 100 Claude Sonnet 4.6	33 Gemini 2.5 Flash Lite (Reasoning); 33 Mistral Small 3.2 24B; 43 Gemini 2.5 Flash Lite
98.6%	Passive voice overuse	100 Qwen3 235B A22B Instruct 2507; 100 GPT-5.5 (Reasoning, Low); 100 o4 Mini	77 ByteDance Seed 2.0 Mini; 88 Mistral NeMO; 91 Gemini 2.5 Flash Lite
95.3%	Past progressive (was/were + -ing) overuse	100 GPT-5.2; 100 GPT-5.5; 100 GPT-5 Mini	23 Z.AI GLM 4.7; 23 Claude Opus 4.7 (Reasoning); 29 Gemma 4 31B
93.0%	Pronoun-first sentence starts	100 MiniMax M2.7; 100 Z.AI GLM 4.5 Air; 100 Z.AI GLM 5.2 (Reasoning, High)	1 Mistral Small 3.2 24B; 9 Gemini 3.1 Flash Lite (Reasoning); 23 Gemini 3.1 Flash Lite
97.6%	Purple prose (modifier overload)	100 GPT-5.4; 100 Claude Opus 4; 100 Claude Opus 4.8 (Reasoning, Low)	80 Gemini 3 Flash (Preview, Reasoning); 82 Gemini 3.1 Pro (Preview); 84 Gemini 3.5 Flash (Reasoning)
100.0%	Repeated phrase echo	100 ByteDance Seed 1.6; 100 Gemma 3 27B; 100 Gemma 3 4B	—
100.0%	Sentence length variance	100 GPT-5.4 Mini (Reasoning); 100 DeepSeek V4 Pro; 100 Qwen 3.5 122B	79 Mistral Small 3.2 24B; 91 Qwen 3.6 27B; 91 Llama 3.1 70B
54.8%	Sentence opener variety	91 DeepSeek V3 (2025-03-24); 89 Cydonia 24B V4.1; 88 Hermes 3 70B	25 Mistral Small 3.2 24B; 31 Qwen 3.5 27B; 33 Mistral NeMO
39.9%	Subject-first sentence starts	99 Qwen3 235B A22B Instruct 2507; 96 Writer: Palmyra X5; 92 Cydonia 24B V4.1	0 Inception Mercury 2; 0 Qwen 3.5 27B; 0 Qwen 3.5 122B
25.5%	Subordinate conjunction sentence starts	78 Cohere Command R+ (Aug. 2024); 78 Gemini 2.5 Flash (Reasoning); 72 Gemma 3 4B	0 Nemotron 3 Super; 0 Claude Opus 4.8 (Reasoning); 0 GPT-4o, Aug. 6th (temp=0)
79.8%	Technical jargon density	100 Qwen3.7 Max; 100 Llama 3.1 70B; 100 DeepSeek V3 (2025-03-24)	8 GPT-5 Nano; 12 Claude Haiku 4.5; 19 Claude Sonnet 5 (Reasoning)
67.6%	Useless dialogue additions	100 Qwen 3.6 Flash; 100 Gemini 2.5 Flash Lite (Reasoning); 100 Claude Sonnet 4.5	0 GPT-4o Mini (temp=0); 0 Nemotron 3 Super; 0 Gemini 3.1 Flash Lite (Preview)
