Thriller: chase through city streets

Bad Writing Habits

Detects common prose-quality anti-patterns in AI-generated creative writing, including passive voice, past progressive overuse, weak dialogue tags, filter words, purple prose, clichés, and AI-ism words, adverbs, and names, among others.
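To make the kind of check described above concrete, here is a minimal sketch of three surface-level detectors: past progressive (was/were + -ing) frequency, filter word density, and adverb-first sentence starts. It is an illustration only, not the benchmark's actual evaluators. The regexes, the `FILTER_WORDS` list, and all function names are assumptions, and the heuristics are crude (for example, "was king" would count as past progressive).

```python
import re

# Hypothetical filter-word list; the benchmark's real list is not shown on this page.
FILTER_WORDS = {"saw", "heard", "felt", "noticed", "realized", "seemed", "watched"}

# Crude heuristic: "was"/"were" followed by any word ending in "ing".
PAST_PROGRESSIVE = re.compile(r"\b(?:was|were)\s+\w+ing\b", re.IGNORECASE)


def split_sentences(text):
    """Split on whitespace that follows sentence-ending punctuation."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def words(text):
    """Lowercased word tokens (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", text.lower())


def past_progressive_per_sentence(text):
    """Average number of was/were + -ing constructions per sentence."""
    sentences = split_sentences(text)
    if not sentences:
        return 0.0
    return len(PAST_PROGRESSIVE.findall(text)) / len(sentences)


def filter_word_density(text):
    """Filter words per 100 words."""
    tokens = words(text)
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t in FILTER_WORDS)
    return 100.0 * hits / len(tokens)


def adverb_first_share(text):
    """Fraction of sentences whose first word ends in 'ly'."""
    sentences = split_sentences(text)
    if not sentences:
        return 0.0

    def starts_with_ly(sentence):
        first = words(sentence)[:1]
        return bool(first) and first[0].endswith("ly")

    return sum(starts_with_ly(s) for s in sentences) / len(sentences)


sample = "She was running down the alley. He saw the car. Suddenly, the lights were flashing."
print(past_progressive_per_sentence(sample))
print(filter_word_density(sample))
print(adverb_first_share(sample))
```

How raw rates like these would map onto the 0 to 100 evaluator scores reported on this page is not stated, so the sketch stops at the raw measurements.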

Creative Writing Hallucination

Performance Score Distribution (Top 20)

Model	Score
GPT-5.4 (Reasoning)	93%
GPT-5.4 (Reasoning, Low)	93%
GPT-5.4	92%
Grok 4.20 (Reasoning)	90%
GPT-5.4 Mini (Reasoning)	90%
GPT-5.5 (Reasoning, Low)	90%
GPT-5.5 (Reasoning)	90%
GPT-5.5	89%
GPT-5.4 Mini (Reasoning, Low)	88%
GPT-5.1	88%
Z.AI GLM 5.2 (Reasoning, High)	88%
Qwen3.6 Max Preview	88%
Writer: Palmyra X5	88%
Z.AI GLM 5 Turbo	87%
Claude Opus 4.8 (Reasoning)	87%
Qwen3 235B A22B Instruct 2507	87%
Gemini 3.1 Pro (Preview)	87%
MoonshotAI: Kimi K2.6	87%
Claude Opus 4.7	87%
GPT-5.4 Mini	87%

Price-Performance Score Distribution (Top 20)

Model	Score	Cost	Time
Qwen3 235B A22B Instruct 2507	87%	$0.0006	49.7s
GPT-5.4 Mini (Reasoning)	90%	$0.020	29.8s
Writer: Palmyra X5	88%	$0.010	20.5s
Z.AI GLM 5 Turbo	87%	$0.0063	29.9s
GPT-5.4 Mini (Reasoning, Low)	88%	$0.014	16.1s
Mistral Small 4 (Reasoning)	84%	$0.0015	21.5s
Mistral Small 4	83%	$0.0009	12.8s
Claude Haiku 4.5	85%	$0.0089	18.4s
MiniMax M2.5	86%	$0.0026	46.1s
Grok 4.20 (Reasoning)	90%	$0.017	1.6m
Gemini 2.5 Flash (Reasoning)	83%	$0.0076	14.4s
Z.AI GLM 5.2 (Reasoning, High)	88%	$0.0083	51.0s
GPT-5.4 Mini	87%	$0.012	14.1s
Aion 3.0 Mini	84%	$0.0040	1.0m
Qwen 3.5 35B	85%	$0.0086	33.0s
Qwen 3.6 Flash	84%	$0.0092	37.1s
Aion 3.0	86%	$0.016	43.3s
Qwen 3.6 35B	85%	$0.0058	52.0s
Grok 4.20	83%	$0.0078	41.2s
Z.AI GLM 4.7	84%	$0.0095	1.5m

Most Stable Models (Top 20)

Model	Score	Consistency	Stability
GPT-5.4	92%	98%	90%
GPT-5.5 (Reasoning)	90%	99%	88%
Grok 4.20 (Reasoning)	90%	98%	88%
GPT-5.5	89%	98%	88%
GPT-5.4 (Reasoning)	93%	95%	88%
GPT-5.4 (Reasoning, Low)	93%	96%	87%
GPT-5.5 (Reasoning, Low)	90%	97%	87%
GPT-5.4 Mini (Reasoning)	90%	96%	86%
Gemini 3.1 Pro (Preview)	87%	98%	85%
GPT-5.4 Mini (Reasoning, Low)	88%	97%	85%
GPT-5.1	88%	95%	85%
Writer: Palmyra X5	88%	97%	84%
Qwen3.6 Max Preview	88%	96%	84%
MoonshotAI: Kimi K2.6	87%	97%	84%
Qwen3.7 Max	85%	98%	84%
GPT-5	85%	98%	84%
Qwen 3.6 27B	86%	97%	83%
GPT-5.4 Mini	87%	96%	83%
Claude Opus 4.8 (Reasoning, Low)	85%	97%	83%
Claude Opus 4.8 (Reasoning)	87%	94%	82%

Top Overall Models (Top 20)

Model	Score	Cost	Time	Stability
GPT-5.4 Mini (Reasoning)	90%	$0.020	29.8s	86%
GPT-5.4 Mini (Reasoning, Low)	88%	$0.014	16.1s	85%
Writer: Palmyra X5	88%	$0.010	20.5s	84%
Grok 4.20 (Reasoning)	90%	$0.017	1.6m	88%
Z.AI GLM 5 Turbo	87%	$0.0063	29.9s	82%
GPT-5.4 Mini	87%	$0.012	14.1s	83%
GPT-5.4	92%	$0.045	1.4m	90%
Qwen3 235B A22B Instruct 2507	87%	$0.0006	49.7s	82%
GPT-5.4 (Reasoning, Low)	93%	$0.046	1.2m	87%
Qwen 3.5 35B	85%	$0.0086	33.0s	82%
Claude Haiku 4.5	85%	$0.0089	18.4s	81%
Z.AI GLM 5.2 (Reasoning, High)	88%	$0.0083	51.0s	80%
MiniMax M2.5	86%	$0.0026	46.1s	80%
Mistral Small 4 (Reasoning)	84%	$0.0015	21.5s	80%
Qwen 3.6 35B	85%	$0.0058	52.0s	81%
Mistral Small 4	83%	$0.0009	12.8s	78%
Aion 3.0	86%	$0.016	43.3s	79%
Qwen 3.6 Flash	84%	$0.0092	37.1s	80%
Gemini 2.5 Flash Lite	83%	$0.0007	9.0s	77%
Claude Sonnet 5	85%	$0.023	33.0s	81%

Rank	Model	Avg. Cost	Avg. Time	Stability	#1	#2	#3	#4	#5	Total
67	GPT-5.4 (Reasoning)	$0.083	2.5m	88%	95	95	92	92	89	93%
9	GPT-5.4 (Reasoning, Low)	$0.046	1.2m	87%	96	94	91	91	91	93%
7	GPT-5.4	$0.045	1.4m	90%	93	93	92	91	91	92%
4	Grok 4.20 (Reasoning)	$0.017	1.6m	88%	91	91	90	89	89	90%
1	GPT-5.4 Mini (Reasoning)	$0.020	29.8s	86%	92	92	90	89	86	90%
116	GPT-5.5 (Reasoning, Low)	$0.127	1.7m	87%	91	91	90	89	87	90%
118	GPT-5.5 (Reasoning)	$0.129	1.8m	88%	91	90	89	89	89	90%
112	GPT-5.5	$0.123	1.7m	88%	90	89	89	88	87	89%
2	GPT-5.4 Mini (Reasoning, Low)	$0.014	16.1s	85%	91	88	88	87	87	88%
56	GPT-5.1	$0.047	2.2m	85%	91	90	89	88	83	88%
12	Z.AI GLM 5.2 (Reasoning, High)	$0.0083	51.0s	80%	94	90	86	85	84	88%
101	Qwen3.6 Max Preview	$0.053	3.6m	84%	92	88	88	87	86	88%
3	Writer: Palmyra X5	$0.010	20.5s	84%	90	89	87	87	85	88%
5	Z.AI GLM 5 Turbo	$0.0063	29.9s	82%	90	89	87	86	82	87%
55	Claude Opus 4.8 (Reasoning)	$0.060	40.7s	82%	89	89	88	87	81	87%
8	Qwen3 235B A22B Instruct 2507	$0.0006	49.7s	82%	90	90	87	86	81	87%
117	Gemini 3.1 Pro (Preview)	$0.105	1.7m	85%	88	87	87	86	86	87%
119	MoonshotAI: Kimi K2.6	$0.049	4.2m	84%	89	88	87	85	84	87%
53	Claude Opus 4.7	$0.059	29.8s	82%	90	89	86	85	82	87%
6	GPT-5.4 Mini	$0.012	14.1s	83%	90	87	86	85	84	87%
93	Qwen 3.5 397B A17B	$0.021	3.3m	79%	91	90	85	83	83	86%
17	Aion 3.0	$0.016	43.3s	79%	92	88	87	85	78	86%
13	MiniMax M2.5	$0.0026	46.1s	80%	91	88	86	85	81	86%
23	Qwen 3.6 27B	$0.017	1.7m	83%	88	87	86	86	84	86%
24	Z.AI GLM 5	$0.0062	1.3m	78%	91	88	85	85	80	86%
10	Qwen 3.5 35B	$0.0086	33.0s	82%	88	86	85	85	83	85%
20	Claude Sonnet 5	$0.023	33.0s	81%	89	86	85	85	81	85%
84	Claude Opus 4.7 (Reasoning)	$0.063	30.8s	79%	90	90	87	81	78	85%
103	Qwen3.7 Max	$0.064	2.6m	84%	86	86	85	85	84	85%
72	Claude Opus 4.8 (Reasoning, Low)	$0.064	43.4s	83%	87	86	85	85	83	85%
11	Claude Haiku 4.5	$0.0089	18.4s	81%	88	88	86	83	81	85%
125	GPT-5	$0.066	3.3m	84%	86	86	86	85	83	85%
25	Claude Sonnet 4.5	$0.028	34.3s	82%	87	86	84	83	83	85%
52	MiniMax M3	$0.0046	2.6m	81%	88	87	87	82	80	85%
90	Gemini 3.5 Flash (Reasoning)	$0.057	31.1s	77%	91	88	87	83	74	85%
130	Claude Opus 4	$0.123	52.2s	81%	89	85	84	83	82	85%
15	Qwen 3.6 35B	$0.0058	52.0s	81%	87	86	84	84	83	85%
110	Claude Opus 4.6	$0.073	1.3m	79%	89	86	85	83	78	85%
18	Qwen 3.6 Flash	$0.0092	37.1s	80%	87	87	85	83	79	84%
26	Qwen 3.5 122B	$0.015	41.8s	79%	88	84	83	83	82	84%
29	Qwen 3.5 9B	$0.0012	1.5m	79%	87	86	83	82	82	84%
66	Z.AI GLM 4.6	$0.0056	1.1m	71%	93	90	81	80	77	84%
43	Claude Sonnet 4.6	$0.024	36.1s	78%	90	86	85	82	78	84%
45	Z.AI GLM 5.1	$0.0096	1.4m	78%	90	83	83	83	82	84%
33	Aion 2.0	$0.0047	1.1m	78%	88	87	84	82	79	84%
107	Grok 4.3 (Reasoning)	$0.027	3.1m	78%	89	86	86	81	77	84%
41	Z.AI GLM 4.7	$0.0095	1.5m	80%	87	86	85	81	80	84%
14	Mistral Small 4 (Reasoning)	$0.0015	21.5s	80%	87	86	85	82	79	84%
22	Aion 3.0 Mini	$0.0040	1.0m	80%	87	86	86	82	77	84%
39	DeepSeek V4 Pro	$0.0035	1.7m	80%	86	85	84	83	80	83%
133	DeepSeek V4 Pro (Reasoning)	$0.015	5.0m	73%	90	88	81	80	77	83%
30	Z.AI GLM 4.7 Flash	$0.0017	1.3m	80%	86	83	83	82	82	83%
38	GPT-4.1	$0.016	51.3s	80%	85	85	83	83	79	83%
16	Mistral Small 4	$0.0009	12.8s	78%	88	85	84	79	78	83%
40	Gemini 2.5 Flash (Reasoning)	$0.0076	14.4s	74%	89	86	85	83	71	83%
28	Grok 4.20	$0.0078	41.2s	79%	86	86	84	79	78	83%
50	o4 Mini High	$0.025	47.6s	80%	84	84	84	83	78	83%
61	Gemma 4 26B (Reasoning)	$0.0019	1.7m	77%	89	84	83	79	78	83%
44	Xiaomi MIMO v2.5 Pro	$0.0068	45.2s	76%	88	84	84	83	75	83%
104	Qwen 3.5 Plus (2026-04-20)	$0.020	2.1m	73%	90	87	81	79	77	83%
19	Gemini 2.5 Flash Lite	$0.0007	9.0s	77%	87	85	81	81	79	83%
68	Qwen 3.5 27B	$0.013	1.1m	75%	89	84	81	80	79	83%
83	ByteDance Seed 2.0 Lite	$0.011	2.2m	78%	87	85	84	80	77	82%
32	MiniMax M2.7	$0.0028	1.1m	79%	86	83	83	81	80	82%
31	Grok 4.3	$0.0056	32.1s	77%	87	83	81	81	80	82%
37	Qwen 3.5 Flash	$0.0017	40.7s	76%	87	84	81	81	78	82%
105	MoonshotAI: Kimi K2.5	$0.024	3.1m	80%	84	83	83	83	79	82%
111	Claude Opus 4.6 (Reasoning)	$0.071	1.2m	81%	83	83	83	82	80	82%
21	Gemma 4 26B	$0.0007	28.4s	78%	85	84	82	80	79	82%
91	Gemma 4 31B (Reasoning)	$0.0013	2.7m	78%	86	83	82	81	78	82%
35	Gemma 3 27B	$0.0005	43.7s	77%	86	83	81	81	79	82%
120	Claude Opus 4.5	$0.070	1.0m	78%	85	84	81	81	79	82%
63	Gemini 2.5 Pro	$0.030	31.4s	79%	85	83	83	79	78	82%
34	Mistral Medium 3.1	$0.0041	30.6s	78%	85	84	83	81	76	82%
75	Claude Sonnet 5 (Reasoning)	$0.024	34.1s	77%	85	83	81	80	77	81%
46	Mistral Large 2	$0.0088	23.3s	76%	85	84	81	79	77	81%
27	GPT-5.4 Nano (Reasoning, Low)	$0.0055	22.0s	79%	83	81	81	81	80	81%
76	DeepSeek V3.1	$0.0017	1.3m	74%	86	83	80	78	77	81%
58	Gemini 3.5 Flash (Reasoning, Minimal)	$0.015	10.8s	75%	85	84	80	79	77	81%
47	GPT-5.4 Nano (Reasoning)	$0.0063	27.0s	76%	86	82	80	79	78	81%
85	WizardLM 2 8x22b	$0.0017	1.9m	76%	85	83	81	80	75	81%
64	GPT-5 Mini	$0.0095	1.0m	77%	83	83	81	81	76	81%
51	Mistral Large 3	$0.0022	21.9s	74%	85	84	79	79	77	81%
74	ByteDance Seed 1.6 Flash	$0.0011	23.2s	71%	90	80	78	77	77	80%
54	o4 Mini	$0.014	27.2s	78%	82	82	81	79	78	80%
81	DeepSeek-V2 Chat	$0.0015	48.1s	71%	87	84	80	79	72	80%
62	Qwen 3.5 Plus (2026-02-15)	$0.0058	31.6s	75%	84	83	80	80	74	80%
60	DeepSeek V4 Flash	$0.0005	24.0s	73%	86	83	80	77	74	80%
57	DeepSeek V4 Flash (Reasoning)	$0.0005	24.9s	74%	86	82	82	77	73	80%
42	Gemini 3.1 Flash Lite (Reasoning)	$0.0025	9.0s	76%	84	80	80	77	77	80%
88	DeepSeek V3.2	$0.0009	1.3m	73%	86	82	80	78	74	80%
94	Gemma 4 31B	$0.0008	1.7m	75%	84	79	79	78	78	80%
126	GPT-5.2	$0.047	1.4m	75%	84	80	79	77	77	79%
95	Claude Sonnet 5 (Reasoning, Low)	$0.025	35.5s	75%	84	83	83	75	73	79%
36	Gemini 3.1 Flash Lite (Preview)	$0.0023	6.9s	78%	81	81	80	78	77	79%
79	Xiaomi MIMO v2.5	$0.0046	28.2s	72%	87	80	79	76	75	79%
71	DeepSeek V3 (2025-03-24)	$0.0010	1.1m	77%	82	81	81	76	76	79%
48	GPT-4.1 Nano	$0.0006	10.8s	76%	82	81	80	78	74	79%
49	Ministral 3 14B	$0.0004	8.9s	75%	82	81	81	77	73	79%
86	Gemini 2.5 Flash	$0.0037	8.5s	69%	86	83	79	79	68	79%
70	Gemma 3 12B	$0.0003	41.6s	75%	82	81	79	77	75	79%
65	Cydonia 24B V4.1	$0.0009	33.7s	75%	81	80	79	78	75	79%
69	Ministral 8B	$0.0002	8.6s	72%	83	80	77	76	75	78%
102	GPT-5.4 Nano	$0.0061	1.9m	75%	81	80	78	76	76	78%
106	DeepSeek V3 (2024-12-26)	$0.0018	1.4m	70%	86	78	76	76	76	78%
59	Ministral 3 3B	$0.0002	3.3s	74%	82	79	78	77	74	78%
82	Qwen 3 32B	$0.0012	47.9s	74%	82	80	78	77	74	78%
77	Gemini 3.1 Flash Lite	$0.0027	17.6s	73%	82	81	78	77	72	78%
124	Claude Sonnet 4.6 (Reasoning)	$0.030	41.4s	69%	87	78	77	75	73	78%
80	GPT-4.1 Mini	$0.0024	22.4s	73%	83	78	77	75	75	78%
113	Hermes 3 70B	$0.0009	2.0m	72%	84	79	78	76	72	78%
109	Cohere Command R+ (Aug. 2024)	$0.016	45.2s	70%	83	82	77	76	71	78%
99	GPT-4o, Aug. 6th (temp=1)	$0.017	20.8s	73%	82	80	77	75	74	78%
98	Z.AI GLM 4.5	$0.0043	47.5s	72%	84	78	77	75	74	78%
97	Gemini 2.5 Flash Lite (Reasoning)	$0.0019	19.2s	69%	88	79	79	71	71	77%
78	Ministral 3 8B	$0.0004	8.5s	72%	83	79	77	74	74	77%
108	GPT-4o, Aug. 6th (temp=0)	$0.017	22.3s	70%	85	78	77	74	72	77%
87	Arcee AI: Trinity Mini	$0.0003	8.6s	71%	82	79	78	73	71	77%
73	Mistral NeMO	$0.0004	9.2s	75%	78	78	77	76	74	77%
114	Claude Sonnet 4	$0.024	36.4s	72%	80	78	77	76	72	76%
92	Ministral 3B	$0.0001	2.7s	70%	82	78	75	74	74	76%
138	Mistral Small 3.2 24B	$0.0042	6.3m	73%	80	79	78	74	71	76%
96	GPT-4o Mini (temp=1)	$0.0010	35.8s	72%	79	78	75	74	74	76%
115	Z.AI GLM 4.5 Air	$0.0021	59.3s	69%	81	80	75	73	72	76%
89	Gemma 3 4B	$0.0002	21.6s	73%	80	76	75	75	75	76%
132	ByteDance Seed 1.6	$0.014	2.7m	71%	80	76	75	73	73	76%
100	Hermes 3 405B	$0.0018	23.8s	71%	79	78	76	73	70	75%
122	Gemini 3 Flash (Preview)	$0.011	29.5s	68%	83	76	75	73	69	75%
121	Gemini 3 Flash (Preview, Reasoning)	$0.014	34.4s	70%	78	77	76	75	67	75%
137	ByteDance Seed 2.0 Mini	$0.0043	4.8m	68%	80	79	75	73	66	75%
123	Qwen 2.5 72B	$0.0008	45.0s	68%	80	76	73	72	70	74%
127	GPT-4o Mini (temp=0)	$0.0010	36.0s	67%	79	74	72	71	67	73%
129	Llama 3.1 70B	$0.0008	25.0s	65%	79	77	72	69	65	72%
131	GPT-5 Nano	$0.0037	1.3m	66%	77	74	71	70	67	72%
134	Nemotron 3 Super	$0.0000	2.4m	68%	75	74	73	68	65	71%
128	Inception Mercury 2	$0.0023	5.7s	68%	71	70	70	69	68	69%
135	GPT-OSS 120B	$0.0014	1.6m	65%	72	71	69	67	64	68%
136	Nemotron 3 Nano	$0.0007	40.2s	56%	82	69	65	64	61	68%
81.58%
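The Total column appears to track the average of the five numbered score columns. That relationship is inferred from the figures and is not stated on the page. A quick check on a few rows copied from the table, with `ROWS` and `total_gap` as illustrative names:

```python
from statistics import mean

# (model, five numbered scores, listed Total) copied from the table above.
ROWS = [
    ("GPT-5.4 (Reasoning)", [95, 95, 92, 92, 89], 93),
    ("GPT-5.4", [93, 93, 92, 91, 91], 92),
    ("Inception Mercury 2", [71, 70, 70, 69, 68], 69),
    ("Nemotron 3 Nano", [82, 69, 65, 64, 61], 68),
]


def total_gap(scores, total):
    """Absolute difference between the mean of the scores and the listed Total."""
    return abs(mean(scores) - total)


for model, scores, total in ROWS:
    print(f"{model}: mean={mean(scores):.1f}, listed={total}, gap={total_gap(scores, total):.1f}")
```

For these rows the gap stays below one point, which would be consistent with the displayed values being rounded before they are shown. Treat this as a plausibility check rather than a confirmed formula.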

Median	Evaluator	Top 3 (score, model)	Flop 3 (score, model)
100.0%	"Not X but Y" pattern overuse	100 Z.AI GLM 4.7; 100 Mistral Small 3.2 24B; 100 Aion 3.0	50 DeepSeek V3.2; 60 DeepSeek V4 Pro (Reasoning); 66 Claude Opus 4.6
50.3%	Adverb-first sentence starts	100 Claude Opus 4.7; 98 Writer: Palmyra X5; 97 Ministral 3 14B	0 Gemini 3.1 Pro (Preview); 0 GPT-OSS 120B; 0 Hermes 3 405B
100.0%	Adverbs in dialogue tags	100 Gemini 2.5 Flash Lite; 100 GPT-5.4 Mini; 100 GPT-4o Mini (temp=0)	60 Hermes 3 405B; 66 GPT-4.1 Nano; 67 DeepSeek V3 (2025-03-24)
94.5%	AI-ism adverb frequency	100 GPT-5.5; 100 GPT-5.5 (Reasoning, Low); 100 GPT-5.4 (Reasoning)	77 GPT-4.1 Nano; 77 Cydonia 24B V4.1; 82 Hermes 3 70B
100.0%	AI-ism character names	100 Gemma 4 26B (Reasoning); 100 GPT-5 Mini; 100 ByteDance Seed 2.0 Lite	92 Gemma 3 12B; 92 Gemma 3 4B; 96 MiniMax M2.7
100.0%	AI-ism location names	100 Qwen 3.6 27B; 100 Gemma 4 26B; 100 Qwen 3.6 35B	n/a
45.3%	AI-ism word frequency	86 GPT-5.4 (Reasoning); 82 GPT-5.5 (Reasoning); 82 Claude Sonnet 4.6	0 GPT-4o, Aug. 6th (temp=1); 0 Gemini 2.5 Flash Lite (Reasoning); 0 GPT-4o, Aug. 6th (temp=0)
100.0%	Cliché density	100 GPT-5.4 Mini (Reasoning); 100 Grok 4.20 (Reasoning); 100 Qwen 3.6 35B	20 Qwen 2.5 72B; 40 Mistral Small 3.2 24B; 47 GPT-4o, Aug. 6th (temp=0)
80.0%	Dialogue tag variety (said vs. fancy)	100 ByteDance Seed 2.0 Lite; 100 Claude Opus 4.6 (Reasoning); 100 Gemini 3.1 Flash Lite (Reasoning)	0 Hermes 3 70B; 0 GPT-OSS 120B; 0 GPT-4o Mini (temp=1)
31.6%	Em-dash & semicolon overuse	100 Qwen 3.6 27B; 100 Qwen 3.5 Flash; 100 Qwen 3.5 122B	0 GPT-OSS 120B; 0 GPT-4o, Aug. 6th (temp=1); 0 Claude Sonnet 4.6 (Reasoning)
100.0%	Emotion telling (show vs. tell)	100 GPT-4.1 Nano; 100 GPT-5.4; 100 Xiaomi MIMO v2.5 Pro	87 Llama 3.1 70B; 90 GPT-4.1 Mini; 92 Hermes 3 70B
96.2%	Filter word density	100 Qwen3 235B A22B Instruct 2507; 100 GPT-5.5 (Reasoning, Low); 100 GPT-5.4	21 Claude Sonnet 4; 42 GPT-OSS 120B; 43 Gemini 3.1 Flash Lite (Preview)
100.0%	Gibberish response detection	100 GPT-5.4 Nano; 100 Qwen 3.5 Flash; 100 Nemotron 3 Super	99 ByteDance Seed 1.6 Flash; 99 MiniMax M2.5; 100 Z.AI GLM 4.7
100.0%	Markdown formatting overuse	100 Mistral Small 4; 100 Z.AI GLM 5; 100 Qwen 3.6 Flash	60 Ministral 3B; 80 Ministral 3 8B; 82 Ministral 3 14B
100.0%	Missing dialogue indicators (quotation marks)	100 Claude Sonnet 5; 100 Llama 3.1 70B; 100 MoonshotAI: Kimi K2.6	80 Qwen 3.6 Flash; 88 GPT-4o, Aug. 6th (temp=1); 93 Aion 3.0
77.3%	Name drop frequency	100 Gemini 3.1 Flash Lite (Reasoning); 100 Gemini 3.1 Flash Lite; 100 Claude Opus 4.7 (Reasoning)	24 GPT-5.4 Nano; 26 GPT-5.2; 27 GPT-5.4 Mini
86.4%	Narrator intent-glossing	100 ByteDance Seed 2.0 Lite; 100 Qwen 3.6 27B; 100 GPT-5.5 (Reasoning, Low)	16 GPT-5 Nano; 27 Claude Sonnet 4; 31 Claude Opus 4.5
100.0%	Overuse of "that" (subordinate clause padding)	100 Qwen 3.5 9B; 100 DeepSeek V4 Flash; 100 GPT-OSS 120B	76 ByteDance Seed 2.0 Lite; 79 Llama 3.1 70B; 80 Gemini 2.5 Flash
100.0%	Paragraph length variance	100 MoonshotAI: Kimi K2.6; 100 MoonshotAI: Kimi K2.5; 100 Claude Opus 4.5	47 Mistral Small 3.2 24B; 52 Hermes 3 405B; 52 Gemini 2.5 Flash Lite (Reasoning)
98.2%	Passive voice overuse	100 GPT-5.4 (Reasoning); 100 Nemotron 3 Nano; 100 GPT-5.4 Mini (Reasoning, Low)	90 Claude Sonnet 4.6; 90 ByteDance Seed 1.6; 91 Z.AI GLM 5.2 (Reasoning, High)
97.5%	Past progressive (was/were + -ing) overuse	100 GPT-5 Mini; 100 GPT-4.1; 100 MiniMax M2.5	52 Gemini 3 Flash (Preview); 53 Gemini 3 Flash (Preview, Reasoning); 54 Z.AI GLM 5.1
88.2%	Pronoun-first sentence starts	100 Mistral Medium 3.1; 100 Z.AI GLM 5.2 (Reasoning, High); 100 Mistral Large 2	33 Mistral NeMO; 39 Qwen 3.5 35B; 41 ByteDance Seed 1.6
98.6%	Purple prose (modifier overload)	100 Claude Opus 4.8 (Reasoning); 100 GPT-5.4 Mini (Reasoning, Low); 100 Llama 3.1 70B	88 Gemini 3.5 Flash (Reasoning); 92 Gemini 3.1 Pro (Preview); 93 Gemini 3 Flash (Preview)
100.0%	Repeated phrase echo	100 ByteDance Seed 2.0 Mini; 100 DeepSeek V3.2; 100 Gemini 3.1 Flash Lite (Preview)	n/a
100.0%	Sentence length variance	100 Claude Sonnet 4.6; 100 Cohere Command R+ (Aug. 2024); 100 Gemini 3.5 Flash (Reasoning)	94 Nemotron 3 Nano; 94 Nemotron 3 Super; 95 Hermes 3 405B
52.1%	Sentence opener variety	88 GPT-4o, Aug. 6th (temp=1); 83 Claude Sonnet 5; 82 Claude Opus 4	32 Mistral Small 3.2 24B; 33 Gemma 4 26B (Reasoning); 35 GPT-5 Nano
31.9%	Subject-first sentence starts	89 Qwen3 235B A22B Instruct 2507; 88 Writer: Palmyra X5; 88 Z.AI GLM 5	0 GPT-OSS 120B; 0 Inception Mercury 2; 1 Qwen 3.5 122B
28.4%	Subordinate conjunction sentence starts	93 Cydonia 24B V4.1; 88 Z.AI GLM 4.7 Flash; 73 Gemini 3.5 Flash (Reasoning)	0 Mistral Large 2; 0 Mistral Small 4; 0 DeepSeek V4 Pro
81.9%	Technical jargon density	100 GPT-5.4 (Reasoning); 100 Qwen 3.5 122B; 100 GPT-5.4 Mini (Reasoning)	6 GPT-5 Nano; 15 Claude Sonnet 5 (Reasoning); 15 ByteDance Seed 1.6
69.4%	Useless dialogue additions	100 Gemini 3.1 Flash Lite (Preview); 100 Grok 4.20 (Reasoning); 100 Z.AI GLM 5.2 (Reasoning, High)	0 Inception Mercury 2; 0 Qwen 2.5 72B; 0 Arcee AI: Trinity Mini
