Horror: alone in an eerie place at night

Bad Writing Habits

Detects common prose-quality anti-patterns in AI-generated creative writing, including passive voice, overuse of the past progressive, weak dialogue tags, filter words, purple prose, clichés, and AI-typical words, adverbs, and names, among others.
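
As an illustration of what such a detector might look like, here is a minimal sketch covering two of the habits named above: filter words and past progressive constructions. The word list, the regular expression, and the function name `habit_rates` are assumptions made for this example; the benchmark's actual evaluators and scoring rules are not described on this page.

```python
import re

# Hypothetical filter-word list; the benchmark's real list is not given in the source.
FILTER_WORDS = {"saw", "heard", "felt", "noticed", "realized", "seemed", "watched", "wondered"}

# Rough heuristic: "was"/"were" followed by a word ending in "-ing".
# It will also match false positives such as "was nothing".
PAST_PROGRESSIVE = re.compile(r"\b(?:was|were)\s+\w+ing\b", re.IGNORECASE)
WORD = re.compile(r"[A-Za-z']+")


def habit_rates(text):
    """Return occurrences per 100 words for two prose habits."""
    words = [w.lower() for w in WORD.findall(text)]
    total = len(words)
    if total == 0:
        return {"filter_words": 0.0, "past_progressive": 0.0}
    filter_hits = sum(1 for w in words if w in FILTER_WORDS)
    progressive_hits = len(PAST_PROGRESSIVE.findall(text))
    return {
        "filter_words": 100.0 * filter_hits / total,
        "past_progressive": 100.0 * progressive_hits / total,
    }


if __name__ == "__main__":
    sample = "She was walking home. She saw a light and felt cold."
    print(habit_rates(sample))
```

A real evaluator would presumably map such rates onto a 0 to 100 score with some threshold; that mapping is not specified here, so the sketch stops at raw rates.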

Creative Writing Hallucination

Performance Score Distribution (Top 20)

Model	Score
GPT-5.4 (Reasoning)	93%
GPT-5.5 (Reasoning)	92%
Z.AI GLM 5.1	91%
GPT-5.5	91%
Grok 4.5 (Reasoning, Low)	91%
GPT-5.5 (Reasoning, Low)	91%
Claude Sonnet 4.6 (Reasoning)	90%
GPT-5.4 (Reasoning, Low)	90%
GPT-5.4	90%
GPT-5.4 Mini (Reasoning)	90%
Grok 4.5 (Reasoning, High)	90%
Claude Sonnet 5 (Reasoning, Low)	89%
GPT-5.1	89%
Claude Opus 4.7 (Reasoning)	89%
GPT-5	89%
Claude Sonnet 4.5	89%
Claude Opus 4.8 (Reasoning, Low)	89%
Aion 3.0	89%
Claude Sonnet 4.6	88%
Claude Sonnet 5	88%

Price-Performance Score Distribution (Top 20)

Model	Score	Cost	Time
GPT-5.4 Mini (Reasoning)	90%	$0.017	22.1s
Z.AI GLM 5 Turbo	88%	$0.0072	27.5s
Grok 4.5 (Reasoning, Low)	91%	$0.017	44.0s
Z.AI GLM 5.1	91%	$0.014	1.2m
GPT-5.4 Nano (Reasoning, Low)	85%	$0.0045	19.4s
DeepSeek V4 Flash (Reasoning)	86%	$0.0006	28.2s
GPT-5.4 Mini (Reasoning, Low)	88%	$0.013	14.9s
Z.AI GLM 5	88%	$0.0075	50.7s
Qwen3 235B A22B Instruct 2507	87%	$0.0015	48.5s
GPT-5.4 Mini	86%	$0.014	15.8s
Aion 3.0 Mini	88%	$0.0062	1.1m
Aion 3.0	89%	$0.031	1.0m
Grok 4.5 (Reasoning, High)	90%	$0.030	1.4m
GPT-5.4	90%	$0.039	1.1m
Writer: Palmyra X5	85%	$0.012	19.1s
Claude Sonnet 5 (Reasoning, Low)	89%	$0.038	41.0s
DeepSeek V4 Pro	88%	$0.0053	55.2s
Claude Sonnet 4.5	89%	$0.040	37.3s
Claude Sonnet 5	88%	$0.033	32.8s
Mistral Large 3	83%	$0.0037	29.1s

Most Stable Models (Top 20)

Model	Score	Consistency	Stability
GPT-5.4 (Reasoning)	93%	98%	91%
Z.AI GLM 5.1	91%	99%	90%
GPT-5.5 (Reasoning)	92%	97%	89%
GPT-5.5	91%	97%	88%
GPT-5.4	90%	97%	87%
Grok 4.5 (Reasoning, High)	90%	96%	87%
Grok 4.5 (Reasoning, Low)	91%	95%	87%
Claude Sonnet 5 (Reasoning, Low)	89%	97%	87%
GPT-5.4 Mini (Reasoning)	90%	95%	87%
GPT-5.5 (Reasoning, Low)	91%	95%	87%
GPT-5.4 (Reasoning, Low)	90%	95%	86%
Claude Opus 4.7 (Reasoning)	89%	97%	86%
MoonshotAI: Kimi K2.6	88%	97%	86%
Z.AI GLM 5 Turbo	88%	95%	86%
Claude Opus 4.6 (Reasoning)	88%	97%	86%
Claude Sonnet 4.6 (Reasoning)	90%	95%	85%
GPT-5	89%	96%	85%
DeepSeek V4 Pro (Reasoning)	87%	97%	85%
Qwen 3.5 Plus (2026-04-20)	86%	97%	84%
Aion 3.0	89%	94%	84%

Top Overall Models (Top 20)

Model	Score	Cost	Time	Stability
Z.AI GLM 5.1	91%	$0.014	1.2m	90%
Grok 4.5 (Reasoning, Low)	91%	$0.017	44.0s	87%
GPT-5.4 Mini (Reasoning)	90%	$0.017	22.1s	87%
Z.AI GLM 5 Turbo	88%	$0.0072	27.5s	86%
GPT-5.4 Mini (Reasoning, Low)	88%	$0.013	14.9s	84%
Claude Sonnet 5 (Reasoning, Low)	89%	$0.038	41.0s	87%
Grok 4.5 (Reasoning, High)	90%	$0.030	1.4m	87%
GPT-5.4	90%	$0.039	1.1m	87%
Aion 3.0 Mini	88%	$0.0062	1.1m	84%
DeepSeek V4 Pro	88%	$0.0053	55.2s	84%
GPT-5.4 (Reasoning)	93%	$0.075	2.1m	91%
GPT-5.4 (Reasoning, Low)	90%	$0.048	1.1m	86%
Aion 3.0	89%	$0.031	1.0m	84%
DeepSeek V4 Flash (Reasoning)	86%	$0.0006	28.2s	81%
Claude Sonnet 4.5	89%	$0.040	37.3s	83%
Qwen 3.5 397B A17B	87%	$0.012	1.3m	83%
DeepSeek V4 Pro (Reasoning)	87%	$0.0092	2.0m	85%
GPT-5.4 Mini	86%	$0.014	15.8s	81%
Claude Sonnet 4.6	88%	$0.034	35.1s	82%
Z.AI GLM 5.2 (Reasoning, High)	87%	$0.0093	50.3s	82%

Rank	Model	Avg. Cost	Avg. Time	Stability	#1	#2	#3	#4	#5	Total
11	GPT-5.4 (Reasoning)	$0.075	2.1m	91%	94	93	93	92	92	93%
39	GPT-5.5 (Reasoning)	$0.123	1.5m	89%	94	93	92	91	89	92%
1	Z.AI GLM 5.1	$0.014	1.2m	90%	92	92	91	91	91	91%
48	GPT-5.5	$0.116	1.5m	88%	94	92	91	90	89	91%
2	Grok 4.5 (Reasoning, Low)	$0.017	44.0s	87%	94	92	91	90	87	91%
59	GPT-5.5 (Reasoning, Low)	$0.120	1.5m	87%	94	92	91	89	88	91%
29	Claude Sonnet 4.6 (Reasoning)	$0.074	1.2m	85%	95	92	90	88	87	90%
12	GPT-5.4 (Reasoning, Low)	$0.048	1.1m	86%	92	92	91	89	86	90%
8	GPT-5.4	$0.039	1.1m	87%	92	91	89	89	89	90%
3	GPT-5.4 Mini (Reasoning)	$0.017	22.1s	87%	92	92	91	88	86	90%
7	Grok 4.5 (Reasoning, High)	$0.030	1.4m	87%	92	91	91	89	86	90%
6	Claude Sonnet 5 (Reasoning, Low)	$0.038	41.0s	87%	91	91	89	89	87	89%
36	GPT-5.1	$0.046	2.1m	84%	94	92	89	88	85	89%
28	Claude Opus 4.7 (Reasoning)	$0.086	29.8s	86%	92	90	89	88	88	89%
72	GPT-5	$0.061	3.9m	85%	92	91	89	88	87	89%
15	Claude Sonnet 4.5	$0.040	37.3s	83%	92	92	89	89	83	89%
52	Claude Opus 4.8 (Reasoning, Low)	$0.082	39.3s	82%	94	90	87	87	86	89%
13	Aion 3.0	$0.031	1.0m	84%	91	90	90	89	83	89%
19	Claude Sonnet 4.6	$0.034	35.1s	82%	94	89	87	86	85	88%
21	Claude Sonnet 5	$0.033	32.8s	82%	92	92	88	85	84	88%
9	Aion 3.0 Mini	$0.0062	1.1m	84%	92	90	88	86	85	88%
5	GPT-5.4 Mini (Reasoning, Low)	$0.013	14.9s	84%	92	89	87	87	87	88%
25	Z.AI GLM 5	$0.0075	50.7s	79%	93	93	90	88	77	88%
4	Z.AI GLM 5 Turbo	$0.0072	27.5s	86%	90	90	90	86	85	88%
109	MoonshotAI: Kimi K2.6	$0.063	6.5m	86%	90	89	88	87	86	88%
56	Claude Opus 4.8 (Reasoning)	$0.083	39.0s	82%	93	90	89	86	81	88%
53	Claude Opus 4.6 (Reasoning)	$0.086	1.2m	86%	90	89	88	87	86	88%
10	DeepSeek V4 Pro	$0.0053	55.2s	84%	91	90	89	85	84	88%
37	Claude Opus 4.5	$0.067	41.5s	84%	91	88	88	87	83	87%
17	DeepSeek V4 Pro (Reasoning)	$0.0092	2.0m	85%	89	89	88	86	85	87%
50	Claude Opus 4.7	$0.082	30.0s	83%	90	90	87	87	83	87%
79	Qwen3.6 Max Preview	$0.051	3.4m	82%	91	88	87	86	83	87%
16	Qwen 3.5 397B A17B	$0.012	1.3m	83%	89	89	86	86	85	87%
51	Claude Sonnet 4	$0.037	42.9s	78%	95	90	87	83	80	87%
24	MiniMax M2.7	$0.0029	1.0m	80%	92	89	86	84	83	87%
22	Qwen3 235B A22B Instruct 2507	$0.0015	48.5s	80%	93	88	86	85	82	87%
20	Z.AI GLM 5.2 (Reasoning, High)	$0.0093	50.3s	82%	90	89	87	86	82	87%
133	Claude Opus 4	$0.254	1.5m	83%	90	88	87	85	84	87%
32	Qwen 3.5 Plus (2026-04-20)	$0.020	2.0m	84%	88	88	87	85	84	86%
113	Gemini 3.1 Pro (Preview)	$0.146	2.4m	84%	88	87	87	86	84	86%
60	Qwen 3.6 27B	$0.023	2.4m	81%	91	88	87	87	80	86%
73	MiniMax M3	$0.0065	2.8m	76%	94	90	85	84	78	86%
95	Claude Opus 4.6	$0.082	1.1m	75%	96	88	87	82	77	86%
18	GPT-5.4 Mini	$0.014	15.8s	81%	90	89	87	85	80	86%
34	Claude Sonnet 5 (Reasoning)	$0.038	40.3s	82%	89	88	86	85	83	86%
14	DeepSeek V4 Flash (Reasoning)	$0.0006	28.2s	81%	90	89	87	84	80	86%
27	Writer: Palmyra X5	$0.012	19.1s	80%	90	87	86	81	81	85%
69	Grok 4.20 (Reasoning)	$0.020	1.6m	77%	92	86	83	82	82	85%
31	Qwen 3.6 Flash	$0.012	48.3s	81%	88	88	86	84	81	85%
41	Gemini 2.5 Pro	$0.035	31.3s	81%	88	88	85	83	82	85%
40	Qwen 3.6 35B	$0.012	1.1m	80%	89	86	84	84	82	85%
26	GPT-5.4 Nano (Reasoning, Low)	$0.0045	19.4s	80%	89	88	87	81	78	85%
64	DeepSeek V4 Flash	$0.0008	2.1m	78%	90	87	84	82	80	85%
30	Claude Haiku 4.5	$0.014	22.8s	81%	87	86	84	83	82	84%
44	Grok 4.3	$0.0090	34.9s	78%	91	85	84	81	81	84%
47	Grok 4.3 (Reasoning)	$0.015	1.4m	81%	87	85	84	83	82	84%
87	Qwen3.7 Max	$0.072	2.1m	82%	86	85	85	85	81	84%
97	ByteDance Seed 2.0 Mini	$0.0045	4.9m	80%	88	84	83	83	82	84%
33	Grok 4.20	$0.011	41.9s	81%	86	85	85	83	80	84%
23	GPT-5.4 Nano	$0.0047	16.7s	82%	86	84	84	83	82	84%
92	MoonshotAI: Kimi K2.5	$0.012	4.3m	80%	86	86	86	82	79	84%
38	Mistral Large 3	$0.0037	29.1s	78%	88	85	85	83	77	83%
63	WizardLM 2 8x22b	$0.0038	2.1m	80%	87	84	84	81	81	83%
42	GPT-5.4 Nano (Reasoning)	$0.0047	21.6s	78%	88	83	82	82	80	83%
57	DeepSeek V3 (2025-03-24)	$0.0014	20.7s	74%	90	87	83	81	74	83%
35	Mistral Small 4 (Reasoning)	$0.0022	28.7s	79%	86	85	84	82	78	83%
49	Xiaomi MIMO v2.5 Pro	$0.0090	51.5s	80%	86	85	85	80	78	83%
68	Aion 2.0	$0.0073	1.1m	77%	86	86	83	83	75	83%
67	DeepSeek V3.2	$0.0019	1.4m	77%	87	85	82	81	79	83%
43	Qwen 3.5 Flash	$0.0024	49.3s	80%	86	84	84	80	80	83%
83	GPT-5.2	$0.044	1.2m	78%	87	86	84	78	78	83%
54	Qwen 3.5 9B	$0.0011	1.3m	80%	84	84	83	82	80	82%
88	Gemini 3.5 Flash (Reasoning)	$0.083	42.4s	79%	86	84	83	81	79	82%
61	Mistral Large 2	$0.015	25.6s	77%	86	84	84	82	75	82%
66	Qwen 3.5 35B	$0.013	42.6s	77%	87	82	81	80	80	82%
46	Mistral Medium 3.1	$0.0049	39.2s	80%	84	84	83	81	79	82%
55	Qwen 3 32B	$0.0016	54.7s	79%	84	83	82	81	80	82%
62	Qwen 3.5 27B	$0.011	50.3s	79%	84	83	83	82	77	82%
65	o4 Mini	$0.014	24.6s	77%	84	84	82	78	78	81%
71	Hermes 3 405B	$0.0050	26.4s	75%	86	86	86	80	69	81%
45	Ministral 3 3B	$0.0005	2.9s	78%	83	83	81	80	79	81%
75	Qwen 3.5 122B	$0.014	33.9s	75%	87	81	80	80	77	81%
58	Mistral Small 4	$0.0013	11.2s	77%	85	81	81	79	77	81%
78	o4 Mini High	$0.024	41.4s	77%	84	81	80	79	78	81%
80	GPT-4o, Aug. 6th (temp=1)	$0.021	30.9s	75%	84	84	79	78	77	81%
70	Qwen 3.5 Plus (2026-02-15)	$0.0076	36.0s	77%	83	83	81	80	76	80%
85	GPT-4.1	$0.018	49.1s	75%	85	82	80	78	75	80%
100	MiniMax M2.5	$0.0035	2.0m	73%	85	84	79	77	77	80%
101	Z.AI GLM 4.7	$0.0085	1.3m	71%	88	82	79	76	75	80%
77	Gemma 3 27B	$0.0009	41.6s	74%	84	83	79	76	76	80%
91	Z.AI GLM 4.5	$0.0039	23.4s	70%	88	85	81	78	67	80%
93	ByteDance Seed 2.0 Lite	$0.011	1.9m	76%	82	81	81	80	74	80%
76	Z.AI GLM 4.6	$0.0049	28.0s	75%	83	81	81	79	73	79%
74	Gemini 3.5 Flash (Reasoning, Minimal)	$0.013	8.6s	76%	83	80	79	78	76	79%
81	ByteDance Seed 1.6 Flash	$0.0013	26.4s	74%	83	80	78	76	76	79%
84	Z.AI GLM 4.5 Air	$0.0027	46.6s	75%	82	79	78	77	77	78%
99	DeepSeek V3 (2024-12-26)	$0.0026	37.2s	70%	87	79	79	73	73	78%
96	GPT-5 Mini	$0.0093	1.1m	74%	81	81	80	75	72	78%
82	GPT-4.1 Mini	$0.0030	18.5s	74%	81	79	78	78	74	78%
89	Gemma 3 12B	$0.0008	47.9s	73%	80	80	79	79	71	78%
90	Gemini 2.5 Flash (Reasoning)	$0.010	19.0s	73%	83	79	78	76	73	78%
117	Gemma 4 31B (Reasoning)	$0.0017	2.8m	74%	82	78	77	76	75	78%
103	Z.AI GLM 4.7 Flash	$0.0016	1.1m	71%	82	82	77	75	72	78%
104	Gemma 4 31B	$0.0013	1.7m	73%	81	78	77	76	73	77%
106	DeepSeek V3.1	$0.0022	1.7m	73%	80	80	78	77	71	77%
86	GPT-4o Mini (temp=1)	$0.0015	38.2s	75%	79	78	78	75	75	77%
105	Hermes 3 70B	$0.0016	1.5m	73%	81	80	80	73	70	77%
112	Cohere Command R+ (Aug. 2024)	$0.023	56.0s	73%	80	79	78	75	70	77%
94	Gemini 2.5 Flash	$0.0049	9.8s	72%	80	77	77	76	72	76%
116	DeepSeek-V2 Chat	$0.0025	45.9s	69%	83	80	75	73	71	76%
108	Ministral 8B	$0.0005	8.6s	68%	82	81	76	76	66	76%
123	ByteDance Seed 1.6	$0.0096	1.8m	69%	82	77	74	74	74	76%
111	Ministral 3 14B	$0.0011	8.3s	68%	82	80	74	73	71	76%
125	Gemma 4 26B (Reasoning)	$0.0015	1.6m	67%	84	78	74	72	70	76%
98	Ministral 3B	$0.0002	2.5s	71%	79	79	75	74	72	76%
119	Arcee AI: Trinity Mini	$0.0004	9.4s	65%	86	76	74	72	70	76%
102	Gemini 3 Flash (Preview)	$0.0070	20.2s	72%	78	76	74	74	74	75%
118	Ministral 3 8B	$0.0008	6.2s	66%	82	80	75	75	65	75%
107	Xiaomi MIMO v2.5	$0.0049	27.2s	71%	81	78	77	71	70	75%
115	Gemma 4 26B	$0.0011	59.7s	71%	80	77	76	73	70	75%
114	Gemini 3 Flash (Preview, Reasoning)	$0.011	25.7s	71%	78	77	77	75	67	75%
124	Cydonia 24B V4.1	$0.0017	47.4s	66%	84	77	75	70	66	74%
110	Mistral NeMO	$0.0008	7.6s	70%	79	77	76	70	68	74%
121	Gemini 2.5 Flash Lite (Reasoning)	$0.0025	21.0s	68%	77	76	72	71	69	73%
122	GPT-4.1 Nano	$0.0008	14.9s	67%	79	73	73	69	67	72%
126	GPT-4o Mini (temp=0)	$0.0013	40.4s	67%	76	74	71	71	68	72%
129	Llama 3.1 70B	$0.0026	48.5s	65%	81	75	75	65	63	72%
120	Gemini 2.5 Flash Lite	$0.0010	7.2s	69%	73	73	71	70	70	72%
127	Gemma 3 4B	$0.0003	21.7s	65%	79	72	71	69	66	71%
135	GPT-4o, Aug. 6th (temp=0)	$0.050	40.8s	66%	78	74	74	68	63	71%
130	Qwen 2.5 72B	$0.0013	35.7s	66%	74	73	72	67	63	70%
128	Gemini 3.1 Flash Lite (Reasoning)	$0.0036	9.7s	66%	72	71	69	69	66	69%
131	Gemini 3.1 Flash Lite (Preview)	$0.0036	8.3s	65%	75	72	72	64	64	69%
132	Nemotron 3 Super	$0.0000	51.4s	65%	75	69	69	67	66	69%
134	Gemini 3.1 Flash Lite	$0.0036	21.4s	65%	69	69	67	66	64	67%
140	Mistral Small 3.2 24B	$0.012	10.0m	63%	70	69	67	67	61	67%
136	GPT-OSS 120B	$0.0014	50.5s	62%	71	68	66	65	63	67%
137	GPT-5 Nano	$0.0043	1.5m	62%	72	67	66	65	62	66%
138	Inception Mercury 2	$0.0026	5.5s	59%	69	65	62	61	61	64%
139	Nemotron 3 Nano	$0.0010	38.8s	55%	67	65	61	58	56	61%
81.51%
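
The Total column appears to be the mean of the five numbered score columns, rounded to a whole percent. This is an inference from spot-checking several rows, not something the page states, and the function name `total_score` is hypothetical. A minimal sketch under that assumption:

```python
def total_score(scores):
    """Mean of the listed scores, rounded to a whole percent.

    Assumption: Total = rounded mean of columns #1 to #5. The page
    does not document this; it merely matches the rows checked below.
    """
    return round(sum(scores) / len(scores))


if __name__ == "__main__":
    # Rows copied from the table above.
    print(total_score([94, 93, 93, 92, 92]))  # GPT-5.4 (Reasoning)
    print(total_score([96, 88, 87, 82, 77]))  # Claude Opus 4.6
    print(total_score([67, 65, 61, 58, 56]))  # Nemotron 3 Nano
```

Because the displayed scores are themselves rounded, a row could differ from its listed Total by a point even if the assumption holds.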

Median	Evaluator	Top 3 (score, model)	Flop 3 (score, model)
80.0%	"Not X but Y" pattern overuse	100 Qwen3.7 Max; 100 Mistral Large 3; 100 Claude Sonnet 5 (Reasoning, Low)	0 GPT-5 Nano; 0 Gemini 2.5 Flash Lite; 16 Gemini 2.5 Flash Lite (Reasoning)
81.1%	Adverb-first sentence starts	100 GPT-5.4 Mini (Reasoning, Low); 100 Mistral Large 3; 100 Claude Opus 4.5	7 Nemotron 3 Nano; 10 ByteDance Seed 2.0 Lite; 16 Mistral Small 3.2 24B
100.0%	Adverbs in dialogue tags	100 Aion 3.0 Mini; 100 DeepSeek V4 Pro (Reasoning); 100 Qwen 3.5 35B	33 GPT-4o Mini (temp=1); 47 Cydonia 24B V4.1; 53 Claude Sonnet 4
91.8%	AI-ism adverb frequency	100 o4 Mini High; 99 Qwen 3.5 35B; 99 ByteDance Seed 2.0 Lite	71 Cydonia 24B V4.1; 72 GPT-4.1 Nano; 75 Z.AI GLM 4.5
100.0%	AI-ism character names	100 Mistral Large 2; 100 Gemini 3.1 Pro (Preview); 100 Grok 4.20 (Reasoning)	88 DeepSeek V4 Pro; 92 Claude Opus 4; 96 Z.AI GLM 5
100.0%	AI-ism location names	100 Qwen 3.5 Plus (2026-04-20); 100 Qwen 3.5 Flash; 100 Qwen 3.5 27B	—
32.4%	AI-ism word frequency	76 ByteDance Seed 2.0 Mini; 72 GPT-5.5 (Reasoning); 70 GPT-5	0 Arcee AI: Trinity Mini; 0 Inception Mercury 2; 0 Gemma 4 26B (Reasoning)
100.0%	Cliché density	100 Ministral 3 14B; 100 Gemma 4 26B (Reasoning); 100 Claude Sonnet 4.6 (Reasoning)	47 Mistral Small 3.2 24B; 60 Qwen 2.5 72B; 67 GPT-4.1 Nano
50.9%	Dialogue tag variety (said vs. fancy)	100 ByteDance Seed 2.0 Mini; 100 Gemini 2.5 Pro; 100 Qwen3 235B A22B Instruct 2507	0 Z.AI GLM 4.5 Air; 0 o4 Mini High; 0 Gemini 2.5 Flash
91.5%	Em-dash & semicolon overuse	100 GPT-5.4 Nano (Reasoning, Low); 100 Qwen 3.5 9B; 100 Claude Opus 4.5	0 Mistral Large 3; 0 GPT-4.1 Mini; 0 Mistral Large 2
100.0%	Emotion telling (show vs. tell)	100 Claude Opus 4.8 (Reasoning); 100 Claude Opus 4.6 (Reasoning); 100 Z.AI GLM 5	77 Mistral Small 3.2 24B; 89 Mistral NeMO; 90 Llama 3.1 70B
98.4%	Filter word density	100 Claude Opus 4.7 (Reasoning); 100 Qwen 3.5 Plus (2026-04-20); 100 GPT-5.4 (Reasoning, Low)	11 Gemini 2.5 Flash Lite (Reasoning); 15 Inception Mercury 2; 17 Nemotron 3 Nano
100.0%	Gibberish response detection	100 DeepSeek V4 Flash; 100 Claude Sonnet 4.6 (Reasoning); 100 Inception Mercury 2	80 Llama 3.1 70B; 88 DeepSeek V3 (2025-03-24); 92 Cydonia 24B V4.1
100.0%	Markdown formatting overuse	100 Z.AI GLM 4.7 Flash; 100 GPT-5 Mini; 100 Qwen 3.5 397B A17B	51 ByteDance Seed 1.6 Flash; 67 Mistral Medium 3.1; 80 Aion 2.0
100.0%	Missing dialogue indicators (quotation marks)	100 WizardLM 2 8x22b; 100 Aion 2.0; 100 Grok 4.5 (Reasoning, Low)	80 Qwen 3.5 35B; 80 GPT-5; 83 Z.AI GLM 4.6
93.3%	Name drop frequency	100 Gemini 3.1 Flash Lite; 100 GPT-5; 100 Claude Sonnet 5	51 Qwen 3.5 9B; 51 Hermes 3 405B; 55 Qwen 3.5 27B
76.7%	Narrator intent-glossing	100 DeepSeek V4 Pro (Reasoning); 100 o4 Mini High; 100 Gemini 3.1 Pro (Preview)	0 Nemotron 3 Nano; 3 Inception Mercury 2; 5 GPT-5 Nano
100.0%	Overuse of "that" (subordinate clause padding)	100 GPT-5.5 (Reasoning); 100 Qwen3.7 Max; 100 GPT-5.4 Mini (Reasoning, Low)	40 ByteDance Seed 2.0 Lite; 77 Hermes 3 70B; 80 Mistral Small 3.2 24B
100.0%	Paragraph length variance	100 Claude Opus 4; 100 Mistral Small 4 (Reasoning); 100 GPT-4.1	5 Grok 4.3 (Reasoning); 32 Nemotron 3 Nano; 43 GPT-4o, Aug. 6th (temp=0)
99.3%	Passive voice overuse	100 GPT-5.4 (Reasoning, Low); 100 Qwen3.6 Max Preview; 100 Gemini 3.1 Pro (Preview)	90 ByteDance Seed 2.0 Mini; 91 ByteDance Seed 2.0 Lite; 92 Cydonia 24B V4.1
96.4%	Past progressive (was/were + -ing) overuse	100 GPT-5.5 (Reasoning); 100 Inception Mercury 2; 100 GPT-5.4 Mini (Reasoning, Low)	33 Z.AI GLM 4.7 Flash; 34 Ministral 3 8B; 36 Gemini 3.1 Flash Lite (Reasoning)
92.0%	Pronoun-first sentence starts	100 Claude Opus 4.8 (Reasoning, Low); 100 GPT-4.1 Mini; 100 DeepSeek V3 (2025-03-24)	14 Gemini 3.1 Flash Lite (Reasoning); 16 Gemini 3.1 Flash Lite; 16 Gemini 3.1 Flash Lite (Preview)
96.3%	Purple prose (modifier overload)	100 ByteDance Seed 2.0 Lite; 100 ByteDance Seed 1.6 Flash; 100 Gemma 4 31B	68 Cydonia 24B V4.1; 79 Gemini 3 Flash (Preview, Reasoning); 81 Gemini 2.5 Flash (Reasoning)
100.0%	Repeated phrase echo	100 GPT-4o, Aug. 6th (temp=1); 100 o4 Mini High; 100 Qwen 3.5 Plus (2026-04-20)	—
100.0%	Sentence length variance	100 DeepSeek V3 (2024-12-26); 100 DeepSeek V3.2; 100 Z.AI GLM 5	66 Nemotron 3 Nano; 79 GPT-4o, Aug. 6th (temp=0); 90 Inception Mercury 2
47.5%	Sentence opener variety	87 DeepSeek V3 (2025-03-24); 83 GPT-4o Mini (temp=1); 81 Cydonia 24B V4.1	25 GPT-5 Nano; 30 Qwen 3.5 35B; 31 Gemma 4 26B
52.7%	Subject-first sentence starts	100 Qwen3 235B A22B Instruct 2507; 100 Writer: Palmyra X5; 97 Gemma 3 27B	0 Inception Mercury 2; 0 Qwen 3.5 9B; 0 GPT-OSS 120B
35.6%	Subordinate conjunction sentence starts	88 Gemini 2.5 Flash Lite; 80 Claude Opus 4.7 (Reasoning); 77 Cydonia 24B V4.1	0 Qwen3.6 Max Preview; 0 Ministral 3B; 0 Qwen 3.5 122B
66.2%	Technical jargon density	100 Gemini 3.1 Pro (Preview); 100 Qwen3.6 Max Preview; 100 Qwen3.7 Max	0 GPT-5 Nano; 0 Nemotron 3 Nano; 2 Gemini 2.5 Flash Lite (Reasoning)
74.9%	Useless dialogue additions	100 GPT-5.4 Mini (Reasoning); 100 Aion 3.0 Mini; 100 Gemini 2.5 Flash Lite (Reasoning)	0 GPT-4o Mini (temp=0); 0 Gemma 3 12B; 0 Mistral Small 3.2 24B
