Horror: alone in an eerie place at night

Bad Writing Habits

Detects common prose-quality anti-patterns in AI-generated creative writing, such as passive voice, past progressive overuse, weak dialogue tags, filter words, purple prose, clichés, and AI-ism words, adverbs, and character or location names.
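
As an illustration of the kind of check listed above, the sketch below counts filter words per 100 words of text. The word list, the function name `filter_word_density`, and the per-100-words normalization are all assumptions made for this example; this page does not describe how the benchmark's evaluators are implemented or scored.

```python
import re

# Hypothetical filter-word list, chosen for illustration only.
# The benchmark's real list is not given on this page.
FILTER_WORDS = {
    "saw", "heard", "felt", "noticed",
    "realized", "seemed", "watched", "wondered",
}

def filter_word_density(text: str) -> float:
    """Return the number of filter words per 100 words (0.0 for empty text)."""
    # Lowercase the text and split it into simple word tokens.
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in FILTER_WORDS)
    return 100.0 * hits / len(words)

sample = "She felt the cold. She noticed a door. The hinges creaked."
print(round(filter_word_density(sample), 1))  # 2 hits in 11 words -> 18.2
```

A real evaluator would presumably map such a density onto the 0 to 100 scale shown in the tables; that mapping is not specified here, so the sketch stops at the raw density.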

Performance Score Distribution (Top 20)

Model	Score
GPT-5.4 (Reasoning)	92%
GPT-5.5 (Reasoning)	91%
GPT-5.5 (Reasoning, Low)	91%
GPT-5.4 (Reasoning, Low)	90%
GPT-5.5	90%
GPT-5.4	90%
Qwen 3.5 Plus (2026-04-20)	90%
Qwen3.6 Max Preview	89%
GPT-5.1	88%
Gemini 3.1 Pro (Preview)	87%
MiniMax M3	86%
Qwen 3.5 397B A17B	86%
Qwen 3.6 Flash	86%
GPT-5	86%
GPT-5.4 Mini (Reasoning)	86%
Grok 4.3 (Reasoning)	86%
GPT-5.4 Mini (Reasoning, Low)	86%
Grok 4.20	86%
Grok 4.20 (Reasoning)	86%
Claude Sonnet 4.5	86%

Price-Performance Score Distribution (Top 20)

Model	Score	Cost	Time
Qwen 3.5 Plus (2026-04-20)	90%	$0.028	2.9m
Qwen 3.6 Flash	86%	$0.0086	36.1s
GPT-5.4 Mini	85%	$0.013	15.4s
GPT-5.4 Mini (Reasoning)	86%	$0.014	20.5s
Z.AI GLM 5 Turbo	84%	$0.0060	27.7s
GPT-5.4 Nano	84%	$0.0060	23.9s
GPT-5.4 Mini (Reasoning, Low)	86%	$0.012	15.0s
Mistral Small 4 (Reasoning)	82%	$0.0017	23.6s
Mistral Medium 3.1	83%	$0.0030	25.4s
Writer: Palmyra X5	84%	$0.010	20.4s
DeepSeek V4 Flash (Reasoning)	82%	$0.0005	24.5s
Ministral 8B	80%	$0.0002	5.2s
Hermes 3 405B	82%	$0.0020	35.8s
GPT-5.4 Nano (Reasoning, Low)	82%	$0.0060	21.9s
Qwen3 235B A22B Instruct 2507	85%	$0.0008	1.0m
Grok 4.20	86%	$0.0096	54.7s
GPT-5.4	90%	$0.040	1.2m
MiniMax M3	86%	$0.0033	1.8m
Aion 2.0	83%	$0.0050	1.1m
Qwen 3.5 9B	83%	$0.0018	2.7m

Most Stable Models (Top 20)

Model	Score	Consistency	Stability
GPT-5.5 (Reasoning)	91%	98%	89%
GPT-5.5 (Reasoning, Low)	91%	97%	88%
GPT-5.5	90%	97%	87%
GPT-5.4 (Reasoning, Low)	90%	97%	87%
GPT-5.4 (Reasoning)	92%	93%	87%
Gemini 3.1 Pro (Preview)	87%	99%	85%
Qwen 3.5 Plus (2026-04-20)	90%	93%	85%
GPT-5.4	90%	93%	84%
Qwen3.6 Max Preview	89%	95%	84%
GPT-5.4 Mini (Reasoning)	86%	98%	84%
GPT-5.1	88%	95%	84%
Qwen 3.6 Flash	86%	95%	83%
Grok 4.20	86%	96%	83%
GPT-5.4 Mini	85%	96%	83%
Grok 4.20 (Reasoning)	86%	97%	83%
GPT-5.4 Nano	84%	97%	82%
Qwen 3.5 397B A17B	86%	95%	82%
Writer: Palmyra X5	84%	97%	82%
Z.AI GLM 5	85%	96%	82%
Grok 4.3 (Reasoning)	86%	96%	82%

Top Overall Models (Top 20)

Model	Score	Cost	Speed	Stability
GPT-5.4 Mini (Reasoning)	86%	$0.014	20.5s	84%
Qwen 3.6 Flash	86%	$0.0086	36.1s	83%
GPT-5.4 Mini (Reasoning, Low)	86%	$0.012	15.0s	82%
GPT-5.4 Mini	85%	$0.013	15.4s	83%
GPT-5.4 Nano	84%	$0.0060	23.9s	82%
Writer: Palmyra X5	84%	$0.010	20.4s	82%
Grok 4.20	86%	$0.0096	54.7s	83%
GPT-5.4 (Reasoning, Low)	90%	$0.041	1.1m	87%
Z.AI GLM 5	85%	$0.0061	59.0s	82%
Qwen3 235B A22B Instruct 2507	85%	$0.0008	1.0m	80%
GPT-5.4 Nano (Reasoning)	83%	$0.0054	24.4s	80%
Mistral Medium 3.1	83%	$0.0030	25.4s	80%
GPT-5.4	90%	$0.040	1.2m	84%
Mistral Small 4 (Reasoning)	82%	$0.0017	23.6s	79%
Grok 4.3	82%	$0.0055	27.9s	80%
MiniMax M2.7	83%	$0.0040	55.8s	80%
GPT-5.4 Nano (Reasoning, Low)	82%	$0.0060	21.9s	79%
Qwen 3.5 Flash	83%	$0.0021	44.4s	79%
Aion 3.0 Mini	83%	$0.0027	49.9s	79%
Qwen 3.6 35B	83%	$0.0049	41.0s	78%

Rank	Model	Avg. Cost	Avg. Time	Stability	# 1	# 2	# 3	# 4	# 5	Total
42	GPT-5.4 (Reasoning)	$0.063	1.9m	87%	96	93	93	91	85	92%
72	GPT-5.5 (Reasoning)	$0.109	1.4m	89%	93	92	91	91	89	91%
104	GPT-5.5 (Reasoning, Low)	$0.126	1.6m	88%	93	91	91	90	89	91%
8	GPT-5.4 (Reasoning, Low)	$0.041	1.1m	87%	93	91	90	89	89	90%
83	GPT-5.5	$0.109	1.4m	87%	93	91	90	89	89	90%
13	GPT-5.4	$0.040	1.2m	84%	96	91	90	87	86	90%
44	Qwen 3.5 Plus (2026-04-20)	$0.028	2.9m	85%	94	92	91	87	84	90%
79	Qwen3.6 Max Preview	$0.046	3.4m	84%	92	91	89	87	84	89%
50	GPT-5.1	$0.042	2.2m	84%	91	90	88	88	84	88%
92	Gemini 3.1 Pro (Preview)	$0.088	1.6m	85%	88	87	87	87	86	87%
21	MiniMax M3	$0.0033	1.8m	81%	90	89	86	86	80	86%
22	Qwen 3.5 397B A17B	$0.0084	2.0m	82%	90	87	87	85	82	86%
2	Qwen 3.6 Flash	$0.0086	36.1s	83%	89	88	88	84	82	86%
116	GPT-5	$0.060	2.9m	79%	91	90	85	84	81	86%
1	GPT-5.4 Mini (Reasoning)	$0.014	20.5s	84%	88	87	86	85	85	86%
95	Grok 4.3 (Reasoning)	$0.033	3.6m	82%	90	87	86	85	83	86%
3	GPT-5.4 Mini (Reasoning, Low)	$0.012	15.0s	82%	90	86	85	85	84	86%
7	Grok 4.20	$0.0096	54.7s	83%	88	87	87	84	83	86%
41	Grok 4.20 (Reasoning)	$0.020	1.9m	83%	89	86	85	84	84	86%
34	Claude Sonnet 4.5	$0.029	38.6s	79%	90	89	85	82	82	86%
9	Z.AI GLM 5	$0.0061	59.0s	82%	89	85	85	84	84	85%
10	Qwen3 235B A22B Instruct 2507	$0.0008	1.0m	80%	90	87	84	84	82	85%
4	GPT-5.4 Mini	$0.013	15.4s	83%	87	87	86	83	82	85%
57	Qwen 3.6 27B	$0.023	2.4m	82%	88	87	86	83	81	85%
5	GPT-5.4 Nano	$0.0060	23.9s	82%	86	86	85	83	82	84%
46	Z.AI GLM 5.1	$0.010	1.0m	75%	93	88	85	80	77	84%
6	Writer: Palmyra X5	$0.010	20.4s	82%	86	85	85	83	82	84%
76	Claude Opus 4.8 (Reasoning, Low)	$0.058	39.3s	78%	89	87	84	81	79	84%
100	Qwen3.7 Max	$0.055	2.0m	81%	87	86	85	81	80	84%
23	Z.AI GLM 5 Turbo	$0.0060	27.7s	77%	91	87	85	80	77	84%
55	Claude Sonnet 4.6 (Reasoning)	$0.029	40.8s	76%	90	87	82	82	78	84%
129	MoonshotAI: Kimi K2.6	$0.063	5.1m	81%	86	84	83	83	82	84%
45	Qwen 3.5 35B	$0.021	1.2m	80%	87	85	84	81	81	84%
20	Qwen 3.6 35B	$0.0049	41.0s	78%	88	85	82	82	81	83%
108	Claude Opus 4.6	$0.070	1.2m	78%	88	84	83	83	79	83%
25	Z.AI GLM 5.2 (Reasoning, High)	$0.0090	56.9s	80%	86	84	84	83	79	83%
103	Claude Opus 4.5	$0.062	51.5s	77%	89	86	83	80	79	83%
16	MiniMax M2.7	$0.0040	55.8s	80%	86	85	84	83	79	83%
35	Aion 2.0	$0.0050	1.1m	78%	86	85	85	84	76	83%
12	Mistral Medium 3.1	$0.0030	25.4s	80%	85	85	84	84	78	83%
19	Aion 3.0 Mini	$0.0027	49.9s	79%	86	85	82	82	81	83%
73	Qwen 3.5 9B	$0.0018	2.7m	77%	89	86	85	78	77	83%
48	MiniMax M2.5	$0.0030	1.8m	79%	88	85	84	79	79	83%
11	GPT-5.4 Nano (Reasoning)	$0.0054	24.4s	80%	85	84	83	83	80	83%
88	Claude Opus 4.8 (Reasoning)	$0.056	38.6s	78%	88	84	83	81	77	83%
18	Qwen 3.5 Flash	$0.0021	44.4s	79%	86	84	83	81	80	83%
130	Claude Opus 4	$0.131	56.4s	78%	86	86	85	79	76	83%
123	Qwen 3.5 27B	$0.052	4.0m	81%	84	83	83	81	81	82%
29	Hermes 3 405B	$0.0020	35.8s	76%	87	86	83	80	75	82%
54	Claude Sonnet 4.6	$0.023	35.5s	75%	87	86	82	81	75	82%
49	Claude Sonnet 5 (Reasoning, Low)	$0.023	33.2s	78%	87	83	82	80	79	82%
40	Aion 3.0	$0.016	43.1s	79%	85	84	83	81	79	82%
106	Gemini 3.5 Flash (Reasoning)	$0.074	39.7s	78%	86	83	83	81	78	82%
37	DeepSeek V4 Pro	$0.0029	51.1s	77%	87	85	83	81	76	82%
14	Mistral Small 4 (Reasoning)	$0.0017	23.6s	79%	85	85	84	79	77	82%
96	ByteDance Seed 2.0 Lite	$0.013	2.4m	75%	86	85	80	80	78	82%
33	DeepSeek V4 Flash (Reasoning)	$0.0005	24.5s	75%	90	84	83	77	76	82%
65	Claude Sonnet 5	$0.022	31.0s	74%	90	82	82	81	75	82%
17	GPT-5.4 Nano (Reasoning, Low)	$0.0060	21.9s	79%	85	84	83	80	78	82%
56	Gemini 2.5 Pro	$0.031	32.1s	78%	86	83	83	78	78	82%
47	o4 Mini High	$0.013	23.0s	75%	88	83	81	81	76	82%
122	Claude Opus 4.6 (Reasoning)	$0.076	1.3m	76%	87	85	82	79	77	82%
31	Mistral Large 3	$0.0023	25.5s	76%	87	83	80	80	80	82%
15	Grok 4.3	$0.0055	27.9s	80%	84	83	82	80	80	82%
24	Qwen 3 32B	$0.0012	30.5s	78%	84	84	82	79	79	82%
30	Xiaomi MIMO v2.5 Pro	$0.0073	49.6s	80%	83	82	81	81	80	81%
87	WizardLM 2 8x22b	$0.0018	2.7m	75%	88	84	84	77	74	81%
36	ByteDance Seed 1.6 Flash	$0.0009	21.4s	75%	87	81	80	79	79	81%
43	Cohere Command R+ (Aug. 2024)	$0.014	50.8s	80%	82	81	81	80	80	81%
28	Claude Haiku 4.5	$0.0093	19.7s	78%	84	84	84	78	77	81%
27	Gemma 3 27B	$0.0004	36.1s	77%	84	82	80	80	79	81%
39	DeepSeek V3 (2025-03-24)	$0.0009	46.1s	77%	84	83	83	80	75	81%
127	Qwen 3.5 122B	$0.067	2.6m	75%	86	83	82	79	73	81%
58	Z.AI GLM 4.6	$0.0064	1.2m	76%	84	82	80	79	77	80%
26	Ministral 8B	$0.0002	5.2s	76%	84	83	82	79	74	80%
38	Mistral Small 4	$0.0011	15.2s	76%	83	83	79	79	78	80%
107	MoonshotAI: Kimi K2.5	$0.017	2.9m	77%	83	81	80	79	79	80%
32	DeepSeek V4 Flash	$0.0005	25.0s	77%	82	82	80	79	77	80%
99	Claude Opus 4.7 (Reasoning)	$0.059	27.9s	78%	82	82	82	78	77	80%
98	GPT-5.2	$0.042	1.2m	78%	83	82	81	78	78	80%
124	ByteDance Seed 2.0 Mini	$0.0045	5.2m	76%	84	81	81	79	76	80%
105	Claude Sonnet 5 (Reasoning)	$0.022	32.6s	68%	87	86	77	75	72	80%
77	DeepSeek V4 Pro (Reasoning)	$0.0047	1.4m	73%	84	82	79	79	73	79%
71	Gemma 4 31B (Reasoning)	$0.0013	1.4m	75%	84	80	80	77	76	79%
119	Claude Opus 4.7	$0.057	28.2s	72%	87	80	79	77	72	79%
62	GPT-4o, Aug. 6th (temp=1)	$0.016	21.1s	75%	83	80	80	76	76	79%
89	Claude Sonnet 4	$0.024	37.5s	73%	83	81	77	77	75	79%
59	Xiaomi MIMO v2.5	$0.0042	25.4s	73%	84	79	78	78	75	79%
120	Hermes 3 70B	$0.0007	2.7m	69%	89	85	85	71	65	79%
82	GPT-4.1	$0.016	47.2s	73%	83	82	79	76	73	79%
51	Gemini 2.5 Flash Lite	$0.0007	7.9s	72%	85	80	78	76	75	79%
85	DeepSeek V3 (2024-12-26)	$0.0015	49.9s	69%	88	79	79	77	70	78%
86	Qwen 3.5 Plus (2026-02-15)	$0.0053	29.2s	69%	85	83	76	75	74	78%
68	Ministral 3 14B	$0.0004	8.7s	69%	87	81	79	77	67	78%
90	DeepSeek V3.2	$0.0010	1.8m	74%	83	79	78	77	75	78%
52	Ministral 3 8B	$0.0003	5.2s	72%	83	82	80	77	70	78%
69	o4 Mini	$0.012	22.0s	74%	83	79	78	76	74	78%
63	Mistral Large 2	$0.0074	20.2s	74%	82	81	80	74	73	78%
109	DeepSeek V3.1	$0.0017	1.4m	67%	86	84	77	76	68	78%
78	Gemini 3 Flash (Preview, Reasoning)	$0.013	34.1s	73%	81	80	76	76	76	78%
60	Gemini 2.5 Flash (Reasoning)	$0.0093	18.2s	75%	81	80	80	75	73	78%
67	Z.AI GLM 4.5	$0.0047	41.0s	74%	81	78	78	76	75	78%
70	Gemini 3.5 Flash (Reasoning, Minimal)	$0.016	10.9s	74%	81	79	79	74	74	77%
64	Gemma 3 12B	$0.0002	34.5s	74%	80	78	76	76	76	77%
74	Z.AI GLM 4.7 Flash	$0.0015	1.1m	75%	79	78	78	76	74	77%
91	Gemini 2.5 Flash	$0.0038	8.4s	67%	84	82	74	73	72	77%
66	Gemini 3.1 Flash Lite	$0.0027	8.7s	72%	82	79	77	76	72	77%
61	Gemini 3.1 Flash Lite (Reasoning)	$0.0027	17.2s	74%	80	77	77	76	74	77%
75	GPT-4.1 Mini	$0.0023	17.7s	70%	81	80	75	75	74	77%
97	Z.AI GLM 4.7	$0.0097	42.3s	71%	83	76	76	76	72	77%
53	Ministral 3B	$0.0001	2.3s	73%	79	79	76	75	74	77%
93	GPT-5 Mini	$0.0089	54.8s	73%	80	79	77	74	72	76%
102	Gemma 4 31B	$0.0008	1.8m	74%	79	77	77	75	73	76%
113	Arcee AI: Trinity Mini	$0.0003	8.6s	61%	86	84	71	71	69	76%
81	Gemini 3 Flash (Preview)	$0.0077	19.5s	72%	80	76	75	75	73	76%
128	Gemma 4 26B (Reasoning)	$0.0011	3.9m	69%	79	79	76	74	66	75%
112	Gemma 4 26B	$0.0007	1.0m	68%	82	75	73	73	72	75%
94	GPT-4o Mini (temp=1)	$0.0010	33.8s	71%	79	75	74	74	72	75%
117	Gemini 3.1 Flash Lite (Preview)	$0.0026	8.1s	62%	84	82	71	71	67	75%
80	GPT-4o Mini (temp=0)	$0.0010	36.1s	73%	77	75	75	73	73	75%
101	GPT-4.1 Nano	$0.0006	13.0s	68%	82	74	74	74	67	74%
110	Cydonia 24B V4.1	$0.0010	34.7s	67%	79	77	72	71	70	74%
125	Z.AI GLM 4.5 Air	$0.0023	2.3m	66%	80	75	71	71	70	73%
84	Mistral NeMO	$0.0003	7.1s	71%	75	74	74	73	71	73%
118	DeepSeek-V2 Chat	$0.0016	55.5s	67%	80	77	76	69	64	73%
111	Ministral 3 3B	$0.0002	3.1s	66%	77	77	73	68	65	72%
114	Qwen 2.5 72B	$0.0007	42.1s	70%	73	73	73	72	68	72%
115	Gemma 3 4B	$0.0002	19.4s	67%	76	75	73	71	64	72%
132	ByteDance Seed 1.6	$0.016	3.0m	68%	74	73	73	69	65	71%
121	Llama 3.1 70B	$0.0007	19.1s	66%	76	71	71	67	65	70%
138	Mistral Small 3.2 24B	$0.0084	6.6m	57%	84	72	67	65	61	70%
126	Gemini 2.5 Flash Lite (Reasoning)	$0.0024	33.9s	62%	76	74	70	62	61	69%
136	Nemotron 3 Super	$0.0000	4.0m	61%	75	69	67	67	61	68%
133	GPT-4o, Aug. 6th (temp=0)	$0.016	23.0s	56%	78	72	65	63	60	68%
131	Inception Mercury 2	$0.0022	5.7s	59%	66	66	62	60	60	63%
135	GPT-OSS 120B	$0.0011	2.2m	58%	66	65	62	61	58	63%
134	GPT-5 Nano	$0.0037	1.3m	60%	64	62	62	61	60	62%
137	Nemotron 3 Nano	$0.0012	2.0m	54%	69	69	63	55	53	62%
80.11%
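
The Total column appears consistent with the mean of the five score columns (# 1 to # 5), up to rounding of the displayed values. This is an inference from the numbers in the table, not something the page states. A quick check, with three rows copied from the table and a hypothetical helper `mean_score`:

```python
from statistics import mean

# Rows copied from the table above: model -> (five listed scores, listed Total in %).
rows = {
    "GPT-5.4 (Reasoning)": ([96, 93, 93, 91, 85], 92),
    "Gemini 3.1 Pro (Preview)": ([88, 87, 87, 87, 86], 87),
    "Hermes 3 70B": ([89, 85, 85, 71, 65], 79),
}

def mean_score(scores):
    """Arithmetic mean of the listed scores (assumed aggregation, not confirmed)."""
    return mean(scores)

for model, (scores, total) in rows.items():
    print(f"{model}: mean={mean_score(scores):.1f}, listed={total}")
# GPT-5.4 (Reasoning): mean=91.6, listed=92
# Gemini 3.1 Pro (Preview): mean=87.0, listed=87
# Hermes 3 70B: mean=79.0, listed=79
```

Because the five columns are themselves rounded to whole numbers, a recomputed mean can differ from the listed Total by up to about one point.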

Median	Evaluator	Top 3	Flop 3
84.2%	"Not X but Y" pattern overuse	100 Gemini 3.1 Flash Lite; 100 Z.AI GLM 5; 100 Qwen 2.5 72B	0 GPT-5 Nano; 0 Gemini 2.5 Flash Lite (Reasoning); 20 Nemotron 3 Nano
80.0%	Adverb-first sentence starts	100 Cohere Command R+ (Aug. 2024); 100 GPT-5.4 (Reasoning, Low); 100 Claude Sonnet 4.5	0 ByteDance Seed 1.6; 0 GPT-OSS 120B; 17 Qwen 3.5 27B
100.0%	Adverbs in dialogue tags	100 Grok 4.3 (Reasoning); 100 Grok 4.20 (Reasoning); 100 Qwen 3.5 Plus (2026-04-20)	20 GPT-5 Nano; 40 Mistral Large 3; 51 Claude Opus 4.7 (Reasoning)
91.4%	AI-ism adverb frequency	100 MoonshotAI: Kimi K2.6; 99 ByteDance Seed 2.0 Lite; 99 GPT-5.5 (Reasoning)	74 Claude Sonnet 4.6; 75 Cohere Command R+ (Aug. 2024); 75 Claude Sonnet 4.6 (Reasoning)
100.0%	AI-ism character names	100 GPT-OSS 120B; 100 Qwen 3.6 35B; 100 ByteDance Seed 1.6 Flash	96 ByteDance Seed 2.0 Lite
100.0%	AI-ism location names	100 Qwen 3.5 Flash; 100 Aion 3.0; 100 MiniMax M2.7	—
25.1%	AI-ism word frequency	70 GPT-5.5 (Reasoning, Low); 70 ByteDance Seed 2.0 Lite; 70 GPT-5.5	0 Nemotron 3 Nano; 0 Inception Mercury 2; 0 GPT-4o Mini (temp=1)
100.0%	Cliché density	100 Cohere Command R+ (Aug. 2024); 100 Cydonia 24B V4.1; 100 Qwen 3.6 Flash	33 Inception Mercury 2; 53 GPT-4o, Aug. 6th (temp=0); 67 Mistral Small 3.2 24B
58.3%	Dialogue tag variety (said vs. fancy)	100 MiniMax M2.5; 100 Qwen 3.5 Plus (2026-04-20); 100 Qwen3.6 Max Preview	0 GPT-4o Mini (temp=1); 0 GPT-5.2; 0 Hermes 3 70B
33.8%	Em-dash & semicolon overuse	100 Qwen 3.6 Flash; 100 Gemini 2.5 Flash (Reasoning); 100 Grok 4.20 (Reasoning)	0 Claude Opus 4.8 (Reasoning); 0 Claude Opus 4.6 (Reasoning); 0 Mistral Medium 3.1
100.0%	Emotion telling (show vs. tell)	100 DeepSeek V4 Flash; 100 Qwen 3.5 Plus (2026-02-15); 100 Z.AI GLM 4.7 Flash	92 Hermes 3 70B; 96 Mistral NeMO; 97 Qwen 2.5 72B
98.6%	Filter word density	100 Grok 4.3; 100 Qwen 3.5 397B A17B; 100 Claude Opus 4.6	0 Nemotron 3 Nano; 0 Llama 3.1 70B; 2 Inception Mercury 2
100.0%	Gibberish response detection	100 Claude Sonnet 4.5; 100 Qwen3.6 Max Preview; 100 ByteDance Seed 2.0 Lite	80 Qwen 3.5 Plus (2026-04-20); 80 Qwen 3.5 9B; 97 Hermes 3 70B
100.0%	Markdown formatting overuse	100 GPT-4o Mini (temp=0); 100 Claude Opus 4.6 (Reasoning); 100 GPT-5.4 Nano (Reasoning)	63 Ministral 3 3B; 69 ByteDance Seed 1.6 Flash; 72 Mistral Large 2
100.0%	Missing dialogue indicators (quotation marks)	100 Z.AI GLM 5 Turbo; 100 Z.AI GLM 4.5; 100 DeepSeek V4 Pro	60 Qwen 3.5 Flash; 80 Qwen 3.6 35B; 80 Qwen 3.5 35B
96.0%	Name drop frequency	100 Gemma 4 26B (Reasoning); 100 DeepSeek V4 Pro (Reasoning); 100 Gemini 3.1 Flash Lite (Reasoning)	47 Mistral Small 4; 53 GPT-5.5; 56 GPT-5.5 (Reasoning, Low)
71.5%	Narrator intent-glossing	100 GPT-5.4 (Reasoning); 100 GPT-5.5; 100 Gemini 3.1 Pro (Preview)	0 GPT-OSS 120B; 0 Nemotron 3 Nano; 0 GPT-5 Nano
100.0%	Overuse of "that" (subordinate clause padding)	100 Aion 3.0; 100 GPT-5.5; 100 Gemini 2.5 Flash Lite (Reasoning)	75 Claude Sonnet 5 (Reasoning, Low); 78 GPT-5 Nano; 79 Nemotron 3 Nano
100.0%	Paragraph length variance	100 MiniMax M2.5; 100 GPT-5.4 Mini (Reasoning); 100 Z.AI GLM 5 Turbo	17 Mistral Small 3.2 24B; 20 Grok 4.3; 35 Gemini 2.5 Flash
98.7%	Passive voice overuse	100 Qwen 3.5 9B; 100 o4 Mini High; 100 Claude Opus 4.8 (Reasoning, Low)	86 Llama 3.1 70B; 89 DeepSeek V3.1; 90 Mistral Small 3.2 24B
96.5%	Past progressive (was/were + -ing) overuse	100 Grok 4.20 (Reasoning); 100 Grok 4.3; 100 Gemini 3.1 Pro (Preview)	26 Claude Opus 4.7 (Reasoning); 32 Z.AI GLM 4.6; 33 Z.AI GLM 4.7
87.2%	Pronoun-first sentence starts	100 GPT-5.4 (Reasoning); 100 Claude Opus 4; 100 DeepSeek V4 Pro	1 ByteDance Seed 1.6; 17 Mistral Small 3.2 24B; 17 ByteDance Seed 2.0 Mini
96.4%	Purple prose (modifier overload)	100 Qwen 3.5 35B; 100 Z.AI GLM 5; 100 GPT-4o Mini (temp=0)	80 ByteDance Seed 2.0 Mini; 84 Claude Sonnet 5 (Reasoning, Low); 85 Gemma 3 27B
100.0%	Repeated phrase echo	100 MoonshotAI: Kimi K2.5; 100 GPT-4.1; 100 WizardLM 2 8x22b	—
100.0%	Sentence length variance	100 MoonshotAI: Kimi K2.6; 100 MiniMax M2.5; 100 Cydonia 24B V4.1	66 GPT-4o, Aug. 6th (temp=0); 77 Qwen 3.5 9B; 81 Nemotron 3 Nano
41.6%	Sentence opener variety	87 GPT-4o, Aug. 6th (temp=1); 75 Hermes 3 70B; 73 Claude Sonnet 5 (Reasoning)	27 GPT-5 Nano; 28 Gemma 4 26B; 29 Mistral Small 3.2 24B
43.5%	Subject-first sentence starts	100 Qwen3 235B A22B Instruct 2507; 98 Writer: Palmyra X5; 97 Claude Sonnet 4.5	0 Qwen 3.5 27B; 0 Inception Mercury 2; 0 GPT-OSS 120B
31.5%	Subordinate conjunction sentence starts	84 GPT-4o, Aug. 6th (temp=1); 82 Mistral NeMO; 76 Hermes 3 70B	0 Grok 4.20 (Reasoning); 0 DeepSeek V3 (2025-03-24); 0 Ministral 3B
65.1%	Technical jargon density	100 Qwen 3.5 397B A17B; 100 Qwen 3.5 35B; 100 GPT-5.5 (Reasoning, Low)	0 GPT-5 Nano; 0 Nemotron 3 Nano; 9 MoonshotAI: Kimi K2.6
80.0%	Useless dialogue additions	100 Aion 3.0; 100 Qwen 3.6 Flash; 100 GPT-5.5 (Reasoning)	0 GPT-4o Mini (temp=0); 0 Gemma 3 12B; 0 Gemma 3 4B
