Romance: separated couple reunites

Bad Writing Habits

Detects common prose quality anti-patterns in AI-generated creative writing, including passive voice, past progressive overuse, weak dialogue tags, filter words, purple prose, cliches, AI-ism words/adverbs/names, and more.
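
As a rough illustration of what two of the named checks might look like, here is a minimal Python sketch of a past progressive counter and a filter word density measure. The benchmark's actual evaluators are not described on this page, so the regular expression, the `FILTER_WORDS` list, and both function names are assumptions made for this example only.

```python
import re

# Hypothetical filter-word list; the benchmark's own list is not given here.
FILTER_WORDS = {"saw", "heard", "felt", "noticed", "realized",
                "seemed", "watched", "wondered"}

# "was"/"were" followed by a word ending in -ing (a crude approximation).
PAST_PROGRESSIVE = re.compile(r"\b(?:was|were)\s+\w+ing\b", re.IGNORECASE)
WORD = re.compile(r"[A-Za-z']+")


def past_progressive_count(text: str) -> int:
    """Count was/were + -ing constructions in the text."""
    return len(PAST_PROGRESSIVE.findall(text))


def filter_word_density(text: str) -> float:
    """Return the share of words that appear in FILTER_WORDS (0.0 to 1.0)."""
    words = [w.lower() for w in WORD.findall(text)]
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in FILTER_WORDS)
    return hits / len(words)


sample = "She was walking home when she noticed the rain. They were laughing."
print(past_progressive_count(sample))
print(filter_word_density(sample))
```

A pattern this simple will also flag non-verbs such as "was nothing", so a real evaluator would likely need part-of-speech information; how raw counts map onto the 0 to 100 scores shown on this page is not stated in the source.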

Creative Writing Hallucination

Performance Score Distribution (Top 20)

Model	Score
GPT-5.4 (Reasoning, Low)	92%
GPT-5.4	92%
Writer: Palmyra X5	90%
Grok 4.5 (Reasoning, Low)	90%
Z.AI GLM 5.2 (Reasoning, High)	90%
GPT-5.5	89%
Claude Opus 4.6 (Reasoning)	89%
GPT-5.4 Mini	89%
GPT-5.5 (Reasoning)	89%
Claude Sonnet 4.5	88%
DeepSeek V4 Flash (Reasoning)	88%
GPT-5.4 Mini (Reasoning, Low)	88%
Claude Sonnet 4	88%
GPT-5.5 (Reasoning, Low)	88%
Z.AI GLM 5	88%
Claude Opus 4.7	88%
Claude Opus 4	88%
GPT-5.4 (Reasoning)	88%
Aion 3.0	88%
DeepSeek V4 Pro	88%

Price-Performance Score Distribution (Top 20)

Model	Score	Cost	Time
Writer: Palmyra X5	90%	$0.013	22.8s
Z.AI GLM 5.2 (Reasoning, High)	90%	$0.012	57.5s
Grok 4.5 (Reasoning, Low)	90%	$0.019	50.7s
GPT-5.4 (Reasoning, Low)	92%	$0.056	1.3m
GPT-5.4 Mini	89%	$0.014	15.7s
GPT-5.4 Mini (Reasoning, Low)	88%	$0.014	16.1s
GPT-5.4	92%	$0.051	1.3m
DeepSeek V4 Flash (Reasoning)	88%	$0.0007	25.0s
Grok 4.20	87%	$0.011	46.5s
Grok 4.20 (Reasoning)	87%	$0.014	1.0m
Claude Sonnet 4	88%	$0.045	54.2s
DeepSeek V4 Flash	86%	$0.0005	17.3s
Hermes 3 405B	84%	$0.0054	49.2s
DeepSeek V4 Pro	88%	$0.0027	1.1m
Qwen 3.5 Flash	85%	$0.0024	35.9s
Qwen3 235B A22B Instruct 2507	87%	$0.0011	49.4s
Z.AI GLM 5.1	87%	$0.014	1.2m
MiniMax M2.5	85%	$0.0043	1.8m
Z.AI GLM 5	88%	$0.012	1.8m
Claude Sonnet 4.5	88%	$0.045	41.1s

Most Stable Models (Top 20)

Model	Score	Consistency	Stability
GPT-5.4 (Reasoning, Low)	92%	97%	89%
GPT-5.4	92%	95%	88%
GPT-5.5 (Reasoning)	89%	98%	87%
DeepSeek V4 Pro	88%	99%	87%
GPT-5.5 (Reasoning, Low)	88%	97%	87%
GPT-5.5	89%	97%	86%
GPT-5.4 Mini	89%	97%	86%
Writer: Palmyra X5	90%	95%	85%
Qwen 3.5 397B A17B	87%	98%	85%
Claude Opus 4.6 (Reasoning)	89%	95%	85%
GPT-5.4 Mini (Reasoning, Low)	88%	96%	85%
GPT-5.4 Mini (Reasoning)	87%	98%	85%
Z.AI GLM 5.2 (Reasoning, High)	90%	93%	84%
Xiaomi MIMO v2.5 Pro	87%	97%	84%
Qwen3.6 Max Preview	87%	97%	84%
Grok 4.20 (Reasoning)	87%	94%	83%
GPT-5.4 (Reasoning)	88%	95%	83%
Claude Opus 4.8 (Reasoning)	87%	94%	83%
Claude Opus 4.6	87%	96%	83%
Qwen 3.6 Flash	87%	96%	83%

Top Overall Models (Top 20)

Model	Score	Cost	Speed	Stability
GPT-5.4 (Reasoning, Low)	92%	$0.056	1.3m	89%
Writer: Palmyra X5	90%	$0.013	22.8s	85%
GPT-5.4 Mini	89%	$0.014	15.7s	86%
GPT-5.4	92%	$0.051	1.3m	88%
GPT-5.4 Mini (Reasoning, Low)	88%	$0.014	16.1s	85%
Z.AI GLM 5.2 (Reasoning, High)	90%	$0.012	57.5s	84%
DeepSeek V4 Pro	88%	$0.0027	1.1m	87%
DeepSeek V4 Flash (Reasoning)	88%	$0.0007	25.0s	83%
GPT-5.4 Mini (Reasoning)	87%	$0.022	26.8s	85%
Grok 4.5 (Reasoning, Low)	90%	$0.019	50.7s	82%
Qwen3 235B A22B Instruct 2507	87%	$0.0011	49.4s	83%
Grok 4.20	87%	$0.011	46.5s	83%
Xiaomi MIMO v2.5 Pro	87%	$0.013	1.2m	84%
Grok 4.20 (Reasoning)	87%	$0.014	1.0m	83%
Qwen 3.6 Flash	87%	$0.013	52.6s	83%
Claude Sonnet 4.5	88%	$0.045	41.1s	82%
Z.AI GLM 5	88%	$0.012	1.8m	83%
DeepSeek V4 Flash	86%	$0.0005	17.3s	79%
Claude Opus 4.6 (Reasoning)	89%	$0.098	1.3m	85%
GPT-5.5	89%	$0.118	1.5m	86%

Rank	Model	Avg. Cost	Avg. Time	Stability	# 1	# 2	# 3	# 4	# 5	Total
1	GPT-5.4 (Reasoning, Low)	$0.056	1.3m	89%	95	94	93	91	90	92%
4	GPT-5.4	$0.051	1.3m	88%	95	93	92	90	88	92%
2	Writer: Palmyra X5	$0.013	22.8s	85%	94	91	90	89	86	90%
10	Grok 4.5 (Reasoning, Low)	$0.019	50.7s	82%	96	93	90	87	82	90%
6	Z.AI GLM 5.2 (Reasoning, High)	$0.012	57.5s	84%	93	93	90	87	85	90%
20	GPT-5.5	$0.118	1.5m	86%	92	90	89	88	88	89%
19	Claude Opus 4.6 (Reasoning)	$0.098	1.3m	85%	91	91	89	89	85	89%
3	GPT-5.4 Mini	$0.014	15.7s	86%	91	89	88	88	87	89%
32	GPT-5.5 (Reasoning)	$0.140	1.7m	87%	90	89	89	88	87	89%
16	Claude Sonnet 4.5	$0.045	41.1s	82%	94	90	88	87	83	88%
8	DeepSeek V4 Flash (Reasoning)	$0.0007	25.0s	83%	93	90	88	86	86	88%
5	GPT-5.4 Mini (Reasoning, Low)	$0.014	16.1s	85%	91	90	88	88	85	88%
28	Claude Sonnet 4	$0.045	54.2s	79%	96	91	90	86	78	88%
29	GPT-5.5 (Reasoning, Low)	$0.122	1.5m	87%	89	89	89	89	86	88%
17	Z.AI GLM 5	$0.012	1.8m	83%	92	90	88	86	83	88%
21	Claude Opus 4.7	$0.085	30.7s	82%	93	90	89	85	82	88%
135	Claude Opus 4	$0.434	2.8m	80%	94	89	86	85	85	88%
60	GPT-5.4 (Reasoning)	$0.092	2.6m	83%	91	89	88	87	84	88%
33	Aion 3.0	$0.031	1.1m	78%	95	90	87	84	82	88%
7	DeepSeek V4 Pro	$0.0027	1.1m	87%	88	88	88	87	87	88%
25	Z.AI GLM 5.1	$0.014	1.2m	80%	93	91	87	85	81	87%
13	Xiaomi MIMO v2.5 Pro	$0.013	1.2m	84%	90	88	87	86	86	87%
14	Grok 4.20 (Reasoning)	$0.014	1.0m	83%	91	89	89	85	82	87%
11	Qwen3 235B A22B Instruct 2507	$0.0011	49.4s	83%	90	89	86	85	85	87%
12	Grok 4.20	$0.011	46.5s	83%	90	89	89	87	81	87%
78	Qwen 3.5 397B A17B	$0.0045	6.2m	85%	88	88	87	87	86	87%
9	GPT-5.4 Mini (Reasoning)	$0.022	26.8s	85%	88	88	86	86	86	87%
15	Qwen 3.6 Flash	$0.013	52.6s	83%	90	87	87	86	84	87%
52	Claude Sonnet 4.6 (Reasoning)	$0.085	1.3m	82%	89	89	88	88	80	87%
26	Claude Opus 4.8 (Reasoning)	$0.083	39.5s	83%	89	89	88	84	82	87%
51	Claude Opus 4.5	$0.086	54.6s	80%	93	86	86	85	84	87%
66	Qwen3.6 Max Preview	$0.057	3.7m	84%	89	88	87	86	84	87%
39	Claude Opus 4.6	$0.088	1.2m	83%	89	88	87	85	84	87%
36	Grok 4.5 (Reasoning, High)	$0.035	1.9m	82%	89	89	86	86	83	86%
18	DeepSeek V4 Flash	$0.0005	17.3s	79%	93	89	86	82	82	86%
85	MoonshotAI: Kimi K2.5	$0.019	5.2m	82%	90	89	88	83	81	86%
38	Qwen 3.5 Plus (2026-04-20)	$0.018	1.7m	81%	91	86	85	85	84	86%
46	Claude Opus 4.8 (Reasoning, Low)	$0.084	40.7s	81%	90	87	86	85	82	86%
37	Aion 3.0 Mini	$0.0071	1.3m	79%	91	89	85	84	82	86%
53	GPT-5.1	$0.054	1.7m	82%	90	87	86	84	82	86%
57	Claude Sonnet 5 (Reasoning)	$0.038	42.7s	76%	94	86	84	83	82	86%
31	Claude Opus 4.7 (Reasoning)	$0.086	28.8s	83%	88	86	85	85	84	86%
43	DeepSeek V4 Pro (Reasoning)	$0.016	2.1m	82%	89	87	86	85	82	86%
35	Claude Sonnet 5 (Reasoning, Low)	$0.040	43.3s	80%	89	87	84	84	83	86%
41	MiniMax M2.5	$0.0043	1.8m	81%	89	87	86	85	80	85%
83	MiniMax M3	$0.0089	3.8m	79%	89	88	84	82	82	85%
48	Z.AI GLM 5 Turbo	$0.013	40.7s	77%	92	88	85	81	79	85%
22	Grok 4.3	$0.0088	25.8s	80%	88	87	84	83	82	85%
23	Qwen 3.5 Flash	$0.0024	35.9s	80%	89	86	86	84	79	85%
45	Grok 4.3 (Reasoning)	$0.016	1.7m	81%	88	85	85	84	82	85%
42	MiniMax M2.7	$0.0053	1.3m	80%	88	87	84	82	81	84%
47	Qwen 3.6 35B	$0.0075	1.0m	79%	89	86	84	83	80	84%
40	Claude Sonnet 4.6	$0.038	37.6s	80%	86	86	84	84	80	84%
137	MoonshotAI: Kimi K2.6	$0.085	9.1m	79%	87	86	83	82	81	84%
44	Hermes 3 405B	$0.0054	49.2s	79%	88	87	87	81	76	84%
82	GPT-5	$0.059	2.8m	80%	87	84	83	83	82	84%
93	Gemini 3.1 Pro (Preview)	$0.134	2.1m	81%	87	84	84	83	81	84%
30	Mistral Medium 3.1	$0.0058	42.8s	81%	86	83	83	83	83	84%
49	Qwen 3.5 27B	$0.012	58.0s	79%	86	85	83	82	81	83%
24	GPT-5.4 Nano (Reasoning)	$0.0055	23.1s	81%	86	85	85	81	80	83%
58	o4 Mini	$0.015	26.6s	77%	90	83	82	81	81	83%
27	GPT-5.4 Nano	$0.0050	19.6s	81%	85	84	83	82	82	83%
34	Mistral Small 4 (Reasoning)	$0.0027	32.6s	81%	85	84	83	83	81	83%
74	WizardLM 2 8x22b	$0.0042	3.0m	81%	85	84	84	81	81	83%
87	Qwen3.7 Max	$0.078	2.3m	80%	85	83	82	82	82	83%
55	Hermes 3 70B	$0.0015	38.9s	78%	87	85	84	83	76	83%
56	GPT-5.4 Nano (Reasoning, Low)	$0.0045	16.6s	77%	87	86	82	81	78	83%
71	DeepSeek V3.2	$0.0019	1.3m	76%	88	85	82	80	79	83%
72	Qwen 3.5 Plus (2026-02-15)	$0.0071	31.1s	74%	91	86	83	77	76	83%
91	ByteDance Seed 1.6	$0.010	1.8m	73%	89	88	81	79	76	83%
54	Qwen 3.5 122B	$0.020	45.8s	80%	84	83	82	82	81	82%
63	Gemma 4 31B (Reasoning)	$0.0016	1.2m	78%	86	82	81	81	80	82%
59	DeepSeek V3 (2025-03-24)	$0.0017	44.1s	79%	85	83	81	81	80	82%
96	Cohere Command R+ (Aug. 2024)	$0.028	52.8s	71%	89	85	80	80	72	82%
64	Z.AI GLM 4.5 Air	$0.0024	36.5s	76%	87	82	81	79	79	82%
65	Mistral Large 2	$0.017	30.9s	77%	86	83	82	79	77	81%
69	Xiaomi MIMO v2.5	$0.0055	29.8s	76%	85	84	81	81	76	81%
106	Qwen 3.6 27B	$0.036	3.0m	78%	85	82	82	79	78	81%
79	o4 Mini High	$0.026	48.1s	75%	86	81	80	79	79	81%
62	Mistral Large 3	$0.0048	36.1s	78%	84	83	81	80	78	81%
50	Ministral 3 14B	$0.0012	11.9s	79%	83	82	82	80	79	81%
68	Qwen 3 32B	$0.0013	32.4s	76%	85	83	82	80	74	81%
67	ByteDance Seed 1.6 Flash	$0.0015	30.8s	77%	85	82	82	79	76	81%
61	Mistral Small 4	$0.0013	11.7s	77%	84	81	81	81	76	81%
75	Gemini 2.5 Flash	$0.0065	12.1s	74%	85	85	81	77	75	81%
80	Aion 2.0	$0.0079	1.2m	76%	85	83	82	78	76	81%
70	Qwen 3.5 9B	$0.0011	1.1m	78%	83	82	82	79	77	81%
77	GPT-5 Mini	$0.0098	1.1m	77%	84	81	80	79	78	81%
73	Qwen 3.5 35B	$0.018	1.2m	79%	82	82	81	79	79	81%
76	Z.AI GLM 4.7 Flash	$0.0018	1.1m	77%	83	83	83	79	75	80%
112	ByteDance Seed 2.0 Mini	$0.0044	4.4m	80%	81	81	80	80	80	80%
86	Z.AI GLM 4.7	$0.012	1.7m	77%	83	81	80	79	78	80%
81	Gemini 3 Flash (Preview, Reasoning)	$0.012	35.6s	74%	85	82	79	79	76	80%
95	Gemini 3.5 Flash (Reasoning, Minimal)	$0.019	12.1s	71%	87	82	77	77	76	80%
94	DeepSeek V3 (2024-12-26)	$0.0027	1.4m	74%	85	80	79	79	75	80%
105	Claude Haiku 4.5	$0.015	24.9s	70%	89	78	78	77	76	79%
90	Claude Sonnet 5	$0.034	35.0s	75%	85	81	80	77	75	79%
110	GPT-5.2	$0.056	1.5m	75%	83	80	78	78	77	79%
88	GPT-4.1	$0.020	38.4s	75%	81	80	79	79	74	79%
84	Gemini 2.5 Pro	$0.037	35.2s	77%	80	80	79	78	77	79%
103	Gemma 4 31B	$0.0014	1.3m	74%	84	79	79	77	74	79%
121	DeepSeek V3.1	$0.0024	2.3m	73%	83	80	79	78	73	79%
99	Z.AI GLM 4.5	$0.0061	35.2s	72%	83	81	78	78	72	78%
89	Gemini 2.5 Flash (Reasoning)	$0.011	20.8s	74%	82	80	78	77	75	78%
98	Z.AI GLM 4.6	$0.0097	46.5s	74%	82	80	78	76	75	78%
120	Cydonia 24B V4.1	$0.0023	1.1m	70%	85	81	79	77	68	78%
122	ByteDance Seed 2.0 Lite	$0.011	1.7m	72%	84	80	77	75	74	78%
100	GPT-4o, Aug. 6th (temp=1)	$0.020	16.0s	73%	83	80	78	76	73	78%
118	Gemini 3.5 Flash (Reasoning)	$0.086	44.8s	75%	80	80	79	76	75	78%
107	Ministral 8B	$0.0007	20.3s	71%	86	79	78	76	71	78%
119	Gemma 4 26B	$0.0012	43.6s	69%	84	82	81	78	64	78%
97	Mistral NeMO	$0.0008	7.5s	72%	82	82	80	73	70	77%
111	Gemma 3 27B	$0.0010	57.7s	72%	82	81	78	73	73	77%
108	Gemma 3 12B	$0.0004	45.9s	73%	81	79	77	75	74	77%
102	Gemma 4 26B (Reasoning)	$0.0019	49.8s	74%	80	79	78	75	74	77%
115	Gemini 3.1 Flash Lite (Preview)	$0.0037	8.7s	69%	85	79	78	77	67	77%
104	Ministral 3 3B	$0.0006	6.6s	72%	81	79	78	76	70	77%
92	Gemini 3.1 Flash Lite	$0.0036	9.3s	74%	79	77	76	76	75	76%
101	Qwen 2.5 72B	$0.0012	30.2s	74%	78	77	76	75	74	76%
109	GPT-4.1 Mini	$0.0034	20.9s	72%	80	79	76	74	73	76%
125	DeepSeek-V2 Chat	$0.0026	52.7s	68%	85	76	74	73	73	76%
113	Arcee AI: Trinity Mini	$0.0004	9.1s	71%	81	76	75	75	72	76%
116	Gemini 3 Flash (Preview)	$0.0089	20.4s	72%	79	78	76	73	72	76%
114	GPT-4o, Aug. 6th (temp=0)	$0.021	12.5s	73%	78	76	75	74	74	75%
117	Ministral 3B	$0.0002	7.0s	72%	77	76	75	74	71	75%
130	Ministral 3 8B	$0.0012	22.3s	66%	80	78	72	71	69	74%
123	Gemini 2.5 Flash Lite	$0.0012	10.8s	70%	78	76	74	73	70	74%
124	Inception Mercury 2	$0.0029	6.1s	70%	76	76	73	72	72	74%
126	GPT-4o Mini (temp=0)	$0.0016	32.9s	69%	77	73	72	72	72	73%
127	GPT-4o Mini (temp=1)	$0.0014	39.7s	70%	77	75	74	72	69	73%
129	Llama 3.1 70B	$0.0025	16.4s	68%	77	76	76	68	65	72%
134	GPT-5 Nano	$0.0042	1.5m	66%	78	73	71	69	67	72%
128	Gemma 3 4B	$0.0003	21.4s	70%	74	73	72	70	70	72%
131	Gemini 3.1 Flash Lite (Reasoning)	$0.0034	10.1s	68%	73	73	72	72	65	71%
133	GPT-OSS 120B	$0.0010	1.6m	68%	72	71	70	70	68	70%
136	Nemotron 3 Super	$0.0000	40.4s	66%	72	71	68	68	66	69%
132	GPT-4.1 Nano	$0.0010	14.3s	66%	74	70	70	67	65	69%
140	Mistral Small 3.2 24B	$0.0078	6.5m	59%	81	68	67	67	63	69%
138	Gemini 2.5 Flash Lite (Reasoning)	$0.0036	40.3s	61%	78	71	69	64	62	69%
139	Nemotron 3 Nano	$0.0016	1.7m	64%	73	73	71	65	61	69%
81.78%

Median	Evaluator	Top 3	Flop 3
100.0%	"Not X but Y" pattern overuse	100 DeepSeek V3 (2024-12-26); 100 GPT-4o, Aug. 6th (temp=0); 100 GPT-5.5	45 GPT-5 Nano; 50 Aion 2.0; 56 Gemini 2.5 Flash Lite (Reasoning)
48.7%	Adverb-first sentence starts	100 GPT-5.4 Nano (Reasoning, Low); 97 Writer: Palmyra X5; 96 Gemma 3 12B	0 Gemini 3.1 Flash Lite (Reasoning); 0 Gemini 3.1 Pro (Preview); 0 Inception Mercury 2
100.0%	Adverbs in dialogue tags	100 Grok 4.3 (Reasoning); 100 Gemini 3 Flash (Preview); 100 GPT-4.1	19 GPT-4.1 Nano; 26 Cydonia 24B V4.1; 48 Llama 3.1 70B
89.8%	AI-ism adverb frequency	99 o4 Mini; 99 Qwen 3.6 Flash; 99 GPT-5.5 (Reasoning)	52 Gemma 3 4B; 54 Cydonia 24B V4.1; 67 GPT-4.1 Nano
100.0%	AI-ism character names	100 GPT-5.4 Nano (Reasoning); 100 DeepSeek-V2 Chat; 100 Grok 4.3 (Reasoning)	60 Claude Opus 4; 88 Z.AI GLM 5; 92 Claude Sonnet 4
100.0%	AI-ism location names	100 GPT-5; 100 Cydonia 24B V4.1; 100 DeepSeek V4 Flash (Reasoning)	96 Gemini 2.5 Pro; 96 Z.AI GLM 4.5 Air
53.3%	AI-ism word frequency	91 Claude Sonnet 4.6 (Reasoning); 89 Claude Opus 4.7; 83 Claude Opus 4.8 (Reasoning, Low)	0 Gemma 3 4B; 0 GPT-4o Mini (temp=0); 0 Inception Mercury 2
93.3%	Cliché density	100 Gemini 3.1 Flash Lite (Preview); 100 MoonshotAI: Kimi K2.5; 100 Mistral Small 4 (Reasoning)	13 Mistral Small 3.2 24B; 13 GPT-4o Mini (temp=0); 47 Gemini 2.5 Flash Lite
96.0%	Dialogue tag variety (said vs. fancy)	100 Claude Sonnet 4.6; 100 GPT-5.1; 100 GPT-5.4	11 Gemini 2.5 Flash Lite; 18 GPT-OSS 120B; 24 GPT-4o, Aug. 6th (temp=1)
94.9%	Em-dash & semicolon overuse	100 Claude Sonnet 5; 100 Z.AI GLM 5 Turbo; 100 GPT-4o, Aug. 6th (temp=0)	0 GPT-4o, Aug. 6th (temp=1); 0 Mistral Small 4; 0 GPT-4.1 Nano
100.0%	Emotion telling (show vs. tell)	100 GPT-4o, Aug. 6th (temp=1); 100 o4 Mini High; 100 DeepSeek V4 Flash (Reasoning)	87 Mistral Small 3.2 24B; 88 GPT-4o Mini (temp=0); 97 Qwen 2.5 72B
99.2%	Filter word density	100 Z.AI GLM 5.2 (Reasoning, High); 100 GPT-5.5 (Reasoning, Low); 100 GPT-5.4 (Reasoning)	34 ByteDance Seed 2.0 Mini; 52 Nemotron 3 Nano; 62 Gemini 3.1 Flash Lite (Reasoning)
100.0%	Gibberish response detection	100 Qwen3 235B A22B Instruct 2507; 100 GPT-5.5 (Reasoning); 100 WizardLM 2 8x22b	80 Llama 3.1 70B; 91 Cydonia 24B V4.1; 99 DeepSeek V3 (2025-03-24)
100.0%	Markdown formatting overuse	100 Mistral NeMO; 100 Claude Opus 4; 100 Ministral 3 14B	80 Llama 3.1 70B; 85 Qwen 3 32B; 93 Cydonia 24B V4.1
100.0%	Missing dialogue indicators (quotation marks)	100 Z.AI GLM 5; 100 GPT-5 Nano; 100 Qwen3.6 Max Preview	63 Qwen 3.5 Flash; 80 Qwen 3.5 Plus (2026-02-15); 89 Mistral Small 3.2 24B
86.1%	Name drop frequency	100 Claude Opus 4.8 (Reasoning, Low); 100 Claude Sonnet 5 (Reasoning); 100 Gemini 3.1 Flash Lite (Preview)	9 GPT-5.2; 9 Qwen 3.5 9B; 17 GPT-5.4 Nano (Reasoning)
86.8%	Narrator intent-glossing	100 Z.AI GLM 4.7; 100 Gemini 3.5 Flash (Reasoning); 100 Gemini 3.1 Flash Lite	22 Nemotron 3 Super; 25 Claude Sonnet 5; 26 Claude Haiku 4.5
100.0%	Overuse of "that" (subordinate clause padding)	100 Claude Sonnet 4.5; 100 GPT-5.4; 100 GPT-5.4 (Reasoning, Low)	49 Mistral Small 3.2 24B; 64 ByteDance Seed 2.0 Mini; 79 Claude Haiku 4.5
100.0%	Paragraph length variance	100 Claude Opus 4.6; 100 Gemini 2.5 Flash; 100 Aion 2.0	64 Nemotron 3 Nano; 67 GPT-OSS 120B; 67 Inception Mercury 2
99.7%	Passive voice overuse	100 Gemini 3.1 Pro (Preview); 100 Qwen3.7 Max; 100 GPT-5.4 Mini	80 ByteDance Seed 2.0 Lite; 88 Qwen 3.5 35B; 93 Hermes 3 70B
100.0%	Past progressive (was/were + -ing) overuse	100 Mistral Large 2; 100 Qwen3.7 Max; 100 GPT-5.4 Mini (Reasoning, Low)	60 Llama 3.1 70B; 60 Z.AI GLM 4.6; 61 Mistral Small 3.2 24B
47.7%	Pronoun-first sentence starts	100 GPT-5.5 (Reasoning, Low); 100 GPT-5.5; 98 Llama 3.1 70B	0 Gemini 3.1 Flash Lite (Preview); 0 Gemini 3.1 Flash Lite; 0 ByteDance Seed 2.0 Lite
96.9%	Purple prose (modifier overload)	100 Inception Mercury 2; 100 Qwen 3.5 Flash; 100 Grok 4.3 (Reasoning)	58 Cydonia 24B V4.1; 69 Gemma 3 4B; 84 Gemini 3.5 Flash (Reasoning)
100.0%	Repeated phrase echo	100 Claude Opus 4; 100 Aion 3.0; 100 Aion 2.0	-
100.0%	Sentence length variance	100 MoonshotAI: Kimi K2.5; 100 GPT-4o Mini (temp=1); 100 DeepSeek V4 Pro	74 Mistral Small 3.2 24B; 92 Nemotron 3 Nano; 94 Qwen 3.5 35B
55.3%	Sentence opener variety	97 Cydonia 24B V4.1; 92 GPT-4o, Aug. 6th (temp=1); 86 GPT-4o Mini (temp=1)	32 Qwen 3.5 35B; 37 Gemini 3 Flash (Preview, Reasoning); 37 Qwen 3.5 Flash
21.2%	Subject-first sentence starts	89 Cydonia 24B V4.1; 79 Writer: Palmyra X5; 77 Hermes 3 70B	0 Gemma 4 26B; 0 Inception Mercury 2; 0 Qwen 3.5 9B
20.0%	Subordinate conjunction sentence starts	90 Cohere Command R+ (Aug. 2024); 78 DeepSeek V3.2; 77 Gemma 3 4B	0 DeepSeek V3 (2025-03-24); 0 GPT-5.1; 0 Aion 3.0 Mini
81.6%	Technical jargon density	100 Qwen 3.5 397B A17B; 100 o4 Mini High; 100 Qwen3.6 Max Preview	0 GPT-4o Mini (temp=1); 2 Ministral 3B; 7 ByteDance Seed 2.0 Lite
72.2%	Useless dialogue additions	100 Qwen 3.5 35B; 100 DeepSeek V4 Flash (Reasoning); 100 GPT-5.4 (Reasoning, Low)	0 Mistral NeMO; 0 GPT-4o Mini (temp=0); 0 Inception Mercury 2
