openai/gpt-5.4-mini

GPT-5.4 Mini

Release Date

Mar 17th, 2026

Context Size

400k

Reasoning

No

Benchmark Cost

$2.77

Speed

145.8 tok/s

Categories

Creative Writing    88.1%
Tooling             99.9%
Language            88.7%
Utility             79.4%
Reasoning           88.0%
Text Editing        90.6%
Rule Following      46.3%
Hallucination       78.4%

Subcategories

[Chart: per-subcategory scores, axis 20% to 100%. Labels: AI-isms, Prose Variety, Dialogue, Purple Prose, Mechanical Style, Clichés, XML, Comprehension, Generation, Word Counting, Sentence Counting, Paragraph Counting, Structural Counting, Data Extraction, Deduction, Attention, Transformation, Preservation, Structural Integrity, Constraint Adherence, False Positives, Content Invention, Output Corruption]

Bad Writing Habits

Detects common prose quality anti-patterns in AI-generated creative writing, including passive voice, past progressive overuse, weak dialogue tags, filter words, purple prose, cliches, AI-ism words/adverbs/names, and more.
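
As a rough illustration of what one such check could look like, here is a minimal sketch that measures the rate of filter words in a passage. The word list and the function name `filter_word_rate` are hypothetical; the benchmark's actual detectors and word lists are not described on this page.

```python
import re

# Hypothetical filter-word list, for illustration only.
FILTER_WORDS = {"saw", "heard", "felt", "noticed", "realized", "seemed"}

def filter_word_rate(text: str) -> float:
    """Return the share of words in `text` that appear in FILTER_WORDS."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in FILTER_WORDS)
    return hits / len(words)

sample = "She felt the cold and noticed the door was open."
print(f"{filter_word_rate(sample):.2f}")  # 2 of 10 words -> 0.20
```

A real detector would likely need more than word matching (passive voice, for example, depends on sentence structure), so this only shows the simplest case.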

Scenario                         #1   #2   #3   #4   #5    Total
Detailed Writing Rules
                                 90   89   87   84   81      86%
                                 90   89   87   85   80      86%
                                 91   88   87   86   86      88%
                                 91   90   89   89   88      89%
                                 91   89   88   88   87      89%
                                 90   88   88   88   88      88%
Detailed Writing Rules                                    87.70%
genre
                                 86   85   83   83   80      83%
                                 89   88   85   85   84      86%
                                 90   89   89   86   85      88%
                                 95   90   87   86   85      89%
                                 91   89   88   86   85      88%
                                 92   90   89   88   86      89%
genre                                                     87.12%
Novelcrafter Default Prompt
                                 88   87   86   86   82      86%
                                 87   87   86   83   82      85%
                                 88   88   87   86   84      87%
                                 88   87   87   86   84      87%
                                 91   90   90   89   88      90%
                                 90   87   86   85   84      87%
Novelcrafter Default Prompt                               86.74%
Overall                                                   87.19%
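
The final 87.19% is consistent with an unweighted mean of the three prompt-group scores (87.70%, 87.12%, 86.74%). That is an inference from the reported numbers, not a documented formula; the sketch below only checks the arithmetic.

```python
from statistics import mean

# Group scores as reported for Bad Writing Habits.
group_scores = {
    "Detailed Writing Rules": 87.70,
    "genre": 87.12,
    "Novelcrafter Default Prompt": 86.74,
}

# Assumption: the section score is the plain average of the group scores.
overall = round(mean(list(group_scores.values())), 2)
print(overall)  # 87.19
```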

Codex Extraction

Evaluates a model's ability to extract structured codex entries (characters, locations, objects, lore) from prose passages and return them as well-formed XML.
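
A minimal sketch of the well-formedness part of such a check, using Python's standard `xml.etree.ElementTree`. The element and attribute names (`codex`, `entry`, `type`, `name`) and the helper `parse_codex` are assumptions for illustration; the schema the benchmark actually expects is not given here.

```python
import xml.etree.ElementTree as ET

def parse_codex(xml_text: str):
    """Return (type, name) pairs, or None if the XML is not well-formed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return None  # malformed output fails before any content check
    # Hypothetical schema: <entry type="..."><name>...</name></entry>
    return [(e.get("type"), e.findtext("name")) for e in root.iter("entry")]

good = '<codex><entry type="character"><name>Mara</name></entry></codex>'
bad = '<codex><entry type="character"><name>Mara</entry></codex>'

print(parse_codex(good))  # [('character', 'Mara')]
print(parse_codex(bad))   # None (mismatched closing tag)
```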

Scenario                         #1   #2   #3   #4   #5    Total
                                 93   93   93   92   92      93%
                                 95   95   94   93   92      94%
                                 96   91   90   89   86      91%
                                 98   93   93   92   91      93%
Overall                                                   92.51%

Text Replacement

Tests deterministic text transformations: renaming characters/locations, expanding contractions, tense rewriting, POV shifts, gender swaps, combined transformations, and word avoidance. Scored by checking each expected change independently.
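
One way to picture "checking each expected change independently" is a score that counts how many individual expectations hold in the output. This is a sketch under assumptions: the function `score_replacement` and its substring-based checks are hypothetical, and the benchmark's real checker may work differently.

```python
def score_replacement(output: str,
                      must_contain: list[str],
                      must_not_contain: list[str]) -> float:
    """Percentage of independent checks that pass for one model output."""
    checks = [s in output for s in must_contain]
    checks += [s not in output for s in must_not_contain]
    if not checks:
        return 100.0
    return 100.0 * sum(checks) / len(checks)

# Task: rename "John" to "Marcus" and expand "didn't" to "did not".
output = "Marcus did not see John leave."
score = score_replacement(
    output,
    must_contain=["Marcus", "did not"],
    must_not_contain=["John", "didn't"],
)
print(score)  # 75.0: one leftover "John" fails a single check
```

Scoring each change separately means a single missed rename lowers the score proportionally instead of failing the whole scenario.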

Scenario                         #1   #2   #3   #4   #5   #6   #7    Total
Generic Prompt
                                100  100  100  100  100  100  100     100%
                                100  100  100  100  100  100  100     100%
                                100   96   96   96   96   96   95      97%
                                 90   88   88   88   88   88   88      88%
                                100  100  100  100  100  100  100     100%
                                 83   83   83   83   83   80   74      81%
                                 93   91   91   90   90   89   89      91%
                                100  100  100  100  100  100   99     100%
                                 99   99   97   96   96   88   82      94%
Generic Prompt                                                      94.47%
Specific Prompt
                                 89   89   89   89   89   89   78      87%
                                100  100  100  100  100  100  100     100%
                                 99   99   98   98   98   98   98      98%
                                100  100  100  100   99   99   97      99%
                                100  100  100  100  100  100  100     100%
                                 95   95   95   95   95   95   93      95%
                                 92   90   90   89   89   89   87      90%
                                100  100  100  100  100  100  100     100%
                                100  100  100  100  100  100  100     100%
Specific Prompt                                                     96.59%
Overall                                                             95.53%

Tool usage within Novelcrafter

Evaluates output messages related to tool usage within Novelcrafter.

Scenario                         #1   #2   #3   #4   #5   #6   #7   #8   #9  #10    Total
                                100  100  100  100  100  100  100  100  100  100     100%
Overall                                                                           100.00%