openai/gpt-5.4-mini

GPT-5.4 Mini (Reasoning)

Release Date

Mar 17th, 2026

Context Size

400k

Reasoning

Yes

Benchmark Cost

$6.92

Speed

132.4 tok/s

Categories

Creative Writing    88.7%
Tooling             100.0%
Language            98.1%
Utility             94.4%
Reasoning           94.6%
Text Editing        95.8%
Rule Following      57.4%
Hallucination       96.2%

Subcategories

[Chart: per-subcategory scores on a 20% to 100% axis; the individual values are not recoverable. Subcategories shown: AI-isms, Prose Variety, Dialogue, Purple Prose, Mechanical Style, Clichés, XML, Comprehension, Generation, Word Counting, Sentence Counting, Paragraph Counting, Structural Counting, Data Extraction, Deduction, Attention, Transformation, Preservation, Structural Integrity, Constraint Adherence, False Positives, Content Invention, Output Corruption]

Bad Writing Habits

Detects common prose quality anti-patterns in AI-generated creative writing, including passive voice, past progressive overuse, weak dialogue tags, filter words, purple prose, clichés, AI-ism words/adverbs/names, and more.

Scenario                       #1    #2    #3    #4    #5    Total
Detailed Writing Rules
                               92    90    88    84    82    87%
                               92    92    91    88    86    90%
                               88    87    87    87    84    87%
                               91    91    91    90    89    90%
                               88    88    86    86    86    87%
                               93    91    90    89    88    90%
Detailed Writing Rules                                       88.49%
genre
                               91    87    83    81    80    84%
                               87    86    86    86    82    85%
                               89    87    87    85    83    86%
                               93    89    88    87    86    89%
                               91    91    90    89    86    89%
                               94    92    92    87    85    90%
genre                                                        87.40%
Novelcrafter Default Prompt
                               88    87    84    84    81    85%
                               88    87    86    85    85    86%
                               87    85    84    84    82    84%
                               90    89    89    87    87    89%
                               88    88    87    87    85    87%
                               92    92    90    89    86    90%
Novelcrafter Default Prompt                                  86.76%
Total                                                        87.55%
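
The Bad Writing Habits description names filter words as one of the anti-patterns it detects. As a rough illustration only, the sketch below measures how often filter words occur in a passage. The word list and the function name `filter_word_rate` are assumptions made for this example; the benchmark's actual detectors and scoring are not documented here.

```python
import re

# Hypothetical, deliberately short list of filter words (an assumption,
# not the benchmark's list).
FILTER_WORDS = {"saw", "felt", "heard", "noticed", "realized", "seemed"}


def filter_word_rate(text):
    """Return the fraction of words in `text` that are filter words."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for word in words if word in FILTER_WORDS)
    return hits / len(words)


sample = "She felt the cold and noticed the door."
print(filter_word_rate(sample))
```

A real detector would likely need more than a word list (for example, part-of-speech context to tell "felt" the verb from "felt" the fabric), so this should be read as a starting point rather than a reproduction of the test.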

Codex Extraction

Evaluates a model's ability to extract structured codex entries (characters, locations, objects, lore) from prose passages and return them as well-formed XML.

Scenario    #1    #2    #3    #4    #5    Total
            97    97    97    97    97    97%
            97    97    95    95    95    96%
            98    96    96    94    94    96%
            100   99    99    98    97    98%
Total                                     96.67%
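
The Codex Extraction description requires entries to come back as well-formed XML. A minimal sketch of that kind of check, using Python's standard `xml.etree.ElementTree` parser, is shown below. The element and attribute names (`codex`, `entry`, `type`, `name`) are hypothetical; the schema the benchmark actually expects is not given on this page.

```python
import xml.etree.ElementTree as ET


def parse_codex(xml_text):
    """Return (type, name) pairs, or None if the XML is not well-formed.

    The <entry type="..."><name>...</name></entry> layout is an assumed,
    illustrative schema.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return None
    return [(entry.get("type"), entry.findtext("name"))
            for entry in root.iter("entry")]


good = (
    '<codex>'
    '<entry type="character"><name>Mara</name></entry>'
    '<entry type="location"><name>Riverton</name></entry>'
    '</codex>'
)
print(parse_codex(good))              # list of (type, name) pairs
print(parse_codex("<codex><entry>"))  # None: tags are never closed
```

Well-formedness is only the first gate; whether the extracted entries match the passage would need a separate comparison against expected entries.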

Text Replacement

Tests deterministic text transformations: renaming characters/locations, expanding contractions, tense rewriting, POV shifts, gender swaps, combined transformations, and word avoidance. Scored by checking each expected change independently.

Scenario           #1    #2    #3    #4    #5    #6    #7    Total
Generic Prompt
                   100   100   100   100   100   100   89    98%
                   100   100   100   100   100   100   100   100%
                   99    99    99    99    99    99    99    99%
                   100   100   100   100   100   99    99    100%
                   100   100   100   100   100   100   100   100%
                   100   100   100   100   100   74    74    92%
                   93    93    93    92    92    91    91    92%
                   100   100   100   100   100   100   100   100%
                   100   99    97    96    92    91    88    95%
Generic Prompt                                               97.46%
Specific Prompt
                   100   100   100   100   100   100   69    96%
                   100   100   100   100   100   100   100   100%
                   100   100   100   100   100   99    99    100%
                   100   100   100   98    97    95    82    96%
                   100   100   100   100   100   100   100   100%
                   100   100   100   100   99    99    98    99%
                   97    96    96    95    95    95    94    95%
                   100   100   100   100   100   100   100   100%
                   100   100   100   100   100   100   99    100%
Specific Prompt                                              98.46%
Total                                                        97.96%
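
The Text Replacement description says each expected change is checked independently. One way to read that is sketched below: every expectation passes or fails on its own, and the score is the share that pass. The `ExpectedChange` structure, the substring checks, and the sample task are assumptions for illustration, not the benchmark's actual implementation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ExpectedChange:
    """One expectation, checked on its own (hypothetical structure)."""
    must_contain: Optional[str] = None      # text that should appear after the edit
    must_not_contain: Optional[str] = None  # text that should no longer appear


def change_passes(change: ExpectedChange, output: str) -> bool:
    """True when the output satisfies both sides of a single expectation."""
    if change.must_contain is not None and change.must_contain not in output:
        return False
    if change.must_not_contain is not None and change.must_not_contain in output:
        return False
    return True


def score(output: str, changes: List[ExpectedChange]) -> float:
    """Percentage of expectations met; a miss on one does not affect the others."""
    if not changes:
        return 100.0
    passed = sum(change_passes(c, output) for c in changes)
    return 100.0 * passed / len(changes)


# Hypothetical task: rename Anna to Mara and expand contractions.
output = "Mara did not see the river, but she couldn't turn back."
changes = [
    ExpectedChange(must_contain="Mara", must_not_contain="Anna"),
    ExpectedChange(must_contain="did not", must_not_contain="didn't"),
    ExpectedChange(must_contain="could not", must_not_contain="couldn't"),
]
print(f"{score(output, changes):.1f}%")
```

In this sample the rename and the first contraction pass while "couldn't" is left unexpanded, so two of three expectations are met. Partial credit of this kind would be consistent with rows in the table that score below 100% without dropping to zero.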

Tool usage within Novelcrafter

Evaluates output messages related to tool usage within Novelcrafter.

Scenario    #1    #2    #3    #4    #5    #6    #7    #8    #9    #10   Total
            100   100   100   100   100   100   100   100   100   100   100%
Total                                                                   100.00%