openai/gpt-5.4-mini

GPT-5.4 Mini (Reasoning, Low)

Release Date: Mar 17th, 2026
Context Size: 400k
Reasoning: Yes
Benchmark Cost: $3.31
Speed: 130.4 tok/s

Categories

Creative Writing: 87.7%
Tooling: 100.0%
Language: 92.4%
Utility: 88.5%
Reasoning: 92.3%
Text Editing: 92.6%
Rule Following: 34.0%
Hallucination: 98.4%

Subcategories

Subcategories charted (scores not preserved in this text): AI-isms, Prose Variety, Dialogue, Purple Prose, Mechanical Style, Clichés, XML, Comprehension, Generation, Word Counting, Sentence Counting, Paragraph Counting, Structural Counting, Data Extraction, Deduction, Attention, Transformation, Preservation, Structural Integrity, Constraint Adherence, False Positives, Content Invention, Output Corruption

Bad Writing Habits

Detects common prose quality anti-patterns in AI-generated creative writing, including passive voice, past progressive overuse, weak dialogue tags, filter words, purple prose, cliches, AI-ism words/adverbs/names, and more.

Scenario                       #1   #2   #3   #4   #5   Total
Detailed Writing Rules
                               90   89   87   86   85   87%
                               92   89   87   87   87   88%
                               91   85   85   85   84   86%
                               91   89   88   87   84   88%
                               91   90   88   88   85   88%
                               91   90   89   88   86   89%
Detailed Writing Rules                                  87.80%
genre
                               85   84   83   82   80   83%
                               88   88   86   85   82   86%
                               90   88   87   86   86   87%
                               89   88   87   87   84   87%
                               91   90   89   89   89   90%
                               91   88   88   88   82   87%
genre                                                   86.63%
Novelcrafter Default Prompt
                               89   86   84   83   79   84%
                               90   86   85   85   84   86%
                               87   86   85   84   79   84%
                               89   88   86   86   84   87%
                               88   87   86   86   84   86%
                               91   88   88   87   87   88%
Novelcrafter Default Prompt                             85.89%
Overall                                                 86.78%

Codex Extraction

Evaluates a model's ability to extract structured codex entries (characters, locations, objects, lore) from prose passages and return them as well-formed XML.
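As a rough illustration of what "well-formed XML" means for this kind of check, the sketch below parses a model response and counts entries by type. The `<codex>`/`<entry type="...">` layout and the `summarize_codex` helper are assumptions made for the example; the benchmark's actual schema and grader are not described here.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from typing import Optional


def summarize_codex(xml_text: str) -> Optional[Counter]:
    """Return a count of entries per type, or None if the XML is not well-formed.

    Assumes a hypothetical layout: <codex><entry type="character">...</entry></codex>.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        # Unclosed tags, stray text after the root element, etc.
        return None
    return Counter(entry.get("type", "unknown") for entry in root.iter("entry"))


good = (
    "<codex>"
    '<entry type="character"><name>Mara</name></entry>'
    '<entry type="location"><name>Eastport</name></entry>'
    "</codex>"
)
bad = '<codex><entry type="character"><name>Mara</name></codex>'

print(summarize_codex(good))
print(summarize_codex(bad))
```

A real grader would presumably also compare the extracted names against the entries expected for each passage; this sketch only covers the well-formedness step.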

Scenario   #1   #2   #3   #4   #5   Total
           96   96   95   95   95   95%
           96   95   94   94   93   94%
           94   91   90   89   88   91%
           92   90   89   88   87   89%
Overall                             92.42%

Text Replacement

Tests deterministic text transformations: renaming characters/locations, expanding contractions, tense rewriting, POV shifts, gender swaps, combined transformations, and word avoidance. Scored by checking each expected change independently.
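Scoring each expected change independently can be pictured as a list of separate pass/fail checks whose pass rate becomes the score. The following is a minimal sketch under that assumption; `score_replacement` and its substring-based checks are hypothetical and are not the benchmark's actual scorer.

```python
def score_replacement(output: str, must_contain: list[str], must_not_contain: list[str]) -> float:
    """Score a transformed text as the percentage of independent checks that pass.

    Each expected new string and each string that should have been removed
    counts as one check, so a partially correct rewrite earns partial credit.
    """
    checks = [s in output for s in must_contain]
    checks += [s not in output for s in must_not_contain]
    if not checks:
        return 100.0  # nothing to verify
    return 100.0 * sum(checks) / len(checks)


# Hypothetical rename task: "Anna" -> "Mara", "Westport" -> "Eastport".
full = score_replacement("Mara walked to Eastport.", ["Mara", "Eastport"], ["Anna", "Westport"])
partial = score_replacement("Mara walked to Westport.", ["Mara", "Eastport"], ["Anna", "Westport"])
print(full, partial)
```

Checking changes one by one means a single missed rename lowers the score proportionally instead of failing the whole scenario, which would be consistent with the non-round scores in the table.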

Scenario           #1    #2    #3    #4    #5    #6    #7    Total
Generic Prompt
                   100   100   100   100   100   89    89    97%
                   100   100   100   100   100   100   100   100%
                   99    99    99    99    99    98    95    98%
                   91    90    89    89    88    87    84    88%
                   100   100   100   100   100   100   100   100%
                   100   100   100   100   74    74    74    89%
                   94    93    91    90    89    89    88    91%
                   100   100   100   100   100   100   100   100%
                   96    96    96    96    96    95    91    95%
Generic Prompt                                               95.34%
Specific Prompt
                   100   100   100   100   89    89    89    95%
                   100   100   100   100   100   100   100   100%
                   99    99    98    98    97    96    96    98%
                   100   100   100   100   99    98    83    97%
                   100   100   100   100   100   100   100   100%
                   100   100   95    95    95    94    94    96%
                   94    94    93    92    92    92    89    92%
                   100   100   100   100   100   100   100   100%
                   100   100   100   100   98    98    97    99%
Specific Prompt                                              97.54%
Overall                                                      96.44%

Tool usage within Novelcrafter

Evaluates output messages related to tool usage within Novelcrafter.

Scenario   #1    #2    #3    #4    #5    #6    #7    #8    #9    #10   Total
           100   100   100   100   100   100   100   100   100   100   100%
Overall                                                                100.00%