openai/gpt-5.4

GPT-5.4 (Reasoning, Low)

Release Date

Mar 5th, 2026

Context Size

1M tokens

Reasoning

Yes

Benchmark Cost

$12.23

Speed

53.8 tok/s

Categories

Creative Writing 90.5%
Tooling 100.0%
Language 90.8%
Utility 95.3%
Reasoning 94.3%
Text Editing 98.0%
Rule Following 70.0%
Hallucination 92.3%

Subcategories

Chart labels: AI-isms, Prose Variety, Dialogue, Purple Prose, Mechanical Style, Clichés, XML, Comprehension, Generation, Word Counting, Sentence Counting, Paragraph Counting, Structural Counting, Data Extraction, Deduction, Attention, Transformation, Preservation, Structural Integrity, Constraint Adherence, False Positives, Content Invention, Output Corruption

Bad Writing Habits

Detects common prose quality anti-patterns in AI-generated creative writing, including passive voice, past progressive overuse, weak dialogue tags, filter words, purple prose, cliches, AI-ism words/adverbs/names, and more.
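
As a rough illustration of this kind of detection, the sketch below counts two of the listed anti-patterns (filter words and passive voice) in a passage. The word list, the regular expression, and the function name `count_anti_patterns` are hypothetical and chosen only for this example; the benchmark's actual rules and scoring are not described in this page.

```python
import re

# Hypothetical filter-word list, for illustration only.
FILTER_WORDS = {"saw", "felt", "heard", "noticed", "realized", "seemed"}

# Crude passive-voice heuristic (an assumption): a form of "to be"
# followed by a word ending in "-ed". It misses irregular participles.
PASSIVE_PATTERN = re.compile(
    r"\b(?:was|were|is|are|been|being|be)\s+\w+ed\b", re.IGNORECASE
)


def count_anti_patterns(text):
    """Return a count per anti-pattern found in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return {
        "filter_words": sum(1 for w in words if w in FILTER_WORDS),
        "passive_voice": len(PASSIVE_PATTERN.findall(text)),
    }


sample = "She felt the door was opened by someone. He noticed the light."
print(count_anti_patterns(sample))
```

A real checker would need far richer rules (dialogue tags, clichés, AI-ism vocabulary); this only shows the general shape of a rule-based count.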

Scenario #1 #2 #3 #4 #5 Total
Detailed Writing Rules
93 92 91 86 83 89%
92 92 91 89 86 90%
90 89 88 87 85 88%
95 94 93 93 92 93%
95 94 93 91 90 92%
93 93 92 90 89 91%
Detailed Writing Rules 90.63%
genre
88 87 86 85 85 86%
90 89 86 85 82 86%
92 91 88 87 86 89%
93 93 92 89 85 91%
92 91 91 90 88 90%
94 94 92 88 86 91%
genre 88.92%
Novelcrafter Default Prompt
91 91 90 89 88 90%
93 91 90 89 89 90%
89 87 86 86 85 86%
92 90 89 88 87 89%
91 90 90 90 89 90%
96 94 91 91 91 93%
Novelcrafter Default Prompt 89.79%
89.78%

Codex Extraction

Evaluates a model's ability to extract structured codex entries (characters, locations, objects, lore) from prose passages and return them as well-formed XML.
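
A minimal sketch of what checking such output for well-formedness might look like, using Python's standard `xml.etree.ElementTree`. The `<codex>`/`<entry>`/`<name>` schema and the function name `parse_codex` are assumptions made for this example; the benchmark's real schema is not shown here.

```python
import xml.etree.ElementTree as ET


def parse_codex(xml_text):
    """Return (type, name) pairs, or None if the XML is not well-formed.

    The element and attribute names are hypothetical.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return None
    return [(e.get("type"), e.findtext("name")) for e in root.iter("entry")]


good = (
    "<codex>"
    '<entry type="character"><name>Mira</name></entry>'
    '<entry type="location"><name>Ashfall</name></entry>'
    "</codex>"
)
broken = "<codex><entry>"  # unclosed tags, so parsing fails

print(parse_codex(good))
print(parse_codex(broken))
```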

Scenario #1 #2 #3 #4 #5 Total
97 97 97 96 91 95%
96 96 95 93 92 95%
99 98 98 97 97 98%
94 93 93 93 93 93%
95.21%

Text Replacement

Tests deterministic text transformations: renaming characters/locations, expanding contractions, tense rewriting, POV shifts, gender swaps, combined transformations, and word avoidance. Scored by checking each expected change independently.
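
One way to picture "checking each expected change independently" is a list of small pass/fail checks whose pass rate becomes the score. The sketch below assumes simple substring checks; the `ExpectedChange` class and `score` function are hypothetical names for this example, not the benchmark's implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExpectedChange:
    """One independently scored expectation (hypothetical structure)."""
    must_contain: Optional[str] = None
    must_not_contain: Optional[str] = None


def score(output, checks):
    """Return the percentage of checks the output satisfies."""
    passed = 0
    for check in checks:
        ok = True
        if check.must_contain is not None and check.must_contain not in output:
            ok = False
        if check.must_not_contain is not None and check.must_not_contain in output:
            ok = False
        passed += ok
    return 100.0 * passed / len(checks)


# Task: rename "Anna" to "Elena" and expand the contraction "don't".
checks = [
    ExpectedChange(must_contain="Elena"),
    ExpectedChange(must_not_contain="Anna"),
    ExpectedChange(must_contain="do not"),
    ExpectedChange(must_not_contain="don't"),
]
output = "Elena said she did what she could, but I do not know why Anna left."
print(score(output, checks))
```

Scoring each change separately means one missed rename lowers the score without zeroing it, which fits the near-perfect but not uniform rows in the results.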

Scenario #1 #2 #3 #4 #5 #6 #7 Total
Generic Prompt
100 100 100 100 100 100 100 100%
100 100 100 100 100 100 100 100%
99 99 99 95 93 90 90 95%
100 100 100 100 99 99 99 100%
100 100 100 100 100 100 100 100%
100 100 100 100 100 100 100 100%
98 98 97 97 96 96 96 97%
100 100 100 100 100 100 100 100%
100 99 99 96 96 91 88 96%
Generic Prompt 98.58%
Specific Prompt
100 100 100 89 89 89 89 94%
100 100 100 100 100 100 100 100%
100 99 99 99 99 99 99 99%
100 100 100 100 99 99 99 100%
100 100 100 100 100 100 100 100%
100 100 100 100 100 100 100 100%
99 98 97 96 96 96 96 97%
100 100 100 100 100 100 100 100%
100 100 100 100 100 100 100 100%
Specific Prompt 98.88%
98.73%

Tool usage within Novelcrafter

Evaluates the output messages a model produces for tool usage within Novelcrafter.

Scenario #1 #2 #3 #4 #5 #6 #7 #8 #9 #10 Total
100 100 100 100 100 100 100 100 100 100 100%
100.00%