openai/gpt-5.4

GPT-5.4 (Reasoning)

Release Date: Mar 5th, 2026
Context Size: 1M tokens
Reasoning: Yes
Benchmark Cost: $9.43
Speed: 51.2 tok/s

Categories

Creative Writing: 90.9%
Tooling: 99.7%
Language: 80.7%
Utility: 81.3%
Reasoning: 94.2%
Text Editing: 97.2%
Rule Following: 58.1%
Hallucination: 73.7%

Subcategories

[Chart of subcategory scores, 20% to 100% axis: AI-isms, Prose Variety, Dialogue, Purple Prose, Mechanical Style, Clichés, XML, Comprehension, Generation, Word Counting, Sentence Counting, Paragraph Counting, Structural Counting, Data Extraction, Deduction, Attention, Transformation, Preservation, Structural Integrity, Constraint Adherence, False Positives, Content Invention, Output Corruption]

Bad Writing Habits

Detects common prose quality anti-patterns in AI-generated creative writing, including passive voice, past progressive overuse, weak dialogue tags, filter words, purple prose, cliches, AI-ism words/adverbs/names, and more.

Scenario                       #1   #2   #3   #4   #5    Total
Detailed Writing Rules
                               92   91   90   88   86      89%
                               97   91   90   90   90      92%
                               94   92   90   90   90      91%
                               94   92   91   90   90      92%
                               91   91   91   88   88      90%
                               93   92   91   90   90      91%
Detailed Writing Rules                                  90.81%
genre
                               89   88   86   86   83      86%
                               88   87   86   85   82      86%
                               92   90   89   89   85      89%
                               93   92   92   91   90      91%
                               91   90   90   89   88      89%
                               96   95   93   90   86      92%
genre                                                   88.94%
Novelcrafter Default Prompt
                               91   90   90   89   88      90%
                               93   93   92   91   90      92%
                               90   89   88   88   86      88%
                               94   93   91   91   89      92%
                               92   90   89   89   89      90%
                               96   92   91   91   88      92%
Novelcrafter Default Prompt                             90.44%
Overall                                                 90.06%
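
To make the kind of check described for this category concrete, here is a minimal sketch of an anti-pattern counter. It is not the benchmark's detector: the word list, the passive-voice regex, and the function name `count_habits` are all assumptions made for illustration, and the page does not say how counts are turned into the percentages above.

```python
import re

# Illustrative word list only (an assumption); the benchmark's actual lists
# are not given in this document.
FILTER_WORDS = {"saw", "felt", "heard", "noticed", "realized", "seemed"}

# Crude passive-voice heuristic: a form of "to be" followed by a word ending
# in "-ed". It misses irregular participles and can flag adjectives.
PASSIVE_RE = re.compile(
    r"\b(?:was|were|is|are|been|being|be)\s+\w+ed\b", re.IGNORECASE
)


def count_habits(text: str) -> dict:
    """Count two of the listed anti-patterns in a passage of prose."""
    words = re.findall(r"[a-z']+", text.lower())
    return {
        "filter_words": sum(1 for w in words if w in FILTER_WORDS),
        "passive_voice": len(PASSIVE_RE.findall(text)),
    }


sample = "She felt the door was opened by the wind. He noticed it."
print(count_habits(sample))
```

A real detector would likely need part-of-speech tagging to separate true passives from adjectives; the regex here only shows the shape of the check.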

Codex Extraction

Evaluates a model's ability to extract structured codex entries (characters, locations, objects, lore) from prose passages and return them as well-formed XML.

Scenario     #1   #2   #3   #4   #5    Total
             96   95   95   95   94      95%
             94   93   93   93   92      93%
             98   97   97   97   97      97%
             88   86   86   85   85      86%
Overall                               92.86%
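
As a rough illustration of the "well-formed XML" requirement, the sketch below parses a model response and returns the entries it contains, or `None` when the response does not parse. The `<codex>`/`<entry>`/`<name>` layout and the function name `parse_codex` are hypothetical; the benchmark's actual schema is not given here.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def parse_codex(xml_text: str) -> Optional[list]:
    """Return the entries found in xml_text, or None if it is not well-formed.

    Assumes a hypothetical layout:
    <codex><entry type="..."><name>...</name></entry>...</codex>
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        # Unclosed tags, stray text after the root element, and so on.
        return None
    return [
        {"type": entry.get("type", ""), "name": entry.findtext("name", default="")}
        for entry in root.iter("entry")
    ]


good = (
    '<codex>'
    '<entry type="character"><name>Mira</name></entry>'
    '<entry type="location"><name>Eastport</name></entry>'
    '</codex>'
)
print(parse_codex(good))
print(parse_codex("<codex><entry>"))
```

Well-formedness is only the first gate; comparing the extracted names and types against the expected entries would be a separate step.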

Text Replacement

Tests deterministic text transformations: renaming characters/locations, expanding contractions, tense rewriting, POV shifts, gender swaps, combined transformations, and word avoidance. Scored by checking each expected change independently.

Scenario           #1   #2   #3   #4   #5   #6   #7    Total
Generic Prompt
                  100  100  100  100  100  100   89      98%
                  100  100  100  100  100  100  100     100%
                   99   99   99   99   99   99   99      99%
                  100  100  100  100  100  100   99     100%
                  100  100  100  100  100  100  100     100%
                  100  100  100  100  100   99   97      99%
                  100   96   96   95   94   93   93      95%
                  100  100  100  100  100  100  100     100%
                  100  100  100   96   95   89   89      96%
Generic Prompt                                        98.67%
Specific Prompt
                  100   89   89   89   89   89   89      90%
                  100  100  100  100  100  100  100     100%
                   99   99   99   99   99   99   98      99%
                  100  100  100  100  100  100  100     100%
                  100  100  100  100  100  100  100     100%
                  100  100  100  100  100  100   99     100%
                   98   98   98   98   97   96   96      97%
                  100  100  100  100  100  100  100     100%
                  100  100  100  100  100  100  100     100%
Specific Prompt                                       98.52%
Overall                                               98.59%
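
The phrase "checking each expected change independently" can be sketched as follows. This is one plausible reading, not the benchmark's scorer: the `ExpectedChange` type, the substring test, and the function name `score_replacements` are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class ExpectedChange:
    """One expected edit: `old` should be gone and `new` should be present."""
    old: str
    new: str


def score_replacements(output: str, changes: list) -> float:
    """Return the percentage of expected changes that `output` satisfies.

    Each change is checked on its own, so one missed rename does not
    reduce the credit given for the others.
    """
    if not changes:
        return 100.0
    passed = sum(
        1 for c in changes if c.new in output and c.old not in output
    )
    return 100.0 * passed / len(changes)


output = "Maria walked to Eastport. She did not stop."
changes = [
    ExpectedChange("Mary", "Maria"),        # character rename
    ExpectedChange("Westport", "Eastport"), # location rename
    ExpectedChange("didn't", "did not"),    # contraction expansion
    ExpectedChange("can't", "cannot"),      # expected but absent from output
]
print(score_replacements(output, changes))
```

Plain substring matching is the simplest choice; a stricter scorer might match on word boundaries so that a new name containing the old one is not penalized.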

Tool usage within Novelcrafter

Output messages related to tool usage within Novelcrafter.

Scenario     #1   #2   #3   #4   #5   #6   #7   #8   #9  #10    Total
            100  100  100  100  100  100  100  100  100  100     100%
Overall                                                       100.00%