openai/gpt-5-mini

GPT-5 Mini

Release Date

Aug 7th, 2025

Context Size

400k

Reasoning

Yes

Benchmark Cost

$4.47

Speed

78.6 tok/s

Categories

Creative Writing 80.5%
Tooling 100.0%
Language 96.5%
Utility 98.4%
Reasoning 94.4%
Text Editing 97.1%
Rule Following 76.4%
Hallucination 97.7%

Subcategories

AI-isms, Prose Variety, Dialogue, Purple Prose, Mechanical Style, Clichés, XML, Comprehension, Generation, Word Counting, Sentence Counting, Paragraph Counting, Structural Counting, Data Extraction, Deduction, Attention, Transformation, Preservation, Structural Integrity, Constraint Adherence, False Positives, Content Invention, Output Corruption

Bad Writing Habits

Detects common prose quality anti-patterns in AI-generated creative writing, including passive voice, past progressive overuse, weak dialogue tags, filter words, purple prose, cliches, AI-ism words/adverbs/names, and more.

Scenario #1 #2 #3 #4 #5 Total
Detailed Writing Rules
82 77 76 74 74 77%
81 81 80 75 72 78%
83 82 82 81 78 81%
84 83 83 79 77 81%
84 81 80 79 78 81%
85 83 82 82 80 82%
Detailed Writing Rules 79.98%
genre
80 78 76 76 75 77%
81 80 77 74 72 77%
78 78 76 76 73 76%
84 79 79 79 79 80%
82 80 75 74 73 77%
83 82 77 76 75 79%
genre 77.60%
Novelcrafter Default Prompt
80 79 75 72 67 75%
80 79 77 74 72 76%
81 80 80 79 75 79%
82 82 81 79 78 81%
81 79 79 79 77 79%
83 83 81 81 76 81%
Novelcrafter Default Prompt 78.39%
78.66%
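
The overall figure appears consistent with an unweighted mean of the three prompt-group averages listed above. The page does not state how the total is aggregated, so the following is only a minimal Python check under that assumption; the helper name `unweighted_mean` is hypothetical.

```python
# Group averages as reported in the table above.
group_averages = {
    "Detailed Writing Rules": 79.98,
    "genre": 77.60,
    "Novelcrafter Default Prompt": 78.39,
}


def unweighted_mean(values):
    """Plain arithmetic mean; assumes every group carries equal weight."""
    values = list(values)
    return sum(values) / len(values)


overall = unweighted_mean(group_averages.values())
print(f"{overall:.2f}%")  # prints 78.66%
```

If the groups were weighted by number of runs instead, the result could differ slightly; here each group has the same number of rows, so both readings agree.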

Codex Extraction

Evaluates a model's ability to extract structured codex entries (characters, locations, objects, lore) from prose passages and return them as well-formed XML.

Scenario #1 #2 #3 #4 #5 Total
97 96 96 95 94 96%
97 97 96 95 94 96%
98 95 94 94 93 95%
98 95 95 92 90 94%
95.04%
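
As a rough illustration of what "well-formed XML" means for this test, the sketch below parses a model response and lists the entries it contains. The element and attribute names (`codex`, `entry`, `type`, `name`) and the sample values are assumptions made for the example, not the benchmark's actual schema, and the real scoring is not described on this page.

```python
import xml.etree.ElementTree as ET


def parse_codex(xml_text):
    """Return (type, name) pairs, or None if the XML is not well formed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        # Unclosed tags, stray text after the root, and similar errors land here.
        return None
    return [(entry.get("type"), entry.findtext("name")) for entry in root.iter("entry")]


# Hypothetical model output; the schema is assumed for illustration only.
sample = """<codex>
  <entry type="character"><name>Mara</name></entry>
  <entry type="location"><name>Greyhaven</name></entry>
</codex>"""

print(parse_codex(sample))            # two (type, name) pairs
print(parse_codex("<codex><entry>"))  # None: the tags are never closed
```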

Text Replacement

Tests deterministic text transformations: renaming characters/locations, expanding contractions, tense rewriting, POV shifts, gender swaps, combined transformations, and word avoidance. Scored by checking each expected change independently.

Scenario #1 #2 #3 #4 #5 #6 #7 Total
Generic Prompt
100 100 100 100 100 100 100 100%
100 100 100 100 100 100 100 100%
99 99 99 99 98 98 96 98%
100 100 100 100 99 99 98 100%
100 100 100 100 100 100 100 100%
100 100 100 99 99 97 74 96%
96 96 95 95 95 93 93 95%
100 100 100 100 100 100 99 100%
97 95 93 91 90 89 89 92%
Generic Prompt 97.80%
Specific Prompt
100 100 100 100 100 100 100 100%
100 100 100 100 100 100 100 100%
100 100 100 100 100 100 100 100%
100 100 100 100 100 100 99 100%
100 100 100 100 100 100 100 100%
100 100 100 100 100 100 98 100%
99 98 98 98 96 95 95 97%
100 100 100 100 100 100 100 100%
100 100 100 100 100 100 100 100%
Specific Prompt 99.64%
98.72%
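
The description says each expected change is checked independently. One way such scoring could work is sketched below, assuming simple substring checks; the function name, its parameters, and the sample text are hypothetical, and the benchmark's real checker may be stricter.

```python
def score_replacement(output, expected_present, expected_absent):
    """Percentage of independent checks that pass.

    expected_present: strings that should appear in the output.
    expected_absent: strings that should no longer appear.
    """
    checks = [s in output for s in expected_present]
    checks += [s not in output for s in expected_absent]
    return 100.0 * sum(checks) / len(checks)


# Hypothetical task: rename John to Elena and expand contractions.
# The model missed one occurrence of "John".
output = "Elena walked to Northport. She did not see John's letter."
score = score_replacement(
    output,
    expected_present=["Elena", "Northport", "did not"],
    expected_absent=["John", "didn't"],
)
print(f"{score:.0f}%")  # prints 80%
```

Because every check counts separately, a single missed rename lowers the score without zeroing it, which matches the partial scores such as 74 and 96 in the table.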

Tool usage within Novelcrafter

Checks the output messages related to tool usage within Novelcrafter.

Scenario #1 #2 #3 #4 #5 #6 #7 #8 #9 #10 Total
100 100 100 100 100 100 100 100 100 100 100%
100.00%