openai/gpt-4.1-mini

GPT-4.1 Mini

Release Date: Apr 14th, 2025
Context Size: 1M tokens
Reasoning: No
Benchmark Cost: $1.01
Speed: 129.8 tok/s

Categories

Creative Writing: 74.5%
Tooling: 97.9%
Language: 89.6%
Utility: 82.3%
Reasoning: 85.8%
Text Editing: 95.6%
Rule Following: 58.6%
Hallucination: 81.1%

Subcategories

Chart of per-subcategory scores (values not preserved): AI-isms, Prose Variety, Dialogue, Purple Prose, Mechanical Style, Clichés, XML, Comprehension, Generation, Word Counting, Sentence Counting, Paragraph Counting, Structural Counting, Data Extraction, Deduction, Attention, Transformation, Preservation, Structural Integrity, Constraint Adherence, False Positives, Content Invention, Output Corruption

Bad Writing Habits

Detects common prose quality anti-patterns in AI-generated creative writing, including passive voice, past progressive overuse, weak dialogue tags, filter words, purple prose, cliches, AI-ism words/adverbs/names, and more.

Scenario                       #1   #2   #3   #4   #5   Total
Detailed Writing Rules
                               76   75   74   70   70   73%
                               81   79   78   78   74   78%
                               82   79   76   75   75   77%
                               80   75   74   74   72   75%
                               80   79   76   74   73   76%
                               83   78   77   77   72   78%
Detailed Writing Rules                                  76.19%
genre
                               78   72   71   68   66   71%
                               79   78   76   71   67   74%
                               73   72   70   67   67   70%
                               78   76   74   70   70   74%
                               73   72   71   70   69   71%
                               86   81   80   79   79   81%
genre                                                   73.38%
Novelcrafter Default Prompt
                               76   73   70   66   62   69%
                               81   80   75   75   74   77%
                               81   75   75   71   67   74%
                               79   78   77   76   75   77%
                               78   76   73   70   67   73%
                               83   78   77   75   75   78%
Novelcrafter Default Prompt                             74.63%
Overall                                                 74.73%

Codex Extraction

Evaluates a model's ability to extract structured codex entries (characters, locations, objects, lore) from prose passages and return them as well-formed XML.
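
As a rough illustration of what "well-formed XML" means for this kind of task, the sketch below parses a model response with Python's standard library and rejects output whose tags do not nest correctly. The element names `codex`, `entry`, and `name` are hypothetical; the benchmark's actual schema is not given here.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as a single well-formed XML document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Hypothetical model outputs; the tag names are assumptions for illustration.
good = '<codex><entry type="character"><name>Mira</name></entry></codex>'
bad = '<codex><entry type="character"><name>Mira</entry></codex>'  # <name> never closed

good_ok = is_well_formed(good)
bad_ok = is_well_formed(bad)

# Count the extracted entries in the well-formed response.
entry_count = len(ET.fromstring(good).findall("entry"))
```

A well-formedness check like this only confirms the structure parses; whether the extracted entries are correct would need a separate comparison against expected entries.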

Scenario #1 #2 #3 #4 #5 Total
949291919092%
929290908991%
929292919192%
949191868088%
90.66%

Text Replacement

Tests deterministic text transformations: renaming characters/locations, expanding contractions, tense rewriting, POV shifts, gender swaps, combined transformations, and word avoidance. Scored by checking each expected change independently.
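
One way to read "checking each expected change independently" is sketched below: each expected edit is tested on its own, and the score is the share of edits that pass. The `ExpectedChange` type, the substring-based check, and the sample sentence are assumptions for illustration, not the benchmark's actual scorer.

```python
from dataclasses import dataclass


@dataclass
class ExpectedChange:
    """One hypothetical expected edit: `old` must be gone and `new` must appear."""
    old: str
    new: str


def score_replacement(output: str, changes: list[ExpectedChange]) -> float:
    """Return the percentage of expected changes that the output satisfies."""
    if not changes:
        return 100.0
    passed = sum(
        1 for c in changes
        if c.new in output and c.old not in output
    )
    return 100.0 * passed / len(changes)


# Source text was "Anna didn't leave Oakvale."; the model missed the location rename.
output = "Maria did not leave Oakvale."
changes = [
    ExpectedChange(old="Anna", new="Maria"),        # character rename: passes
    ExpectedChange(old="didn't", new="did not"),    # contraction expansion: passes
    ExpectedChange(old="Oakvale", new="Riverton"),  # location rename: fails
]
score = score_replacement(output, changes)
```

Scoring each change separately means a single missed rename lowers the score proportionally instead of failing the whole scenario.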

Scenario            #1    #2    #3    #4    #5    #6    #7    Total
Generic Prompt
                    100   100   100   100   100   100   100   100%
                    100   100   100   100   100   100   100   100%
                    99    99    99    99    99    99    98    99%
                    99    99    99    99    99    98    98    99%
                    100   100   100   100   100   100   100   100%
                    100   100   100   100   100   100   100   100%
                    92    92    91    91    91    89    87    90%
                    100   100   100   100   100   97    95    99%
                    95    95    92    92    92    92    88    92%
Generic Prompt                                                97.73%
Specific Prompt
                    100   100   100   100   100   100   100   100%
                    100   100   100   100   100   100   100   100%
                    100   99    99    99    99    96    96    99%
                    100   100   100   100   100   100   100   100%
                    100   100   100   100   100   100   100   100%
                    100   100   100   100   100   99    99    100%
                    95    94    94    92    92    92    91    93%
                    100   100   100   100   100   100   100   100%
                    100   100   100   100   100   100   100   100%
Specific Prompt                                               99.03%
Overall                                                       98.38%

Tool usage within Novelcrafter

Evaluates output messages related to tool usage within Novelcrafter.

Scenario   #1    #2    #3    #4    #5    #6    #7    #8    #9    #10   Total
           100   100   100   100   100   100   100   100   100   67    97%
Overall                                                                96.67%
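
The relationship between the ten per-run values and the two summary figures can be checked with a little arithmetic. The sketch below assumes the summary is a plain mean and that the displayed 67 is a rounded 66.67 (two of three checks passing); neither assumption is confirmed by the page.

```python
# Displayed per-run values from the row above: nine runs at 100, one at 67.
displayed = [100] * 9 + [67]
mean_displayed = sum(displayed) / len(displayed)  # 96.7, shown as 97% after rounding

# Assumption: the tenth run scored 2/3, which displays as 67 when rounded.
unrounded = [100.0] * 9 + [200.0 / 3.0]
mean_unrounded = round(sum(unrounded) / len(unrounded), 2)  # matches the 96.67% figure
```

Under these assumptions the 97% row total and the 96.67% overall figure describe the same runs at different rounding precision.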