mistralai/mistral-small-2603

Mistral Small 4

Release Date

Mar 16th, 2026

Context Size

265k

Reasoning

No

Benchmark Cost

$0.47

Speed

104.5 tok/s

Categories

Creative Writing: 81.1%
Tooling: 95.0%
Language: 52.0%
Utility: 78.3%
Reasoning: 78.7%
Text Editing: 91.0%
Rule Following: 62.2%
Hallucination: 73.4%

Subcategories

AI-isms, Prose Variety, Dialogue, Purple Prose, Mechanical Style, Clichés, XML, Comprehension, Generation, Word Counting, Sentence Counting, Paragraph Counting, Structural Counting, Data Extraction, Deduction, Attention, Transformation, Preservation, Structural Integrity, Constraint Adherence, False Positives, Content Invention, Output Corruption

Bad Writing Habits

Detects common prose quality anti-patterns in AI-generated creative writing, including passive voice, past progressive overuse, weak dialogue tags, filter words, purple prose, cliches, AI-ism words/adverbs/names, and more.
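
This page does not describe how the detector is implemented. As a rough illustration of one of the listed checks, the sketch below measures how often filter words appear in a passage. The word list and the function name `filter_word_rate` are assumptions made for this example, not the benchmark's actual rules.

```python
import re

# Hypothetical filter-word list, for illustration only.
# The benchmark's real word lists are not given on this page.
FILTER_WORDS = {"saw", "felt", "heard", "noticed", "realized", "seemed"}


def filter_word_rate(text: str) -> float:
    """Return the fraction of words in `text` that are filter words."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for word in words if word in FILTER_WORDS)
    return hits / len(words)
```

A lower rate would count in the model's favour under this kind of check; a real detector would presumably combine many such signals.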

Scenario                       #1   #2   #3   #4   #5   Total
Detailed Writing Rules
                               79   77   75   73   70   75%
                               85   81   81   79   77   81%
                               90   85   84   81   77   84%
                               86   84   83   82   77   83%
                               84   81   81   81   76   81%
                               87   85   84   80   77   83%
Detailed Writing Rules                                  80.80%
genre
                               81   79   78   76   67   76%
                               86   84   82   81   80   83%
                               88   82   81   79   78   81%
                               84   81   80   78   74   79%
                               83   82   82   82   78   81%
                               87   83   81   78   69   79%
genre                                                   80.05%
Novelcrafter Default Prompt
                               81   79   77   76   73   77%
                               83   83   79   79   78   80%
                               88   87   86   85   83   86%
                               86   84   81   77   74   80%
                               88   84   84   83   82   84%
                               88   85   84   79   78   83%
Novelcrafter Default Prompt                             81.80%
Overall                                                 80.88%

Codex Extraction

Evaluates a model's ability to extract structured codex entries (characters, locations, objects, lore) from prose passages and return them as well-formed XML.
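
The exact XML schema expected by this test is not shown here. As a minimal sketch of the well-formedness requirement, the example below parses a model response with Python's standard `xml.etree.ElementTree` and lists the entries it finds. The `<codex>`, `<entry>`, and `<name>` element names and the `type` attribute are assumed for illustration.

```python
import xml.etree.ElementTree as ET


def parse_codex(xml_text: str):
    """Return (type, name) pairs, or None if the XML is not well-formed.

    The element and attribute names are hypothetical; the benchmark's
    real schema is not given on this page.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return None
    return [
        (entry.get("type"), entry.findtext("name"))
        for entry in root.iter("entry")
    ]
```

For example, `parse_codex('<codex><entry type="character"><name>Mira</name></entry></codex>')` yields one `("character", "Mira")` pair, while a response with an unclosed tag yields `None`.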

Scenario   #1   #2   #3   #4   #5   Total
           92   92   91   89   88   90%
           94   92   89   86   83   89%
           92   89   85   83   83   86%
           93   83   82   81   78   83%
Overall                             87.23%

Text Replacement

Tests deterministic text transformations: renaming characters/locations, expanding contractions, tense rewriting, POV shifts, gender swaps, combined transformations, and word avoidance. Scored by checking each expected change independently.
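
Scoring each expected change independently might look like the sketch below, which treats every required string and every forbidden string as a separate pass/fail check. The function name and the substring-based matching are assumptions for this example; the benchmark's real checker is not described here.

```python
def score_replacement(output: str, expected: list[str], forbidden: list[str]) -> float:
    """Score an output as the percentage of independent checks it passes.

    `expected` strings must appear in the output, and `forbidden` strings
    must not. Plain substring matching is an assumption of this sketch.
    """
    checks = [item in output for item in expected]
    checks += [item not in output for item in forbidden]
    if not checks:
        return 100.0
    return 100.0 * sum(checks) / len(checks)
```

Under this scheme a rename that is applied in some places but missed in others earns partial credit rather than failing outright, which would account for intermediate scores such as 67 or 38 in the table.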

Scenario          #1    #2    #3    #4    #5    #6    #7    Total
Generic Prompt
                  100   100   100   100   100   100   100   100%
                  100   100   100   100   100   100   100   100%
                  98    98    98    98    96    67    38    85%
                  100   99    99    99    99    99    99    99%
                  100   100   100   100   100   96    96    99%
                  85    85    78    78    78    78    74    79%
                  90    89    89    88    87    86    85    88%
                  100   100   100   100   94    73    40    87%
                  100   99    99    99    99    99    96    99%
Generic Prompt                                              92.81%
Specific Prompt
                  100   100   100   100   100   100   94    99%
                  100   100   100   100   100   100   100   100%
                  99    99    99    99    98    92    92    97%
                  100   100   100   100   100   100   100   100%
                  100   100   100   100   100   100   100   100%
                  100   100   100   100   100   100   100   100%
                  92    89    89    89    89    88    87    89%
                  100   100   100   100   100   100   100   100%
                  100   100   100   99    99    99    99    99%
Specific Prompt                                             98.29%
Overall                                                     95.55%

Tool usage within Novelcrafter

Evaluates output messages related to tool usage within Novelcrafter.

Scenario   #1    #2    #3    #4    #5    #6    #7    #8    #9    #10   Total
           100   100   100   100   100   100   100   100   100   33    93%
Overall                                                                93.33%
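
The 93.33% figure is consistent with an unweighted mean of the ten scenario scores, provided the last score is really 33.33 and is only displayed as 33. This is an inference from the numbers, not something the page states. A quick check under that assumption:

```python
def mean_score(scores: list[float]) -> float:
    """Return the unweighted mean of the scores, rounded to two decimals."""
    return round(sum(scores) / len(scores), 2)


# Nine perfect scenarios plus one assumed to have passed 1 of 3 checks.
tool_scores = [100.0] * 9 + [100.0 / 3]
tool_total = mean_score(tool_scores)
```

With these inputs `tool_total` matches the reported total to two decimal places.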