mistral-large-2407

Mistral Large 2
via mistral

Release Date

Jul 24th, 2024

Context Size

128k

Reasoning

No

Benchmark Cost

$8.79

Categories

Creative Writing: 81.9%
Tooling: 99.8%
Language: 85.2%
Utility: 69.2%
Reasoning: 88.2%
Text Editing: 94.2%
Rule Following: 63.1%
Hallucination: 77.9%

Subcategories

AI-isms, Prose Variety, Dialogue, Purple Prose, Mechanical Style, Clichés, XML, Comprehension, Generation, Word Counting, Sentence Counting, Paragraph Counting, Structural Counting, Data Extraction, Deduction, Attention, Transformation, Preservation, Structural Integrity, Constraint Adherence, False Positives, Content Invention, Output Corruption

Bad Writing Habits

Detects common prose quality anti-patterns in AI-generated creative writing, including passive voice, past progressive overuse, weak dialogue tags, filter words, purple prose, cliches, AI-ism words/adverbs/names, and more.
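
As a rough illustration of what detecting such anti-patterns can look like, the sketch below counts hits against two small word lists. The lists, the function name `count_hits`, and the tokenization are assumptions made for this example; the benchmark's actual rules and vocabularies are not shown on this page.

```python
import re

# Hypothetical word lists, chosen for illustration only.
FILTER_WORDS = {"saw", "felt", "heard", "noticed", "realized"}
WEAK_DIALOGUE_TAGS = {"exclaimed", "interjected", "opined"}


def count_hits(text: str, vocabulary: set[str]) -> int:
    """Count how many words in the text belong to the given vocabulary."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(1 for word in words if word in vocabulary)


sample = 'She felt the cold and noticed the door. "Stop," he exclaimed.'
print(count_hits(sample, FILTER_WORDS))        # 2
print(count_hits(sample, WEAK_DIALOGUE_TAGS))  # 1
```

A real detector would likely also need phrase-level and grammatical checks (for example for passive voice), which simple word matching cannot capture.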

Scenario                      #1   #2   #3   #4   #5   Total

Detailed Writing Rules
                              86   77   77   75   74   78%
                              86   84   84   82   75   82%
                              89   86   86   86   82   86%
                              89   84   83   81   73   82%
                              86   83   82   79   77   81%
                              83   81   80   80   76   80%
Detailed Writing Rules: 81.52%

genre
                              83   81   76   75   74   78%
                              87   81   80   79   76   81%
                              89   85   80   79   78   82%
                              89   84   79   78   76   81%
                              83   82   81   80   74   80%
                              85   84   79   75   69   79%
genre: 79.95%

Novelcrafter Default Prompt
                              85   79   76   76   75   78%
                              82   81   80   74   73   78%
                              91   89   89   87   83   88%
                              85   84   84   84   83   84%
                              85   85   83   83   80   83%
                              85   84   81   79   77   81%
Novelcrafter Default Prompt: 82.15%

Overall: 81.21%

Codex Extraction

Evaluates a model's ability to extract structured codex entries (characters, locations, objects, lore) from prose passages and return them as well-formed XML.
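
A minimal sketch of how well-formedness of such output could be checked, using Python's standard `xml.etree.ElementTree` parser. The element and attribute names (`codex`, `entry`, `type`, `name`) and the function `parse_codex` are assumptions for illustration; the page does not show the schema the benchmark expects.

```python
import xml.etree.ElementTree as ET


def parse_codex(xml_text: str) -> list[dict[str, str]]:
    """Parse model output into entries.

    Raises ET.ParseError if the XML is not well-formed.
    The tag names used here are hypothetical.
    """
    root = ET.fromstring(xml_text.strip())
    entries = []
    for entry in root.iter("entry"):
        entries.append({
            "type": entry.get("type", ""),
            "name": (entry.findtext("name") or "").strip(),
        })
    return entries


sample = """
<codex>
  <entry type="character"><name>Mira</name></entry>
  <entry type="location"><name>Harbor Town</name></entry>
</codex>
"""
print(parse_codex(sample))
```

Output with an unclosed tag, such as `<codex><entry></codex>`, makes `ET.fromstring` raise `ET.ParseError`, which a harness could treat as a failed extraction.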

Scenario                      #1   #2   #3   #4   #5   Total
                              96   96   96   96   93   96%
                              98   98   97   95   94   96%
                              92   91   91   91   89   91%
                              96   95   95   95   95   95%

Overall: 94.55%

Text Replacement

Tests deterministic text transformations: renaming characters/locations, expanding contractions, tense rewriting, POV shifts, gender swaps, combined transformations, and word avoidance. Scored by checking each expected change independently.
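
Scoring each expected change independently could look roughly like the sketch below. The `ExpectedChange` structure, the `score_replacement` function, and the substring-based pass condition are assumptions for illustration, not the benchmark's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class ExpectedChange:
    """One change the output should contain (hypothetical structure)."""
    old: str  # text that should no longer appear
    new: str  # text that should appear instead


def score_replacement(output: str, changes: list[ExpectedChange]) -> float:
    """Return the percentage of expected changes satisfied, each checked on its own."""
    if not changes:
        return 100.0
    passed = 0
    for change in changes:
        # A change passes only if the new text is present and the old text is gone.
        if change.new in output and change.old not in output:
            passed += 1
    return 100.0 * passed / len(changes)


changes = [
    ExpectedChange(old="Alice", new="Elena"),
    ExpectedChange(old="didn't", new="did not"),
]
output = "Elena didn't answer."
print(score_replacement(output, changes))  # 50.0
```

Checking changes one by one gives partial credit: here the rename succeeded but the contraction was left unexpanded, so the output earns half the points instead of failing outright.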

Scenario                      #1    #2    #3    #4    #5    #6    #7    Total

Generic Prompt
                              100   100   100   100   100   100   100   100%
                              100   100   100   100   100   100   100   100%
                               98    98    98    98    98    98    98    98%
                              100   100   100   100   100   100   100   100%
                              100   100   100   100   100   100   100   100%
                               74    74    74    74    74    74    74    74%
                               92    92    91    91    91    91    91    91%
                              100   100   100   100   100   100   100   100%
                               99    99    99    99    99    99    99    99%
Generic Prompt: 95.75%

Specific Prompt
                              100   100   100   100   100   100   100   100%
                              100   100   100   100   100   100   100   100%
                               99    99    99    99    99    99    99    99%
                              100   100   100   100   100   100   100   100%
                              100   100   100   100   100   100   100   100%
                              100   100   100   100    99    99    99   100%
                               96    96    96    96    96    95    95    96%
                              100   100   100   100   100   100   100   100%
                              100   100   100   100   100   100   100   100%
Specific Prompt: 99.41%

Overall: 97.58%

Tool usage within Novelcrafter

Checks output messages related to tool usage within Novelcrafter.

Scenario     #1    #2    #3    #4    #5    #6    #7    #8    #9    #10   Total
             100   100   100   100   100   100   100   100   100   100   100%

Overall: 100.00%