inception/mercury

Inception Mercury

Release Date: Jun 25th, 2025
Context Size: 128k
Reasoning: No
Benchmark Cost: $1.23
Speed: 165.8 tok/s

Categories

Creative Writing: 70.0%
Tooling: 98.0%
Language: 80.4%
Utility: 87.4%
Reasoning: 86.0%
Text Editing: 79.5%
Rule Following: 39.7%
Hallucination: 95.1%

Subcategories

[Chart: subcategory scores on a 20% to 100% axis. Subcategories shown: AI-isms, Prose Variety, Dialogue, Purple Prose, Mechanical Style, Clichés, XML, Comprehension, Generation, Word Counting, Sentence Counting, Paragraph Counting, Structural Counting, Data Extraction, Deduction, Attention, Transformation, Preservation, Structural Integrity, Constraint Adherence, False Positives, Content Invention, Output Corruption]

Bad Writing Habits

Detects common prose quality anti-patterns in AI-generated creative writing, including passive voice, past progressive overuse, weak dialogue tags, filter words, purple prose, cliches, AI-ism words/adverbs/names, and more.
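
As a rough illustration of what detecting one of these anti-patterns could look like, the sketch below counts filter words per 100 words of prose. The word list and the `filter_word_rate` function are hypothetical and are not taken from the benchmark, whose actual detection rules and scoring are not described on this page.

```python
import re

# Hypothetical filter-word list, chosen only for illustration.
FILTER_WORDS = {"saw", "heard", "felt", "noticed", "realized", "seemed"}

def filter_word_rate(text):
    """Return the number of filter words per 100 words of text."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for word in words if word in FILTER_WORDS)
    return 100.0 * hits / len(words)

sample = "She felt the wind and noticed the door was open."
rate = filter_word_rate(sample)
print(f"{rate:.1f} filter words per 100 words")
```

A real detector would likely need lemmatization and context checks, since a word such as "felt" is not always used as a filter word.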

Scenario                    #1   #2   #3   #4   #5   Total
Detailed Writing Rules
                            68   68   68   65   65   67%
                            67   65   62   61   61   63%
                            74   73   67   67   64   69%
                            75   72   69   66   60   68%
                            77   74   70   68   65   71%
                            76   71   71   69   67   71%
Detailed Writing Rules                               68.11%
genre
                            63   63   58   56   55   59%
                            67   62   59   55   55   60%
                            74   72   72   66   66   70%
                            69   68   65   63   61   65%
                            74   71   66   62   61   67%
                            71   71   71   66   64   69%
genre                                                64.87%
Novelcrafter Default Prompt
                            68   67   66   64   63   66%
                            76   66   64   64   61   66%
                            79   76   75   75   66   74%
                            79   78   75   73   59   73%
                            77   74   73   70   66   72%
                            71   71   66   64   61   67%
Novelcrafter Default Prompt                          69.53%
                                                     67.50%
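
The page does not say how the final 67.50% figure is derived. The reported numbers are consistent with an unweighted mean of the three prompt-group figures (68.11%, 64.87%, 69.53%), which the short check below reproduces. The function name `unweighted_mean` is introduced here for illustration, and the aggregation rule itself is an assumption.

```python
# Prompt-group figures as reported for Bad Writing Habits (percent).
group_scores = {
    "Detailed Writing Rules": 68.11,
    "genre": 64.87,
    "Novelcrafter Default Prompt": 69.53,
}

def unweighted_mean(values):
    """Plain arithmetic mean; every group counts equally."""
    values = list(values)
    return sum(values) / len(values)

overall = unweighted_mean(group_scores.values())
print(f"{overall:.2f}%")  # prints 67.50%
```

The same check works for Text Replacement, where the mean of 92.54% and 91.20% is 91.87%.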

Codex Extraction

Evaluates a model's ability to extract structured codex entries (characters, locations, objects, lore) from prose passages and return them as well-formed XML.

Scenario   #1   #2   #3   #4   #5   Total
           88   86   83   81   81   84%
           90   90   84   83   80   85%
           87   86   85   84   75   83%
           90   89   87   82   81   86%
                                    84.53%
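
Because this test requires well-formed XML, a first validation step could be as simple as attempting to parse the model output. The sketch below uses Python's standard `xml.etree.ElementTree`. The element and attribute names (`codex`, `entry`, `type`, `name`) are hypothetical, since the benchmark's actual schema is not shown on this page.

```python
import xml.etree.ElementTree as ET

def parse_codex(xml_text):
    """Return (type, name) pairs, or None if the XML is not well-formed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return None
    # Hypothetical schema: <codex><entry type="..."><name>...</name></entry></codex>
    return [(entry.get("type"), entry.findtext("name")) for entry in root.iter("entry")]

well_formed = '<codex><entry type="character"><name>Mira</name></entry></codex>'
malformed = '<codex><entry type="character"><name>Mira</entry></codex>'

print(parse_codex(well_formed))
print(parse_codex(malformed))
```

Well-formedness is only a precondition. Whether the extracted entries are correct would still have to be judged against the source passage.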

Text Replacement

Tests deterministic text transformations: renaming characters/locations, expanding contractions, tense rewriting, POV shifts, gender swaps, combined transformations, and word avoidance. Scored by checking each expected change independently.

Scenario          #1    #2    #3    #4    #5    #6    #7    Total
Generic Prompt
                  100   100   100   100   89    78    78    92%
                  100   100   100   100   100   100   100   100%
                  96    95    95    95    94    94    91    94%
                  99    99    98    98    98    97    96    98%
                  100   100   100   100   100   100   100   100%
                  100   95    95    95    94    92    81    93%
                  79    78    77    76    75    67    60    73%
                  100   100   100   100   100   100   98    100%
                  92    87    84    81    81    80    72    82%
Generic Prompt                                              92.54%
Specific Prompt
                  100   100   100   100   100   89    78    95%
                  100   100   100   100   100   100   100   100%
                  94    92    92    92    86    82    80    88%
                  98    98    98    98    96    96    95    97%
                  100   100   100   100   100   100   100   100%
                  93    93    91    90    87    78    26    80%
                  95    95    88    88    52    51    40    73%
                  100   100   100   100   100   100   99    100%
                  93    91    90    88    85    85    82    88%
Specific Prompt                                             91.20%
                                                            91.87%
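
The description above says each expected change is checked independently. One way to picture that is a scorer that treats every required replacement and every forbidden leftover as its own pass/fail check and reports the share that passed. This is a minimal sketch under that assumption. The function `score_replacement` and the example strings are hypothetical, not the benchmark's implementation.

```python
def score_replacement(output, must_contain, must_not_contain):
    """Score an output as the percentage of independent checks it passes."""
    checks = [expected in output for expected in must_contain]
    checks += [forbidden not in output for forbidden in must_not_contain]
    if not checks:
        return 100.0
    return 100.0 * sum(checks) / len(checks)

# Task: rename John to Marcus, Oldtown to Newhaven, and expand contractions.
output = "Marcus did not go to Oldtown."
score = score_replacement(
    output,
    must_contain=["Marcus", "did not", "Newhaven"],
    must_not_contain=["John", "didn't", "Oldtown"],
)
print(f"{score:.2f}%")
```

Here the location rename was missed, so two of the six checks fail while the other four still earn credit. This explains how a partially correct output can land between 0% and 100%.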

Tool usage within Novelcrafter

Evaluates output messages related to tool usage within Novelcrafter.

Scenario   #1    #2    #3    #4    #5    #6    #7    #8    #9    #10   Total
           100   100   100   100   100   100   100   100   100   67    97%
                                                                       96.67%