nousresearch/hermes-3-llama-3.1-405b

Hermes 3 405B

Release Date

Aug 15th, 2024

Context Size

128k

Reasoning

No

Benchmark Cost

$2.57

Speed

19.3 tok/s

Categories

Creative Writing 80.9%
Tooling 99.8%
Language 99.6%
Utility 69.0%
Reasoning 85.6%
Text Editing 89.1%
Rule Following 59.2%
Hallucination 79.7%

Subcategories

Subcategory chart, axis labels: AI-isms, Prose Variety, Dialogue, Purple Prose, Mechanical Style, Clichés, XML, Comprehension, Generation, Word Counting, Sentence Counting, Paragraph Counting, Structural Counting, Data Extraction, Deduction, Attention, Transformation, Preservation, Structural Integrity, Constraint Adherence, False Positives, Content Invention, Output Corruption

Bad Writing Habits

Detects common prose quality anti-patterns in AI-generated creative writing, including passive voice, past progressive overuse, weak dialogue tags, filter words, purple prose, cliches, AI-ism words/adverbs/names, and more.
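
As a rough illustration of what one such check could look like, the sketch below counts filter words per 100 words of prose. The word list and the function name `filter_word_rate` are assumptions made for this example; the benchmark's actual detectors and word lists are not described here.

```python
import re

# Hypothetical word list, chosen only for illustration.
FILTER_WORDS = {"saw", "heard", "felt", "noticed", "realized", "seemed"}


def filter_word_rate(text: str) -> float:
    """Return the number of filter words per 100 words (0.0 for empty text)."""
    # Lowercase the text and split it into simple word tokens.
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for word in words if word in FILTER_WORDS)
    return 100.0 * hits / len(words)
```

A lower rate suggests fewer filter words; a real detector would likely combine many such signals (passive voice, dialogue tags, clichés) rather than rely on one list.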

Scenario #1 #2 #3 #4 #5 Total
Detailed Writing Rules
87 77 76 73 72 77%
86 86 86 80 69 81%
93 86 86 84 76 85%
91 88 83 82 79 84%
88 87 87 81 76 84%
90 89 85 80 78 84%
Detailed Writing Rules 82.61%
genre
85 85 80 78 73 80%
85 78 77 71 69 76%
83 80 78 76 73 78%
88 86 82 78 76 82%
82 81 80 79 77 80%
88 83 83 76 75 81%
genre 79.52%
Novelcrafter Default Prompt
88 78 76 72 68 76%
87 86 83 80 75 82%
88 83 83 81 73 82%
88 85 84 79 74 82%
87 85 76 75 68 78%
79 78 76 73 70 75%
Novelcrafter Default Prompt 79.28%
Total 80.47%

Codex Extraction

Evaluates a model's ability to extract structured codex entries (characters, locations, objects, lore) from prose passages and return them as well-formed XML.
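
A minimal sketch of the well-formedness part of such a check, using Python's standard `xml.etree.ElementTree`. The `<codex>` root, the child element names, and the `name` attribute are assumed for illustration only; the schema the benchmark actually expects is not given here.

```python
import xml.etree.ElementTree as ET


def parse_codex(xml_text: str):
    """Return (tag, name) pairs for each entry, or None if the XML is not well-formed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        # Unclosed tags, stray text after the root, and similar errors land here.
        return None
    # Hypothetical layout: each direct child of the root is one codex entry.
    return [(child.tag, child.get("name")) for child in root]
```

For example, `parse_codex('<codex><character name="Mira"/></codex>')` yields one entry, while a response with an unclosed tag yields `None`. This only tests whether the output parses, not whether the extracted entries are correct.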

Scenario #1 #2 #3 #4 #5 Total
94 93 93 92 90 93%
94 92 92 89 89 91%
94 93 93 93 90 93%
99 99 99 98 98 99%
Total 93.74%

Text Replacement

Tests deterministic text transformations: renaming characters/locations, expanding contractions, tense rewriting, POV shifts, gender swaps, combined transformations, and word avoidance. Scored by checking each expected change independently.
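
Scoring each expected change independently could be sketched as follows. The `ExpectedChange` structure, the substring-based checks, and the percentage formula are assumptions for illustration, not the benchmark's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class ExpectedChange:
    """One expected edit (hypothetical structure)."""
    must_contain: str = ""      # text that should appear after the edit
    must_not_contain: str = ""  # text that should be gone after the edit


def score(output: str, changes: list[ExpectedChange]) -> float:
    """Return the percentage of expected changes satisfied by the output."""
    if not changes:
        return 0.0
    passed = 0
    for change in changes:
        ok = True
        if change.must_contain and change.must_contain not in output:
            ok = False
        if change.must_not_contain and change.must_not_contain in output:
            ok = False
        # Each change is judged on its own, so one miss does not fail the others.
        passed += 1 if ok else 0
    return 100.0 * passed / len(changes)
```

With this approach, an output that applies two of three requested changes earns partial credit instead of failing outright, which would fit the non-round scenario scores in the tables below.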

Scenario #1 #2 #3 #4 #5 #6 #7 Total
Generic Prompt
100 100 100 100 100 100 100 100%
100 100 100 100 100 100 100 100%
99 99 99 99 99 99 99 99%
100 100 100 100 100 100 100 100%
100 100 100 100 100 96 96 99%
100 100 100 100 100 100 100 100%
79 78 78 78 76 76 75 77%
100 100 100 100 100 100 68 95%
93 93 93 93 93 89 85 91%
Generic Prompt 95.74%
Specific Prompt
100 100 100 100 100 100 100 100%
100 100 100 100 100 100 33 90%
99 99 99 99 99 99 99 99%
100 100 100 100 100 100 100 100%
100 100 100 100 100 100 100 100%
100 100 100 100 100 100 100 100%
98 98 97 96 95 94 78 94%
100 100 100 100 100 100 100 100%
100 100 100 100 100 97 97 99%
Specific Prompt 98.00%
Total 96.87%

Tool usage within Novelcrafter

Evaluates output messages related to tool usage within Novelcrafter.

Scenario #1 #2 #3 #4 #5 #6 #7 #8 #9 #10 Total
100 100 100 100 100 100 100 100 100 100 100%
Total 100.00%