nousresearch/hermes-3-llama-3.1-70b

Hermes 3 70B

Release Date: Aug 15th, 2024
Context Size: 128k
Reasoning: No
Benchmark Cost: $1.08
Speed: 28.9 tok/s

Categories

Creative Writing: 77.4%
Tooling: 97.9%
Language: 81.7%
Utility: 61.1%
Reasoning: 79.1%
Text Editing: 63.3%
Rule Following: 53.0%
Hallucination: 67.1%

Subcategories

[Chart: scores per subcategory: AI-isms, Prose Variety, Dialogue, Purple Prose, Mechanical Style, Clichés, XML, Comprehension, Generation, Word Counting, Sentence Counting, Paragraph Counting, Structural Counting, Data Extraction, Deduction, Attention, Transformation, Preservation, Structural Integrity, Constraint Adherence, False Positives, Content Invention, Output Corruption]

Bad Writing Habits

Detects common prose quality anti-patterns in AI-generated creative writing, including passive voice, past progressive overuse, weak dialogue tags, filter words, purple prose, cliches, AI-ism words/adverbs/names, and more.
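To make the kind of check described above concrete, here is a minimal sketch of counting a few of these habits with simple heuristics. The word list, the regular expressions, and the function name `count_habits` are assumptions for illustration only; the benchmark's actual detectors are not described on this page and are likely more thorough.

```python
import re
from collections import Counter

# Hypothetical filter-word list, chosen for illustration only.
FILTER_WORDS = {"saw", "heard", "felt", "noticed", "realized", "seemed"}

# Past progressive heuristic: "was"/"were" followed by an -ing word.
# This is crude and will also match phrases such as "was nothing".
PAST_PROGRESSIVE = re.compile(r"\b(?:was|were)\s+\w+ing\b", re.IGNORECASE)

# Weak dialogue tag heuristic: "said" followed by an -ly adverb.
WEAK_TAG = re.compile(r"\bsaid\s+\w+ly\b", re.IGNORECASE)


def count_habits(text: str) -> Counter:
    """Count occurrences of a few prose anti-patterns in text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    counts["filter_words"] = sum(1 for w in words if w in FILTER_WORDS)
    counts["past_progressive"] = len(PAST_PROGRESSIVE.findall(text))
    counts["weak_dialogue_tags"] = len(WEAK_TAG.findall(text))
    return counts


sample = 'She was walking home when she noticed the light. "Go away," he said angrily.'
print(count_habits(sample))
```

A real detector would probably normalize the counts by passage length so that long and short outputs can be compared.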

Scenario                       #1   #2   #3   #4   #5   Total
Detailed Writing Rules
                               85   82   79   78   73     79%
                               81   80   80   73   70     77%
                               86   84   80   72   67     78%
                               95   91   82   80   75     85%
                               87   85   84   83   76     83%
                               85   85   84   78   76     82%
Detailed Writing Rules                                 80.54%
genre
                               81   73   72   71   68     73%
                               91   85   74   69   69     78%
                               79   79   75   73   71     75%
                               84   82   71   71   69     75%
                               81   80   80   74   65     76%
                               86   85   79   76   72     80%
genre                                                  76.14%
Novelcrafter Default Prompt
                               75   74   71   70   70     72%
                               89   85   85   71   65     79%
                               89   83   79   73   73     79%
                               81   80   79   75   75     78%
                               82   79   78   75   71     77%
                               84   79   78   76   72     78%
Novelcrafter Default Prompt                            77.12%
                                                       77.93%
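The page does not state how the totals are computed. The figures are consistent with each row's Total being the rounded mean of its five scenario scores, and with the section figure being the mean of the three group figures. The short check below illustrates this reading with numbers taken from the table; the function name `row_total` is hypothetical.

```python
def row_total(scores):
    """Rounded mean of a row's scenario scores (assumed aggregation)."""
    return round(sum(scores) / len(scores))


# First row under "Detailed Writing Rules": reported Total is 79%.
print(row_total([85, 82, 79, 78, 73]))

# Group figures for the three prompt groups: reported section figure is 77.93%.
groups = [80.54, 76.14, 77.12]
print(round(sum(groups) / len(groups), 2))
```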

Codex Extraction

Evaluates a model's ability to extract structured codex entries (characters, locations, objects, lore) from prose passages and return them as well-formed XML.
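As an illustration of the "well-formed XML" requirement, the sketch below parses a model response with Python's standard `xml.etree.ElementTree` and rejects it if parsing fails. The element and attribute names (`codex`, `entry`, `type`, `name`) are assumptions for this example; the schema the benchmark actually expects is not shown on this page.

```python
import xml.etree.ElementTree as ET


def parse_codex(xml_text: str):
    """Return (type, name) pairs, or None if the XML is not well formed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return None
    # Hypothetical schema: <entry type="..."><name>...</name></entry>
    return [(e.get("type"), e.findtext("name")) for e in root.iter("entry")]


good = (
    '<codex>'
    '<entry type="character"><name>Mira</name></entry>'
    '<entry type="location"><name>Old Harbor</name></entry>'
    '</codex>'
)
# The <name> element is never closed, so parsing fails.
bad = '<codex><entry type="character"><name>Mira</entry></codex>'

print(parse_codex(good))
print(parse_codex(bad))
```

Well-formedness is only the first gate; a full evaluation would presumably also compare the extracted entries against the ones expected for the passage.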

Scenario     #1   #2   #3   #4   #5   Total
             92   90   90   90   81     89%
             90   89   88   87   86     88%
             89   85   84   84   75     83%
             97   95   92   89   85     92%
                                     87.92%

Text Replacement

Tests deterministic text transformations: renaming characters/locations, expanding contractions, tense rewriting, POV shifts, gender swaps, combined transformations, and word avoidance. Scored by checking each expected change independently.
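Scoring each expected change independently can be sketched as follows: every change is checked on its own, and the score is the percentage of changes that were applied. The `ExpectedChange` type, the substring-based check, and the sample data are assumptions for illustration, not the benchmark's actual scorer.

```python
from dataclasses import dataclass


@dataclass
class ExpectedChange:
    old: str  # text that should no longer appear in the output
    new: str  # text that should appear in its place


def score_replacement(output: str, changes: list[ExpectedChange]) -> float:
    """Percentage of expected changes that the output satisfies."""
    if not changes:
        return 100.0
    # Each change passes or fails on its own, regardless of the others.
    passed = sum(1 for c in changes if c.new in output and c.old not in output)
    return 100.0 * passed / len(changes)


changes = [
    ExpectedChange("Anna", "Elena"),        # character rename
    ExpectedChange("Riverton", "Ashford"),  # location rename
    ExpectedChange("didn't", "did not"),    # contraction expansion
]
output = "Elena left Ashford because she didn't like the rain."
print(round(score_replacement(output, changes), 1))  # 66.7
```

Under this scheme a model that applies most of the changes still receives partial credit, which matches the fractional scores (for example 74 or 96) in the table.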

Scenario           #1   #2   #3   #4   #5   #6   #7   Total
Generic Prompt
                  100  100  100  100  100  100  100    100%
                  100  100  100  100  100  100  100    100%
                   99   99   99   99   99   99   98     99%
                  100  100  100  100  100  100  100    100%
                  100  100  100  100   96   96   96     98%
                  100  100   74   74   74   74   74     81%
                   89   82   81   81   79   78   59     78%
                  100  100  100  100   99   99   87     98%
                   97   96   96   96   93   93   92     95%
Generic Prompt                                       94.30%
Specific Prompt
                  100  100  100  100  100  100   36     91%
                  100  100  100  100  100    0    0     71%
                   90    5    0    0    0    0    0     14%
                  100    0    0    0    0    0    0     14%
                  100  100  100   54    0    0    0     51%
                  100  100  100   52    0    0    0     50%
                   92   91    0    0    0    0    0     26%
                  100  100   33   33   33    2    0     43%
                  100  100  100    0    0    0    0     43%
Specific Prompt                                      44.79%
                                                     69.55%

Tool usage within Novelcrafter

Checks output messages related to tool usage within Novelcrafter.

Scenario     #1   #2   #3   #4   #5   #6   #7   #8   #9  #10   Total
            100  100  100  100  100  100  100  100  100  100    100%
                                                             100.00%