meta-llama/llama-3.1-8b-instruct

Llama 3.1 8B

Release Date: Jul 23rd, 2024
Context Size: 128k
Reasoning: No
Benchmark Cost: $0.17
Speed: 1795.6 tok/s

Categories

Creative Writing: 76.5%
Tooling: 69.5%
Language: 64.1%
Utility: 74.8%
Reasoning: 69.1%
Text Editing: 75.4%
Rule Following: 34.0%
Hallucination: 43.4%

Subcategories

[Chart; per-subcategory values not recoverable. Subcategories shown: AI-isms, Prose Variety, Dialogue, Purple Prose, Mechanical Style, Clichés, XML, Comprehension, Generation, Word Counting, Sentence Counting, Paragraph Counting, Structural Counting, Data Extraction, Deduction, Attention, Transformation, Preservation, Structural Integrity, Constraint Adherence, False Positives, Content Invention, Output Corruption]

Bad Writing Habits

Detects common prose quality anti-patterns in AI-generated creative writing, including passive voice, past progressive overuse, weak dialogue tags, filter words, purple prose, cliches, AI-ism words/adverbs/names, and more.

Scenario #1 #2 #3 #4 #5 Total
Detailed Writing Rules
81 79 71 65 64 72%
77 69 68 67 65 69%
85 80 79 79 77 80%
79 76 74 72 70 74%
80 79 78 72 69 76%
79 79 78 73 63 74%
Detailed Writing Rules: 74.21%
genre
81 76 74 72 63 73%
79 76 75 69 66 73%
86 80 73 72 70 76%
82 77 70 69 68 73%
81 78 78 77 72 77%
84 82 81 77 72 79%
genre: 75.31%
Novelcrafter Default Prompt
80 78 76 71 66 74%
87 84 83 80 69 81%
87 84 81 76 72 80%
75 75 75 74 73 74%
89 82 74 70 68 77%
86 76 75 70 69 75%
Novelcrafter Default Prompt: 76.82%
Overall: 75.45%

Codex Extraction

Evaluates a model's ability to extract structured codex entries (characters, locations, objects, lore) from prose passages and return them as well-formed XML.
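
As a rough illustration of what "well-formed XML" means for this kind of output, the sketch below parses a response and returns nothing usable when the markup is broken. The element names (`codex`, `entry`, `name`) and the `type` attribute are hypothetical, chosen only for the example; the benchmark's actual schema and scoring are not described here.

```python
import xml.etree.ElementTree as ET

def parse_codex(xml_text):
    """Return (type, name) pairs, or None if the XML is not well-formed.

    The <codex>/<entry>/<name> layout is an assumed, illustrative schema.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return None
    return [(entry.get("type"), entry.findtext("name")) for entry in root.iter("entry")]

good = '<codex><entry type="character"><name>Mara</name></entry></codex>'
bad = '<codex><entry type="character"><name>Mara</entry></codex>'  # <name> never closed

print(parse_codex(good))  # [('character', 'Mara')]
print(parse_codex(bad))   # None
```

A single unclosed tag makes the whole response unparseable, which may be one reason an extraction score can drop sharply on an individual scenario.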

Scenario #1 #2 #3 #4 #5 Total
85 83 82 81 32 73%
84 81 77 76 71 78%
79 69 68 64 3 57%
85 80 79 75 74 79%
Overall: 71.51%

Text Replacement

Tests deterministic text transformations: renaming characters/locations, expanding contractions, tense rewriting, POV shifts, gender swaps, combined transformations, and word avoidance. Scored by checking each expected change independently.
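
One way to picture "checking each expected change independently" is a scorer that treats every required or forbidden string as its own pass/fail check. This is only a minimal sketch under that assumption; the function name `score_replacement` and the substring-based checks are hypothetical and not taken from the benchmark itself.

```python
def score_replacement(output, must_contain, must_not_contain):
    """Score an output as the percentage of independent checks that pass.

    Each expected string and each forbidden string is one check (assumed scheme).
    """
    checks = [s in output for s in must_contain]
    checks += [s not in output for s in must_not_contain]
    return 100 * sum(checks) / len(checks)

# Task: rename "Anna" to "Elena" and expand "didn't" to "did not".
# The output below missed one occurrence of the old name.
output = "Elena did not see Anna leave."
print(score_replacement(output, ["Elena", "did not"], ["Anna", "didn't"]))  # 75.0
```

Under a scheme like this, a partially correct transformation earns partial credit instead of failing outright.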

Scenario #1 #2 #3 #4 #5 #6 #7 Total
Generic Prompt
100 89 89 89 89 78 75 87%
100 100 100 100 100 100 100 100%
92 91 87 86 82 82 77 85%
100 98 98 98 97 97 97 98%
100 100 100 100 100 100 92 99%
86 74 74 74 74 74 73 75%
79 71 71 68 57 54 48 64%
100 100 100 99 99 98 95 99%
99 99 96 95 95 94 94 96%
Generic Prompt: 89.22%
Specific Prompt
89 86 78 67 67 64 53 72%
100 100 100 100 100 100 100 100%
99 99 99 99 99 99 95 98%
99 99 99 99 99 98 98 99%
100 100 100 100 100 100 100 100%
100 100 100 100 100 99 97 100%
86 81 80 77 74 49 46 70%
100 100 100 100 100 100 100 100%
100 100 100 100 100 100 100 100%
Specific Prompt: 93.23%
Overall: 91.22%

Tool usage within Novelcrafter

Evaluates output messages related to tool usage within Novelcrafter.

Scenario #1 #2 #3 #4 #5 #6 #7 #8 #9 #10 Total
100 100 100 100 100 100 0 0 0 0 60%
Overall: 60.00%
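
The totals on this page appear consistent with a plain mean of the per-scenario scores. Assuming that, the tool usage figure can be reproduced from six scenarios at 100 and four at 0. The helper name `total` is illustrative only.

```python
def total(scores):
    """Mean of per-scenario scores (assumed to be how the Total column is derived)."""
    return sum(scores) / len(scores)

tool_usage = [100] * 6 + [0] * 4
print(f"{total(tool_usage):.2f}%")  # 60.00%
```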