meta-llama/llama-3.1-70b-instruct

Llama 3.1 70B

Release Date

Jul 23rd, 2024

Context Size

128k

Reasoning

No

Benchmark Cost

$1.19

Speed

1021.7 tok/s

Categories

Creative Writing: 72.8%
Tooling: 88.6%
Language: 80.2%
Utility: 81.0%
Reasoning: 79.3%
Text Editing: 92.1%
Rule Following: 63.5%
Hallucination: 69.8%

Subcategories

AI-isms, Prose Variety, Dialogue, Purple Prose, Mechanical Style, Clichés, XML, Comprehension, Generation, Word Counting, Sentence Counting, Paragraph Counting, Structural Counting, Data Extraction, Deduction, Attention, Transformation, Preservation, Structural Integrity, Constraint Adherence, False Positives, Content Invention, Output Corruption

Bad Writing Habits

Detects common prose-quality anti-patterns in AI-generated creative writing, including passive voice, past progressive overuse, weak dialogue tags, filter words, purple prose, clichés, AI-ism words/adverbs/names, and more.
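
As a rough illustration of how one such check could work, the sketch below measures the share of filter words in a passage. The word list, the function name `filter_word_rate`, and the per-word rate are all assumptions made for this example; the benchmark's actual detectors and scoring are not described on this page.

```python
import re

# Hypothetical, deliberately tiny list of "filter words" (assumption for illustration).
FILTER_WORDS = {"saw", "heard", "felt", "noticed", "realized", "seemed"}


def filter_word_rate(text: str) -> float:
    """Return the fraction of words in `text` that appear in FILTER_WORDS."""
    # Lowercase the text and pull out runs of letters and apostrophes as words.
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for word in words if word in FILTER_WORDS)
    return hits / len(words)


sample = "She felt the cold wind and noticed the door was open."
rate = filter_word_rate(sample)  # 2 filter words out of 11 words
print(f"{rate:.3f}")
```

A real detector would likely need part-of-speech information to avoid false positives, since a plain word match cannot tell a filter verb from the same word used in another sense.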

Scenario                       #1   #2   #3   #4   #5   Total
Detailed Writing Rules
                               75   71   69   67   58     68%
                               81   75   75   65   63     72%
                               81   76   76   71   68     75%
                               80   78   76   73   65     74%
                               77   76   76   68   65     72%
                               77   76   75   75   72     75%
Detailed Writing Rules                                 72.62%
genre
                               71   66   65   63   58     65%
                               84   77   69   68   64     72%
                               88   74   74   72   67     75%
                               80   78   75   74   74     76%
                               77   77   74   73   73     75%
                               81   79   79   74   62     75%
genre                                                  73.01%
Novelcrafter Default Prompt
                               77   76   73   69   68     73%
                               76   71   71   67   65     70%
                               79   72   72   71   64     72%
                               83   80   79   75   62     76%
                               80   76   68   67   64     71%
                               79   77   72   69   65     72%
Novelcrafter Default Prompt                            72.27%
Overall                                                72.64%

Codex Extraction

Evaluates a model's ability to extract structured codex entries (characters, locations, objects, lore) from prose passages and return them as well-formed XML.
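
To make "well-formed XML" concrete, here is a minimal sketch of a well-formedness check using Python's standard `xml.etree.ElementTree` parser. The element and attribute names (`codex`, `entry`, `type`, `name`) and the helper `parse_codex` are hypothetical; the benchmark's real schema and grader are not shown on this page.

```python
import xml.etree.ElementTree as ET


def parse_codex(xml_text: str):
    """Return (type, name) pairs for each entry, or None if the XML is malformed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        # Unclosed or mismatched tags end up here: the output is not well-formed.
        return None
    # Hypothetical schema: <entry type="..."><name>...</name></entry>
    return [(entry.get("type"), entry.findtext("name")) for entry in root.iter("entry")]


good = (
    '<codex>'
    '<entry type="character"><name>Mira</name></entry>'
    '<entry type="location"><name>Old Mill</name></entry>'
    '</codex>'
)
bad = '<codex><entry type="character"><name>Mira</entry></codex>'  # <name> never closed

print(parse_codex(good))
print(parse_codex(bad))
```

Well-formedness is only the first gate in this sketch; checking that the extracted entries match the source passage would be a separate step.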

Scenario     #1   #2   #3   #4   #5   Total
             92   90   90   86   84     89%
             91   90   89   86   86     88%
             87   85   84   84    3     68%
             92   91   90   89   87     90%
Overall                              83.82%

Text Replacement

Tests deterministic text transformations: renaming characters/locations, expanding contractions, tense rewriting, POV shifts, gender swaps, combined transformations, and word avoidance. Scored by checking each expected change independently.
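
Scoring "each expected change independently" can be pictured as a list of separate pass/fail checks whose pass rate becomes the score. The sketch below is one possible reading, not the benchmark's actual grader: the function name `score_replacement`, the substring-based checks, and the sample sentence are all assumptions.

```python
def score_replacement(output: str, must_contain: list[str], must_not_contain: list[str]) -> float:
    """Return the percentage of independent checks that pass."""
    # Each expected new string is one check; each string that should be gone is another.
    checks = [expected in output for expected in must_contain]
    checks += [forbidden not in output for forbidden in must_not_contain]
    if not checks:
        return 100.0
    return 100.0 * sum(checks) / len(checks)


# Hypothetical task: rename Anna to Elena, rename Riverton to Brookfield,
# and expand contractions. The model output below missed the location rename.
output = "Elena did not see Mark leave Riverton."
score = score_replacement(
    output,
    must_contain=["Elena", "did not", "Brookfield"],
    must_not_contain=["Anna", "didn't", "Riverton"],
)
print(f"{score:.2f}%")  # 4 of 6 checks pass
```

Because every check is counted separately, one missed rename lowers the score without zeroing it, which would explain run scores such as 74 or 83 alongside many perfect 100s.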

Scenario           #1   #2   #3   #4   #5   #6   #7   Total
Generic Prompt
                  100  100  100  100  100  100  100    100%
                  100  100  100  100  100  100  100    100%
                   98   90   88   87   83   83   83     87%
                  100  100  100  100  100  100   99    100%
                  100  100  100  100  100  100  100    100%
                  100  100  100  100  100  100   74     96%
                   91   90   87   80   80   78   74     83%
                  100   99   99   97   95   91   91     96%
                   97   97   97   96   96   96   95     96%
Generic Prompt                                       95.41%
Specific Prompt
                  100  100  100  100  100   86   83     96%
                  100  100  100  100  100  100  100    100%
                   99   99   99   99   99   97   88     97%
                  100  100  100  100  100  100   99    100%
                  100  100  100  100  100  100  100    100%
                  100  100  100  100  100  100  100    100%
                   93   92   89   88   87   87   85     89%
                  100  100  100  100  100  100  100    100%
                  100  100  100  100  100  100  100    100%
Specific Prompt                                      97.92%
Overall                                              96.66%

Tool usage within Novelcrafter

Evaluates the output messages a model produces for tool usage within Novelcrafter.

Scenario     #1   #2   #3   #4   #5   #6   #7   #8   #9  #10   Total
            100  100  100  100  100  100  100  100   67    0     87%
Overall                                                       86.67%
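
The Total column appears to be the mean of the per-run scores, and the row above offers a quick check. The sketch assumes the flattened digits read as eight runs at 100, one at 67, and one at 0; that reading, and the mean-based total, are inferences rather than something the page states.

```python
# Assumed reading of the row: eight runs at 100, one at 67, one at 0.
runs = [100] * 8 + [67, 0]

# Mean of the displayed (rounded) per-run scores.
row_mean = sum(runs) / len(runs)
print(f"{row_mean:.1f}%")  # prints 86.7%
```

The result rounds to the 87% shown in the row. The page's more precise 86.67% would fit the displayed 67 being a rounded score of about 66.67, though that is a guess.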