qwen/qwen-2.5-72b-instruct

Qwen 2.5 72B

Release Date

Sep 19th, 2024

Context Size

131.1k

Reasoning

No

Benchmark Cost

$0.53

Speed

46.0 tok/s

Categories

Creative Writing: 75.2%
Tooling: 99.4%
Language: 69.0%
Utility: 76.4%
Reasoning: 83.4%
Text Editing: 89.2%
Rule Following: 31.6%
Hallucination: 79.6%

Subcategories

AI-isms, Prose Variety, Dialogue, Purple Prose, Mechanical Style, Clichés, XML, Comprehension, Generation, Word Counting, Sentence Counting, Paragraph Counting, Structural Counting, Data Extraction, Deduction, Attention, Transformation, Preservation, Structural Integrity, Constraint Adherence, False Positives, Content Invention, Output Corruption

Bad Writing Habits

Detects common prose quality anti-patterns in AI-generated creative writing, including passive voice, past progressive overuse, weak dialogue tags, filter words, purple prose, clichés, AI-ism words/adverbs/names, and more.

Scenario                         #1   #2   #3   #4   #5    Total
Detailed Writing Rules
                                 76   76   76   75   69      74%
                                 74   73   72   67   63      70%
                                 80   79   78   78   75      78%
                                 78   76   75   73   71      75%
                                 78   77   76   75   74      76%
                                 82   79   75   73   72      76%
Detailed Writing Rules                                    74.96%
genre
                                 74   73   70   70   64      70%
                                 82   77   69   69   67      73%
                                 83   80   78   73   73      78%
                                 79   79   75   71   71      75%
                                 77   77   75   75   74      76%
                                 82   78   75   72   72      76%
genre                                                     74.52%
Novelcrafter Default Prompt
                                 76   74   73   72   71      73%
                                 73   73   73   72   68      72%
                                 79   79   78   77   76      78%
                                 79   75   74   72   69      74%
                                 81   76   74   74   73      75%
                                 80   76   73   72   70      74%
Novelcrafter Default Prompt                               74.34%
                                                          74.61%
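
The final 74.61% figure is consistent with an unweighted mean of the three prompt-group scores (74.96%, 74.52%, 74.34%). The page does not state the formula, so the averaging rule in this Python sketch is an assumption inferred from the published numbers, and the function name is illustrative.

```python
# Group scores copied from the table above.
# Assumption: the section total is the plain mean of the group scores.
group_scores = {
    "Detailed Writing Rules": 74.96,
    "genre": 74.52,
    "Novelcrafter Default Prompt": 74.34,
}


def section_total(scores):
    """Return the unweighted mean of the group scores, rounded to two decimals."""
    return round(sum(scores.values()) / len(scores), 2)


print(section_total(group_scores))  # 74.61
```

The same rule reproduces the Text Replacement total: the mean of 96.40 and 98.66 is 97.53.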

Codex Extraction

Evaluates a model's ability to extract structured codex entries (characters, locations, objects, lore) from prose passages and return them as well-formed XML.

Scenario     #1   #2   #3   #4   #5    Total
             92   91   90   90   88      90%
             91   91   91   90   90      91%
             91   89   89   88   87      89%
             98   96   96   92   89      94%
                                      90.96%
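
The "well-formed XML" requirement in this test can be illustrated with a short Python sketch that uses the standard library parser. The tag and attribute names (`codex`, `entry`, `type`, `name`) and the helper `is_well_formed` are hypothetical; the page does not show the schema the benchmark actually expects.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the text parses as a single well-formed XML document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Hypothetical model outputs; the element names are illustrative only.
good = '<codex><entry type="character"><name>Mira</name></entry></codex>'
bad = '<codex><entry type="character"><name>Mira</entry></codex>'  # <name> never closed

print(is_well_formed(good))  # True
print(is_well_formed(bad))   # False
```

A well-formedness check like this only confirms that the output parses; whether the extracted entries match the prose would need a separate comparison.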

Text Replacement

Tests deterministic text transformations: renaming characters/locations, expanding contractions, tense rewriting, POV shifts, gender swaps, combined transformations, and word avoidance. Scored by checking each expected change independently.

Scenario             #1   #2   #3   #4   #5   #6   #7    Total
Generic Prompt
                    100  100  100  100  100  100   89      98%
                    100  100  100  100  100  100  100     100%
                     99   99   99   98   98   98   98      98%
                    100  100  100   99   99   99   99     100%
                    100  100   96   96   96   96   96      97%
                    100  100  100  100   74   74   74      89%
                     90   89   88   88   85   84   83      87%
                    100  100  100  100  100  100  100     100%
                    100   99   99   99   99   97   96      99%
Generic Prompt                                          96.40%
Specific Prompt
                    100  100  100  100  100  100  100     100%
                    100  100  100  100  100  100  100     100%
                     99   99   99   99   99   99   99      99%
                    100  100  100  100  100  100  100     100%
                    100  100  100  100  100  100  100     100%
                    100  100  100  100  100  100  100     100%
                     92   90   89   89   88   88   86      89%
                    100  100  100  100  100  100  100     100%
                    100  100  100  100  100  100  100     100%
Specific Prompt                                         98.66%
                                                        97.53%
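
The description says each expected change is checked independently. One way to picture that is the Python sketch below, which scores an output as the share of individual checks it passes. This is an assumed scoring scheme for illustration only: the function `score_replacement`, its parameters, and the substring-matching rule are hypothetical and are not taken from the benchmark.

```python
def score_replacement(output: str, must_contain: list[str], must_not_contain: list[str]) -> float:
    """Score an output as the percentage of independent checks that pass.

    Each required string and each forbidden string counts as one check.
    """
    checks = [s in output for s in must_contain]
    checks += [s not in output for s in must_not_contain]
    if not checks:
        return 100.0
    return 100.0 * sum(checks) / len(checks)


# Hypothetical task: rename "John" to "Marcus" and expand the contraction "didn't".
output = "Marcus didn't see the door."
score = score_replacement(
    output,
    must_contain=["Marcus", "did not"],
    must_not_contain=["John", "didn't"],
)
print(score)  # 50.0
```

Here the rename succeeds (two checks pass) while the contraction is left in place (two checks fail), so a partially correct output earns partial credit instead of failing outright.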

Tool usage within Novelcrafter

Evaluates the model's output messages related to tool usage within Novelcrafter.

Scenario     #1   #2   #3   #4   #5   #6   #7   #8   #9  #10    Total
            100  100  100  100  100  100  100  100  100  100     100%
                                                              100.00%