deepseek/deepseek-chat

DeepSeek V3 (2024-12-26)

Release Date: Dec 26th, 2024
Context Size: 163.8k tokens
Reasoning: No
Benchmark Cost: $0.89
Speed: 26.9 tok/s

Categories

Creative Writing: 77.9%
Tooling: 100.0%
Language: 87.9%
Utility: 81.9%
Reasoning: 88.7%
Text Editing: 93.6%
Rule Following: 66.4%
Hallucination: 73.1%

Subcategories

(Chart of subcategory scores on a 20% to 100% axis; per-subcategory values are not recoverable.)
Subcategories shown: AI-isms, Prose Variety, Dialogue, Purple Prose, Mechanical Style, Clichés, XML, Comprehension, Generation, Word Counting, Sentence Counting, Paragraph Counting, Structural Counting, Data Extraction, Deduction, Attention, Transformation, Preservation, Structural Integrity, Constraint Adherence, False Positives, Content Invention, Output Corruption

Bad Writing Habits

Detects common prose quality anti-patterns in AI-generated creative writing, including passive voice, past progressive overuse, weak dialogue tags, filter words, purple prose, cliches, AI-ism words/adverbs/names, and more.
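As a rough illustration of what this kind of detection can look like, here is a minimal sketch that counts a few of the habits named above. The word lists, the regular expression, and the function name `count_habits` are all assumptions made for this example; the benchmark's actual rules and lists are not given here and are likely far more extensive.

```python
import re

# Hypothetical word lists for illustration only; not the benchmark's real lists.
FILTER_WORDS = {"saw", "felt", "heard", "noticed", "realized", "seemed"}
WEAK_DIALOGUE_TAGS = {"exclaimed", "interjected", "opined"}


def count_habits(text: str) -> dict[str, int]:
    """Count occurrences of a few illustrative prose anti-patterns."""
    lowered = text.lower()
    words = re.findall(r"[a-z']+", lowered)
    return {
        "filter_words": sum(w in FILTER_WORDS for w in words),
        "weak_dialogue_tags": sum(w in WEAK_DIALOGUE_TAGS for w in words),
        # Crude past-progressive check: "was"/"were" followed by an -ing word.
        "past_progressive": len(re.findall(r"\b(?:was|were)\s+\w+ing\b", lowered)),
    }


sample = "She felt the wind. He was walking home, she noticed."
print(count_habits(sample))
```

A real detector would also need to handle false positives (for example, "felt" used as a noun), which simple word matching like this cannot distinguish.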

Scenario                         #1   #2   #3   #4   #5    Total
Detailed Writing Rules
                                 82   81   80   74   71      78%
                                 87   79   79   73   73      78%
                                 87   85   80   78   76      81%
                                 78   77   77   73   68      75%
                                 85   80   79   79   75      80%
                                 83   82   77   76   76      79%
Detailed Writing Rules                                    78.30%
genre
                                 77   72   69   68   64      70%
                                 86   75   73   72   69      75%
                                 82   82   79   76   72      78%
                                 78   77   75   74   70      75%
                                 83   75   73   72   71      75%
                                 90   80   80   77   73      80%
genre                                                     75.50%
Novelcrafter Default Prompt
                                 82   76   74   74   70      75%
                                 88   79   79   77   70      78%
                                 82   78   78   73   71      76%
                                 83   81   80   79   72      79%
                                 86   85   80   79   71      80%
                                 86   78   76   76   76      78%
Novelcrafter Default Prompt                               77.97%
Overall                                                   77.26%

Codex Extraction

Evaluates a model's ability to extract structured codex entries (characters, locations, objects, lore) from prose passages and return them as well-formed XML.
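To make "well-formed XML" concrete, the following minimal sketch parses a model response with Python's standard `xml.etree.ElementTree` and rejects output that does not parse. The element and attribute names (`codex`, `entry`, `type`, `name`) and the function `parse_codex` are assumptions for illustration; the benchmark's actual schema is not described here.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def parse_codex(xml_text: str) -> Optional[list]:
    """Return (type, name) pairs, or None if the XML is not well-formed.

    The <entry type="..." name="..."/> shape is a hypothetical schema.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return None
    return [(e.get("type", ""), e.get("name", "")) for e in root.iter("entry")]


good = '<codex><entry type="character" name="Mira"/><entry type="location" name="Harbor"/></codex>'
bad = "<codex><entry></codex>"  # mismatched tags, so parsing fails

print(parse_codex(good))
print(parse_codex(bad))
```

Checking well-formedness first keeps a malformed response from being scored as a partial extraction.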

Scenario     #1   #2   #3   #4   #5    Total
             96   94   93   92   92      93%
             95   94   91   90   88      92%
             97   96   95   94   89      94%
             98   97   97   95   94      96%
Overall                               93.91%

Text Replacement

Tests deterministic text transformations: renaming characters/locations, expanding contractions, tense rewriting, POV shifts, gender swaps, combined transformations, and word avoidance. Scored by checking each expected change independently.
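Scoring "each expected change independently" can be sketched as a list of separate substring checks, where the score is the share of checks that pass. The function name `score_replacement` and the split into required and forbidden strings are assumptions for this example, not the benchmark's documented scorer.

```python
def score_replacement(
    output: str, must_contain: list[str], must_not_contain: list[str]
) -> float:
    """Percentage of checks passed; every check is evaluated on its own."""
    checks = [s in output for s in must_contain]
    checks += [s not in output for s in must_not_contain]
    if not checks:
        return 100.0
    return 100.0 * sum(checks) / len(checks)


# Hypothetical task: rename "Anna" to "Elena" and expand contractions.
full = score_replacement(
    "Elena walked to Port Varn. She did not look back.",
    must_contain=["Elena", "Port Varn", "did not"],
    must_not_contain=["Anna", "didn't"],
)
partial = score_replacement(
    "Elena didn't look back.",
    must_contain=["Elena", "did not"],
    must_not_contain=["didn't"],
)
print(full, partial)
```

Because each check stands alone, a response that misses one change still earns credit for the others, which would explain partial scores such as 74 or 56 in a results table.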

Scenario                         #1   #2   #3   #4   #5   #6   #7    Total
Generic Prompt
                                100  100  100  100  100  100  100     100%
                                100  100  100  100  100  100  100     100%
                                 98   98   98   98   98   97   96      97%
                                100  100  100  100  100  100  100     100%
                                100  100  100  100  100  100  100     100%
                                100  100   74   74   74   74   74      81%
                                 95   94   93   93   93   91   90      93%
                                100  100  100  100  100  100   99     100%
                                 96   95   95   95   88   81   81      90%
Generic Prompt                                                      95.71%
Specific Prompt
                                100  100  100  100  100  100   56      94%
                                100  100  100  100  100  100  100     100%
                                 99   99   99   99   99   99    0      85%
                                100  100  100  100  100  100  100     100%
                                100  100  100  100  100  100  100     100%
                                100  100  100  100  100  100  100     100%
                                 97   97   97   95   94   94   93      95%
                                100  100  100  100  100  100   99     100%
                                100  100  100  100  100  100   99     100%
Specific Prompt                                                     97.11%
Overall                                                             96.41%
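The figures in this section are consistent with simple averaging: each row total matches the rounded mean of its run scores, and the final 96.41% matches the mean of the two prompt-group scores (95.71% and 97.11%). The sketch below reproduces that arithmetic. It is an observation about the published numbers, not a confirmed description of how the site computes them, and the helper names are hypothetical.

```python
from statistics import mean


def row_total(runs: list[int]) -> int:
    """Rounded mean of the per-run scores in one table row."""
    return round(mean(runs))


def overall(group_scores: list[float]) -> float:
    """Mean of the group-level scores, rounded to two decimals."""
    return round(mean(group_scores), 2)


print(row_total([100, 100, 74, 74, 74, 74, 74]))  # 81, as in the table
print(row_total([99, 99, 99, 99, 99, 99, 0]))     # 85, as in the table
print(overall([95.71, 97.11]))
```

The group-level figures themselves appear to be computed from unrounded values, so they need not equal the mean of the rounded row totals.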

Tool usage within Novelcrafter

Evaluates output messages related to tool usage within Novelcrafter.

Scenario     #1   #2   #3   #4   #5   #6   #7   #8   #9  #10    Total
            100  100  100  100  100  100  100  100  100  100     100%
Overall                                                       100.00%