anthropic/claude-3.7-sonnet

Claude 3.7 Sonnet

Release Date

Feb 19th, 2025

Context Size

200k

Reasoning

No

Benchmark Cost

$18.20

Speed

87.2 tok/s

Categories

Creative Writing: 76.3%
Tooling: 99.3%
Language: 92.9%
Utility: 62.5%
Reasoning: 89.9%
Text Editing: 97.1%
Rule Following: 73.8%
Hallucination: 75.2%

Subcategories

(Chart of subcategory scores: AI-isms, Prose Variety, Dialogue, Purple Prose, Mechanical Style, Clichés, XML, Comprehension, Generation, Word Counting, Sentence Counting, Paragraph Counting, Structural Counting, Data Extraction, Deduction, Attention, Transformation, Preservation, Structural Integrity, Constraint Adherence, False Positives, Content Invention, Output Corruption)

Bad Writing Habits

Detects common prose quality anti-patterns in AI-generated creative writing, including passive voice, past progressive overuse, weak dialogue tags, filter words, purple prose, cliches, AI-ism words/adverbs/names, and more.
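
To make the idea of anti-pattern detection concrete, here is a minimal sketch that counts two of the habits named above (filter words and past progressive constructions). The word list, the regular expression, and the function name `count_habits` are assumptions for illustration only; the benchmark's actual rules and lists are not given here.

```python
import re

# Hypothetical filter-word list; the benchmark's real list is not published here.
FILTER_WORDS = {"saw", "heard", "felt", "noticed", "realized", "seemed"}

# Past progressive approximated as "was"/"were" followed by a word ending in -ing.
PAST_PROGRESSIVE = re.compile(r"\b(?:was|were)\s+\w+ing\b", re.IGNORECASE)


def count_habits(text: str) -> dict[str, int]:
    """Count occurrences of two prose anti-patterns in a passage."""
    words = re.findall(r"[a-z']+", text.lower())
    return {
        "filter_words": sum(1 for w in words if w in FILTER_WORDS),
        "past_progressive": len(PAST_PROGRESSIVE.findall(text)),
    }


sample = "She felt the cold and noticed that he was walking away."
print(count_habits(sample))  # {'filter_words': 2, 'past_progressive': 1}
```

A real detector would likely need part-of-speech information to avoid false matches (for example "was nothing"), so this should be read as a rough heuristic rather than the benchmark's method.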

Scenario #1 #2 #3 #4 #5 Total
Detailed Writing Rules
73 71 71 69 61 69%
81 79 78 74 72 77%
86 82 81 75 74 80%
82 82 79 77 77 80%
82 81 80 77 76 79%
82 82 77 75 74 78%
Detailed Writing Rules 77.02%
genre
75 74 73 73 66 72%
77 74 73 72 72 74%
78 77 76 74 74 76%
83 79 79 78 72 78%
85 79 78 77 76 79%
78 78 77 77 71 76%
genre 75.83%
Novelcrafter Default Prompt
74 74 71 70 68 71%
74 74 73 73 72 73%
83 82 81 80 76 81%
87 80 79 76 73 79%
83 82 81 79 78 81%
81 78 73 73 71 75%
Novelcrafter Default Prompt 76.61%
76.49%

Codex Extraction

Evaluates a model's ability to extract structured codex entries (characters, locations, objects, lore) from prose passages and return them as well-formed XML.
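
As a rough illustration of what "well-formed XML" means for a response like this, the sketch below parses a model's output with Python's standard `xml.etree.ElementTree` and returns `None` when parsing fails. The `<codex>`/`<entry>`/`<name>` schema, the `type` attribute, the sample entry, and the function name `parse_codex` are hypothetical; the benchmark's real schema is not shown here.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def parse_codex(xml_text: str) -> Optional[list[dict[str, str]]]:
    """Return codex entries as dicts, or None if the XML is not well-formed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return None
    # Assumed schema: <codex><entry type="..."><name>...</name></entry></codex>
    return [
        {"type": e.get("type", ""), "name": e.findtext("name", default="")}
        for e in root.iter("entry")
    ]


good = '<codex><entry type="character"><name>Mira</name></entry></codex>'
bad = "<codex><entry>"  # unclosed tags, so not well-formed

print(parse_codex(good))  # [{'type': 'character', 'name': 'Mira'}]
print(parse_codex(bad))   # None
```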

Scenario #1 #2 #3 #4 #5 Total
97 96 96 96 96 96%
96 96 96 95 95 96%
97 97 96 96 95 96%
99 99 99 98 98 98%
96.56%

Text Replacement

Tests deterministic text transformations: renaming characters/locations, expanding contractions, tense rewriting, POV shifts, gender swaps, combined transformations, and word avoidance. Scored by checking each expected change independently.
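
Scoring each expected change independently could look something like the sketch below, where every change is a separate pass/fail check and the score is the percentage of checks that pass. The `ExpectedChange` class, its fields, and the sample sentence are assumptions made for illustration; the benchmark's actual checker is not described in more detail here.

```python
from dataclasses import dataclass


@dataclass
class ExpectedChange:
    """One independent check (hypothetical structure)."""
    must_contain: str = ""      # text that should appear after the transformation
    must_not_contain: str = ""  # text that should no longer appear


def score(output: str, changes: list[ExpectedChange]) -> float:
    """Percentage of expected changes satisfied, each checked on its own."""
    if not changes:
        return 100.0
    passed = 0
    for c in changes:
        ok_present = c.must_contain in output if c.must_contain else True
        ok_absent = c.must_not_contain not in output if c.must_not_contain else True
        passed += ok_present and ok_absent
    return 100.0 * passed / len(changes)


output = "Elena did not open the door."
changes = [
    ExpectedChange(must_contain="Elena", must_not_contain="Anna"),      # rename
    ExpectedChange(must_contain="did not", must_not_contain="didn't"),  # contraction
    ExpectedChange(must_not_contain="suddenly"),                        # word avoidance
    ExpectedChange(must_contain="opens"),                               # tense (fails)
]
print(score(output, changes))  # 75.0
```

Because each change is counted separately, a response that misses one rename but gets everything else right loses only that one check instead of failing the whole scenario.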

Scenario #1 #2 #3 #4 #5 #6 #7 Total
Generic Prompt
100 100 100 100 100 100 100 100%
100 100 100 100 100 100 100 100%
99 99 99 99 99 99 99 99%
100 100 100 100 100 100 100 100%
100 100 100 100 100 100 100 100%
100 100 100 100 100 100 100 100%
94 93 93 93 93 91 91 93%
100 100 100 100 100 100 100 100%
99 99 99 99 99 99 99 99%
Generic Prompt 98.98%
Specific Prompt
100 100 100 100 100 100 100 100%
100 100 100 100 100 100 100 100%
99 99 99 99 99 99 99 99%
100 100 100 100 100 100 100 100%
100 100 100 100 100 100 100 100%
100 100 100 100 100 100 100 100%
94 93 93 93 93 93 92 93%
100 100 100 100 100 100 100 100%
100 100 100 100 100 100 100 100%
Specific Prompt 99.11%
99.05%

Tool usage within Novelcrafter

Evaluates the output messages a model produces for tool usage within Novelcrafter.

Scenario #1 #2 #3 #4 #5 #6 #7 #8 #9 #10 Total
100 100 100 100 100 100 100 100 100 100 100%
100.00%