google/gemma-3-4b-it

Gemma 3 4B

Release Date

Mar 12th, 2025

Context Size

128k

Reasoning

No

Benchmark Cost

$0.10

Speed

76.2 tok/s

Categories

Creative Writing 72.1%
Tooling 97.9%
Language 72.3%
Utility 60.3%
Reasoning 73.6%
Text Editing 78.4%
Rule Following 26.4%
Hallucination 67.6%

Subcategories

AI-isms, Prose Variety, Dialogue, Purple Prose, Mechanical Style, Clichés, XML, Comprehension, Generation, Word Counting, Sentence Counting, Paragraph Counting, Structural Counting, Data Extraction, Deduction, Attention, Transformation, Preservation, Structural Integrity, Constraint Adherence, False Positives, Content Invention, Output Corruption

Bad Writing Habits

Detects common prose-quality anti-patterns in AI-generated creative writing, including passive voice, past progressive overuse, weak dialogue tags, filter words, purple prose, clichés, AI-ism words/adverbs/names, and more.
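As a rough illustration of one such check, the sketch below measures how often filter words appear in a passage. The word list and the function name `filter_word_rate` are assumptions made for this example; the benchmark's actual detectors and word lists are not described here.

```python
import re

# Hypothetical word list for illustration only; not the benchmark's list.
FILTER_WORDS = {"saw", "heard", "felt", "noticed", "realized", "seemed"}


def filter_word_rate(text: str) -> float:
    """Return the fraction of words in `text` that are filter words."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for word in words if word in FILTER_WORDS)
    return hits / len(words)


sample = "She saw the door open. She felt the cold air."
rate = filter_word_rate(sample)  # 2 filter words out of 10 words
```

A real detector would likely need lemmatization and context (for example, "felt" as a noun), which this sketch deliberately ignores.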

Scenario #1 #2 #3 #4 #5 Total
Detailed Writing Rules
79 74 71 71 71 73%
79 72 71 69 66 71%
78 73 71 69 64 71%
76 74 74 73 70 73%
74 73 72 70 70 72%
81 80 77 71 70 76%
Detailed Writing Rules 72.79%
genre
78 69 67 67 67 70%
77 72 71 70 68 72%
76 75 71 70 68 72%
76 75 74 72 70 73%
72 72 70 70 63 69%
80 80 79 76 74 78%
genre 72.28%
Novelcrafter Default Prompt
78 77 75 73 72 75%
76 75 73 71 64 72%
75 74 74 71 69 72%
77 77 75 74 67 74%
74 74 72 67 66 70%
80 76 75 75 75 76%
Novelcrafter Default Prompt 73.35%
Overall 72.81%
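
The figures in this section are consistent with the section score being the plain mean of the three prompt-group scores. This is an inference from the numbers, not something the page states, so treat the sketch below as a sanity check rather than the benchmark's documented formula.

```python
from statistics import mean

# Group scores as reported in this section.
group_scores = {
    "Detailed Writing Rules": 72.79,
    "genre": 72.28,
    "Novelcrafter Default Prompt": 73.35,
}

# Assumption: the section score is the unweighted mean of the group scores.
section_score = round(mean(group_scores.values()), 2)
print(f"{section_score:.2f}%")
```

Under that assumption the result matches the reported 72.81%.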

Codex Extraction

Evaluates a model's ability to extract structured codex entries (characters, locations, objects, lore) from prose passages and return them as well-formed XML.
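
A minimal sketch of the well-formedness part of such a check, using Python's standard XML parser. The element and attribute names (`codex`, `entry`, `type`, `name`) are hypothetical; the schema the benchmark actually expects is not shown here.

```python
import xml.etree.ElementTree as ET


def parse_codex(xml_text: str):
    """Return (type, name) pairs, or None if the XML is not well-formed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return None
    # Hypothetical schema: <entry type="..."><name>...</name></entry>
    return [(entry.get("type"), entry.findtext("name")) for entry in root.iter("entry")]


good = (
    '<codex>'
    '<entry type="character"><name>Mira</name></entry>'
    '<entry type="location"><name>Old Harbor</name></entry>'
    '</codex>'
)
bad = '<codex><entry type="character"><name>Mira</entry></codex>'  # unclosed <name>

entries = parse_codex(good)
broken = parse_codex(bad)
```

Here `entries` holds the two extracted pairs, while `broken` is `None` because the second document has a mismatched closing tag.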

Scenario #1 #2 #3 #4 #5 Total
89 87 86 86 84 87%
83 82 81 80 79 81%
80 79 78 78 76 78%
90 90 88 86 85 88%
Overall 83.33%

Text Replacement

Tests deterministic text transformations: renaming characters/locations, expanding contractions, tense rewriting, POV shifts, gender swaps, combined transformations, and word avoidance. Scored by checking each expected change independently.
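
Scoring each expected change independently could look roughly like the sketch below, where every required or forbidden substring counts as one check. The function name `score_replacement` and the substring-based checks are assumptions for illustration; the benchmark's real checker is not described here.

```python
def score_replacement(output: str, must_contain: list[str], must_not_contain: list[str]) -> float:
    """Return the percentage of independent checks that pass."""
    checks = [text in output for text in must_contain]
    checks += [text not in output for text in must_not_contain]
    return 100 * sum(checks) / len(checks)


# Task: rename "Anna" to "Elena" and expand "didn't" to "did not".
output = "Elena did not see Anna leave."  # one "Anna" was missed
score = score_replacement(
    output,
    must_contain=["Elena", "did not"],
    must_not_contain=["Anna", "didn't"],
)
```

In this example three of the four checks pass, so `score` is 75.0: the missed rename costs one check without affecting the others.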

Scenario #1 #2 #3 #4 #5 #6 #7 Total
Generic Prompt
100 97 97 94 94 92 86 94%
100 100 100 100 100 100 100 100%
70 70 70 70 67 67 67 69%
89 78 78 78 78 77 76 79%
96 96 96 96 96 96 96 96%
89 89 89 89 89 89 89 89%
80 79 78 76 76 76 74 77%
97 97 95 95 95 95 93 95%
67 67 67 67 67 67 67 67%
Generic Prompt 85.09%
Specific Prompt
100 100 100 100 97 94 92 98%
100 100 100 100 100 100 100 100%
98 96 96 96 96 96 95 96%
72 72 72 72 72 72 72 72%
100 100 100 100 100 100 100 100%
100 100 100 100 100 100 100 100%
80 80 80 80 80 80 80 80%
100 100 100 100 100 100 100 100%
97 97 97 97 93 93 93 95%
Specific Prompt 93.45%
Overall 89.27%

Tool usage within Novelcrafter

Evaluates output messages related to tool usage within Novelcrafter.

Scenario #1 #2 #3 #4 #5 #6 #7 #8 #9 #10 Total
100 100 100 100 100 100 100 100 100 100 100%
Overall 100.00%