google/gemma-3-4b-it

Gemma 3 4B

Release Date: Mar 12th, 2025
Context Size: 128k
Benchmark Cost: $0.06
Speed: 71.6 tok/s

Creative writing: 56.57%
Rule following: 52.02%
Utility: 59.01%
Mathematics: 100.00%
Tooling: 42.55%
Language: 72.68%
Logic: 63.33%

Bad Writing Habits

Detects common prose quality anti-patterns in AI-generated creative writing, including passive voice, past progressive overuse, weak dialogue tags, filter words, purple prose, cliches, AI-ism words/adverbs/names, and more.
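
As a rough illustration of what this kind of detection could look like, the sketch below counts two of the listed anti-patterns: filter words and passive voice. The word list, the regular expression, and the `count_habits` name are assumptions made for this example; the benchmark's actual detectors are not described in this report.

```python
import re
from collections import Counter

# Hypothetical word list; the benchmark's real lists are not given in this report.
FILTER_WORDS = {"saw", "heard", "felt", "noticed", "realized", "seemed", "watched"}

# Crude heuristic: a form of "to be" followed by a word ending in -ed or -en.
PASSIVE_RE = re.compile(
    r"\b(?:was|were|is|are|been|being|be)\s+\w+(?:ed|en)\b", re.IGNORECASE
)


def count_habits(text: str) -> Counter:
    """Count occurrences of two prose anti-patterns in the given text."""
    counts = Counter()
    words = re.findall(r"[A-Za-z']+", text.lower())
    counts["filter_words"] = sum(1 for w in words if w in FILTER_WORDS)
    counts["passive_voice"] = len(PASSIVE_RE.findall(text))
    return counts
```

A heuristic like this will miss irregular participles and flag some false positives, so it is only a starting point for the categories named above.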

Scenario                                     #1  #2  #3  #4  #5  Total
Detailed Writing Rules
0-shot (Creative writing, Rule following)    79  74  73  71  71    74%
0-shot (Creative writing, Rule following)    79  72  71  71  66    72%
0-shot (Creative writing, Rule following)    79  73  71  69  64    71%
0-shot (Creative writing, Rule following)    76  74  74  74  70    74%
0-shot (Creative writing, Rule following)    75  73  72  71  70    72%
0-shot (Creative writing, Rule following)    83  80  77  72  71    77%
Detailed Writing Rules: 73.11%
genre
0-shot (Creative writing, Rule following)    78  69  67  67  67    70%
0-shot (Creative writing, Rule following)    77  74  71  70  69    73%
0-shot (Creative writing, Rule following)    76  75  71  70  70    72%
0-shot (Creative writing, Rule following)    76  76  74  72  71    74%
0-shot (Creative writing, Rule following)    72  72  70  70  63    69%
0-shot (Creative writing, Rule following)    80  80  79  77  74    78%
genre: 72.56%
Novelcrafter Default Prompt
0-shot (Creative writing, Rule following)    78  78  75  73  72    75%
0-shot (Creative writing, Rule following)    76  75  73  71  64    72%
0-shot (Creative writing, Rule following)    75  74  74  71  69    72%
0-shot (Creative writing, Rule following)    77  77  75  74  67    74%
0-shot (Creative writing, Rule following)    74  74  72  67  66    70%
0-shot (Creative writing, Rule following)    80  76  75  75  75    76%
Novelcrafter Default Prompt: 73.38%
Overall: 73.02%
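
The per-row totals in this report appear consistent with a plain average of the scenario scores. That aggregation rule is an inference from the numbers, not something the report states. A minimal check, using the first "Detailed Writing Rules" row (scenario scores 79, 74, 73, 71, 71, reported total 74%):

```python
from statistics import mean

# Per-scenario scores from the first "Detailed Writing Rules" row.
scenario_scores = [79, 74, 73, 71, 71]

# Assumption: the row total is the arithmetic mean of the scenario scores.
row_total = mean(scenario_scores)
print(f"{row_total:.1f}")  # prints 73.6, which rounds to the reported 74%
```

Because the displayed scenario scores are themselves rounded, a few rows differ from this calculation by one percentage point.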

Codex Violation Detection

Detects factual inconsistencies between a story bible and prose passages. The model must output structured XML identifying each violation with paragraph number and substring.
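
The report does not show the XML schema the model is asked to produce. The sketch below assumes a hypothetical layout (a `<violations>` root, `<violation paragraph="N">` elements, and a nested `<substring>`) purely to illustrate how such output might be parsed into paragraph number and substring pairs.

```python
import xml.etree.ElementTree as ET

# Hypothetical model output. Tag and attribute names are assumptions,
# not the benchmark's real schema.
SAMPLE_OUTPUT = """
<violations>
  <violation paragraph="2">
    <substring>her green eyes</substring>
  </violation>
  <violation paragraph="5">
    <substring>the summer of 1987</substring>
  </violation>
</violations>
"""


def parse_violations(xml_text: str) -> list[tuple[int, str]]:
    """Return (paragraph number, substring) pairs from the assumed XML layout."""
    root = ET.fromstring(xml_text.strip())
    results = []
    for node in root.iter("violation"):
        paragraph = int(node.get("paragraph"))
        substring = (node.findtext("substring") or "").strip()
        results.append((paragraph, substring))
    return results


violations = parse_violations(SAMPLE_OUTPUT)
```

`ET.fromstring` raises `ParseError` on malformed XML, so a scorer built this way would need to decide how to grade output that fails to parse.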

Scenario                                            #1  #2  #3  #4  #5  #6  #7  #8  #9 #10  Total
matrix
0-shot (Tooling, Utility, Logic, Rule following)    43  40  38  37  29  23  21   3   3   3    24%
0-shot (Tooling, Utility, Logic, Rule following)    40  38  37  36  35  33  33  33  20  17    32%
0-shot (Tooling, Utility, Logic, Rule following)    61  59  54  49  49  47  44  43  42  42    49%
0-shot (Tooling, Utility, Logic, Rule following)    48  44  40  40  40  40  39  39  39  38    41%
matrix: 36.54%
tiers
0-shot (Tooling, Utility, Logic, Rule following)    57  57  57  44  42  42  42  42  42  42    46%
0-shot (Tooling, Utility, Logic, Rule following)    64  58  56  49  49  44  42  39  33  33    47%
0-shot (Tooling, Utility, Logic, Rule following)    48  40  38  38  37  36  36  36  35  32    38%
0-shot (Tooling, Utility, Logic, Rule following)    42  40  39  38  37  33  33  33  33  33    36%
tiers: 41.78%
Overall: 39.16%

Dialogue tags

Various tasks related to dialogue tags in text.

Scenario                                      #1   #2   #3   #4   #5   #6   #7   #8   #9  #10  Total
dialogue-200
0-shot (Creative writing, Rule following)     50   49   30   10    3    2    1    0    0    0    14%
0-shot (Creative writing, Rule following)      0    0    0    0    0    0    0    0    0    0     0%
0-shot (Creative writing, Rule following)     50   50   50   50   50   50   50   48   43   18    46%
dialogue-200: 20.10%
dialogue-500
0-shot (Creative writing, Rule following)     45   14    0    0    0    0    0    0    0    0     6%
0-shot (Creative writing, Rule following)      0    0    0    0    0    0    0    0    0    0     0%
0-shot (Creative writing, Rule following)      0    0    0    0    0    0    0    0    0    0     0%
dialogue-500: 1.96%
Ungrouped
0-shot (Creative writing, Rule following)    100   61   61   61   14   14   14   14    1    1    34%
Overall: 14.29%

Language Comprehension

Does the model understand more than just English?

Scenario              #1   #2   #3   #4   #5  Total
0-shot (Language)    100  100  100    0    0    60%
0-shot (Language)    100  100  100  100  100   100%
0-shot (Language)    100    0    0    0    0    20%
0-shot (Language)    100  100  100  100  100   100%
Overall: 70.00%

Language Writing

Can the model generate text in different languages?

Scenario              #1   #2   #3   #4   #5  Total
0-shot (Language)    100  100  100  100   60    92%
0-shot (Language)    100  100  100   92   86    96%
0-shot (Language)     67   60   58   45    0    46%
0-shot (Language)    100  100   85   73    0    71%
0-shot (Language)    100  100   83   62    0    69%
Overall: 74.83%

N-Length Sentences

Write sentences with exactly N words.
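
A check for this kind of task could be sketched as below. The sentence-splitting rule (split after `.`, `!`, or `?` followed by whitespace) and both function names are assumptions for illustration; the benchmark's actual scoring method is not given in this report.

```python
import re


def sentence_word_counts(text: str) -> list[int]:
    """Split after ., !, or ? followed by whitespace, then count words per sentence."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return [len(s.split()) for s in sentences]


def all_exactly_n(text: str, n: int) -> bool:
    """True if there is at least one sentence and every sentence has exactly n words."""
    counts = sentence_word_counts(text)
    return bool(counts) and all(c == n for c in counts)
```

For example, `all_exactly_n("The cat sat down. A dog ran away.", 4)` returns `True`, while dropping a word from either sentence makes it return `False`.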

Scenario                    #1   #2   #3   #4   #5   #6   #7   #8   #9  #10  Total
0-shot (Rule following)    100  100  100   98   98   98   94   93   89   77    95%
0-shot (Rule following)     88   88   87   80   77   75   72   65   63   52    75%
0-shot (Rule following)     16   11    2    1    0    0    0    0    0    0     3%
Overall: 57.48%

Novel outline

Handle questions about the outline of a novel in various formats.

Scenario                      #1   #2   #3   #4   #5   #6   #7   #8   #9  #10  Total
outline-count
0-shot (Tooling, Utility)    100  100  100  100  100  100  100  100  100  100   100%
0-shot (Tooling, Utility)    100  100  100  100  100  100  100  100  100  100   100%
0-shot (Tooling, Utility)      0    0    0    0    0    0    0    0    0    0     0%
outline-count: 66.67%
pov-count
0-shot (Tooling, Utility)     50   50   50    0    0    0    0    0    0    0    15%
0-shot (Tooling, Utility)    100    0    0    0    0    0    0    0    0    0    10%
0-shot (Tooling, Utility)      0    0    0    0    0    0    0    0    0    0     0%
pov-count: 8.33%
Overall: 37.50%

Tool usage within Novelcrafter

Output messages related to tool usage within Novelcrafter.

Scenario                      #1   #2   #3   #4   #5   #6   #7   #8   #9  #10  Total
0-shot (Tooling, Utility)    100  100  100  100  100  100  100  100  100  100   100%
Overall: 100.00%

Voice/dialogue sheets

Extract dialogue from given text as voice sheets.

Scenario               #1   #2   #3   #4   #5   #6   #7   #8   #9  #10  Total
0-shot (Utility)        0    0    0    0    0    0    0    0    0    0     0%
0-shot (Utility)      100  100  100  100    0    0    0    0    0    0    40%
1-shot (Utility)      100    0    0    0    0    0    0    0    0    0    10%
Few-shot (Utility)    100  100  100  100  100  100  100  100  100  100   100%
0-shot (Utility)        0    0    0    0    0    0    0    0    0    0     0%
Overall: 30.00%

Write N of X

Write exactly N words/sentences/paragraphs...
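
Scoring a task like this needs a way to count each unit. The sketch below uses simple heuristics that are assumptions of this example, not the benchmark's documented method: words are whitespace-separated tokens, sentences end in `.`, `!`, or `?`, and paragraphs are separated by blank lines. The `count_units` name is hypothetical.

```python
import re


def count_units(text: str, unit: str) -> int:
    """Count words, sentences, or paragraphs in text using simple heuristics."""
    text = text.strip()
    if not text:
        return 0
    if unit == "words":
        # Words are whitespace-separated tokens.
        return len(text.split())
    if unit == "sentences":
        # A sentence ends at ., !, or ? followed by whitespace or end of text.
        return len(re.findall(r"[^.!?]+[.!?]+(?=\s|$)", text))
    if unit == "paragraphs":
        # Paragraphs are separated by one or more blank lines.
        return len([p for p in re.split(r"\n\s*\n", text) if p.strip()])
    raise ValueError(f"unknown unit: {unit}")
```

Text without closing punctuation is not counted as a sentence under this rule, which is one reason a real scorer might use a proper sentence tokenizer instead.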

Scenario                    #1   #2   #3   #4   #5   #6   #7   #8   #9  #10  Total
paragraphs
0-shot (Rule following)    100  100  100  100  100  100  100  100  100  100   100%
0-shot (Rule following)    100  100  100  100  100  100    0    0    0    0    60%
0-shot (Rule following)      0    0    0    0    0    0    0    0    0    0     0%
paragraphs: 53.33%
sentences
0-shot (Rule following)    100  100  100  100  100  100  100  100  100  100   100%
0-shot (Rule following)    100  100  100  100  100  100  100  100  100  100   100%
0-shot (Rule following)    100  100  100  100  100  100  100  100  100   98   100%
0-shot (Rule following)    100  100  100  100  100  100  100  100  100  100   100%
0-shot (Rule following)    100  100  100  100  100  100   92   54    9    0    75%
sentences: 95.04%
words
0-shot (Rule following)     54   27    2    0    0    0    0    0    0    0     8%
0-shot (Rule following)      0    0    0    0    0    0    0    0    0    0     0%
0-shot (Rule following)      0    0    0    0    0    0    0    0    0    0     0%
0-shot (Rule following)      0    0    0    0    0    0    0    0    0    0     0%
0-shot (Rule following)     54    0    0    0    0    0    0    0    0    0     5%
words: 2.72%
Overall: 49.91%