google/gemma-3-12b-it

Gemma 3 12B

Release Date: Mar 12th, 2025
Context Size: 128k
Benchmark Cost: $0.10
Speed: 35.0 tok/s

Creative writing: 65.94%
Rule following: 68.37%
Utility: 74.38%
Mathematics: 100.00%
Tooling: 68.00%
Language: 83.93%
Logic: 76.57%

Bad Writing Habits

Detects common prose quality anti-patterns in AI-generated creative writing, including passive voice, past progressive overuse, weak dialogue tags, filter words, purple prose, cliches, AI-ism words/adverbs/names, and more.

Scenario #1 #2 #3 #4 #5 Total

Detailed Writing Rules
0-shot (Creative writing, Rule following) 77 76 76 75 70 75%
0-shot (Creative writing, Rule following) 80 80 79 79 71 78%
0-shot (Creative writing, Rule following) 81 80 77 76 74 78%
0-shot (Creative writing, Rule following) 85 82 81 75 71 79%
0-shot (Creative writing, Rule following) 82 79 78 75 74 78%
0-shot (Creative writing, Rule following) 80 78 77 72 70 75%
Detailed Writing Rules: 77.01%

genre
0-shot (Creative writing, Rule following) 79 76 73 73 70 74%
0-shot (Creative writing, Rule following) 81 81 81 80 69 79%
0-shot (Creative writing, Rule following) 78 78 75 74 64 74%
0-shot (Creative writing, Rule following) 82 81 80 76 75 79%
0-shot (Creative writing, Rule following) 78 76 75 74 66 74%
0-shot (Creative writing, Rule following) 82 79 76 75 75 77%
genre: 76.19%

Novelcrafter Default Prompt
0-shot (Creative writing, Rule following) 79 75 74 73 67 74%
0-shot (Creative writing, Rule following) 80 78 77 76 76 78%
0-shot (Creative writing, Rule following) 78 75 73 72 72 74%
0-shot (Creative writing, Rule following) 82 78 74 72 72 76%
0-shot (Creative writing, Rule following) 76 75 75 74 67 74%
0-shot (Creative writing, Rule following) 82 81 79 78 75 79%
Novelcrafter Default Prompt: 75.57%

Overall: 76.25%

Codex Violation Detection

Detects factual inconsistencies between a story bible and prose passages. The model must output structured XML identifying each violation with paragraph number and substring.

Scenario #1 #2 #3 #4 #5 #6 #7 #8 #9 #10 Total

matrix
0-shot (Tooling, Utility, Logic, Rule following) 61 59 55 52 48 44 44 43 40 40 49%
0-shot (Tooling, Utility, Logic, Rule following) 72 69 69 67 66 66 64 63 63 52 65%
0-shot (Tooling, Utility, Logic, Rule following) 75 66 66 66 61 61 61 57 57 53 62%
0-shot (Tooling, Utility, Logic, Rule following) 97 97 97 97 97 97 93 93 93 88 95%
matrix: 67.72%

tiers
0-shot (Tooling, Utility, Logic, Rule following) 93 75 75 75 75 75 75 75 75 75 77%
0-shot (Tooling, Utility, Logic, Rule following) 92 86 86 86 86 86 79 79 72 64 82%
0-shot (Tooling, Utility, Logic, Rule following) 65 61 61 61 60 53 52 51 47 3 51%
0-shot (Tooling, Utility, Logic, Rule following) 55 48 48 48 48 40 40 40 40 33 44%
tiers: 63.54%

Overall: 65.63%
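
To make the task's output requirement concrete, here is a minimal sketch of how a structured XML answer could be parsed into (paragraph number, substring) pairs. The tag name `violation`, the attribute name `paragraph`, the wrapper element, and the sample text are all hypothetical; the page does not specify the actual schema the benchmark expects.

```python
import xml.etree.ElementTree as ET

# Hypothetical model output: element and attribute names are assumptions,
# and the sample violations are invented purely for illustration.
SAMPLE = """<violations>
  <violation paragraph="2">her green eyes</violation>
  <violation paragraph="5">the northern gate</violation>
</violations>"""


def parse_violations(xml_text: str) -> list[tuple[int, str]]:
    """Return (paragraph number, offending substring) pairs from the XML."""
    root = ET.fromstring(xml_text)
    return [
        (int(node.attrib["paragraph"]), (node.text or "").strip())
        for node in root.iter("violation")
    ]


if __name__ == "__main__":
    for paragraph, substring in parse_violations(SAMPLE):
        print(paragraph, substring)
```

A grader built along these lines could then compare the parsed pairs against a known list of violations, which would explain why malformed XML alone can cost a model points on this test.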

Dialogue tags

Various tasks related to dialogue tags in text.

Scenario #1 #2 #3 #4 #5 #6 #7 #8 #9 #10 Total

dialogue-200
0-shot (Creative writing, Rule following) 99 94 93 83 83 80 51 28 10 10 63%
0-shot (Creative writing, Rule following) 49 8 3 1 1 1 0 0 0 0 6%
0-shot (Creative writing, Rule following) 93 64 51 50 50 50 50 50 46 16 52%
dialogue-200: 40.43%

dialogue-500
0-shot (Creative writing, Rule following) 50 50 49 49 34 14 1 0 0 0 25%
0-shot (Creative writing, Rule following) 30 22 0 0 0 0 0 0 0 0 5%
0-shot (Creative writing, Rule following) 70 39 36 34 27 23 14 3 1 1 25%
dialogue-500: 18.17%

Ungrouped
0-shot (Creative writing, Rule following) 100 100 100 100 100 100 100 100 100 100 100%

Overall: 39.40%

Language Comprehension

Does the model understand more than just English?

Scenario #1 #2 #3 #4 #5 Total

0-shot (Language) 0 0 0 0 0 0%
0-shot (Language) 100 100 100 100 100 100%
0-shot (Language) 100 100 100 100 0 80%
0-shot (Language) 100 100 100 100 100 100%

Overall: 70.00%

Language Writing

Can the model generate text in different languages?

Scenario #1 #2 #3 #4 #5 Total

0-shot (Language) 100 100 100 100 92 98%
0-shot (Language) 100 100 91 91 90 94%
0-shot (Language) 100 100 89 81 55 85%
0-shot (Language) 100 100 100 100 89 98%
0-shot (Language) 100 100 100 100 100 100%

Overall: 95.07%

N-Length Sentences

Write sentences with exactly N words

Scenario #1 #2 #3 #4 #5 #6 #7 #8 #9 #10 Total

0-shot (Rule following) 100 100 100 98 96 96 96 96 93 87 96%
0-shot (Rule following) 95 92 92 90 90 87 87 85 84 74 88%
0-shot (Rule following) 14 13 10 9 3 2 2 1 0 0 5%

Overall: 63.03%
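
As a rough illustration of what this task demands, the sketch below measures the share of sentences in a response that contain exactly N words. The function names, the punctuation-based sentence split, and the whitespace-based word count are assumptions made for this example; the page does not describe how the benchmark actually scores responses.

```python
import re


def split_sentences(text: str) -> list[str]:
    """Naively split on '.', '!' or '?' followed by whitespace (an assumption)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def fraction_with_n_words(text: str, n: int) -> float:
    """Share of sentences whose whitespace-separated word count equals n."""
    sentences = split_sentences(text)
    if not sentences:
        return 0.0
    hits = sum(1 for sentence in sentences if len(sentence.split()) == n)
    return hits / len(sentences)


if __name__ == "__main__":
    sample = "The cat sat down. Dogs bark loudly at night."
    print(fraction_with_n_words(sample, 4))
```

In the sample, the first sentence has four words and the second has five, so half of the sentences meet a target of N = 4. Hyphenated words, numerals, and abbreviations would need explicit rules in any real checker.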

Novel outline

Handle questions about the outline of a novel in various formats

Scenario #1 #2 #3 #4 #5 #6 #7 #8 #9 #10 Total

outline-count
0-shot (Tooling, Utility) 100 100 100 100 100 100 100 100 100 100 100%
0-shot (Tooling, Utility) 100 100 100 100 100 100 100 100 100 100 100%
0-shot (Tooling, Utility) 100 0 0 0 0 0 0 0 0 0 10%
outline-count: 70.00%

pov-count
0-shot (Tooling, Utility) 100 100 100 100 100 50 50 50 50 50 75%
0-shot (Tooling, Utility) 100 100 100 100 100 100 100 100 100 100 100%
0-shot (Tooling, Utility) 100 0 0 0 0 0 0 0 0 0 10%
pov-count: 61.67%

Overall: 65.83%

Tool usage within Novelcrafter

Output messages that are related to tool usage within Novelcrafter

Scenario #1 #2 #3 #4 #5 #6 #7 #8 #9 #10 Total

0-shot (Tooling, Utility) 100 100 100 100 100 100 100 100 100 100 100%

Overall: 100.00%

Voice/dialogue sheets

Extract dialogue from given text as voice sheets.

Scenario #1 #2 #3 #4 #5 #6 #7 #8 #9 #10 Total

0-shot (Utility) 100 100 100 100 100 0 0 0 0 0 50%
0-shot (Utility) 0 0 0 0 0 0 0 0 0 0 0%
1-shot (Utility) 100 100 100 100 100 100 100 100 100 100 100%
Few-shot (Utility) 100 100 100 100 100 100 100 100 100 100 100%
0-shot (Utility) 100 0 0 0 0 0 0 0 0 0 10%

Overall: 52.00%

Write N of X

Write exactly N words/sentences/paragraphs...

Scenario #1 #2 #3 #4 #5 #6 #7 #8 #9 #10 Total

paragraphs
0-shot (Rule following) 100 100 100 100 100 100 100 100 100 100 100%
0-shot (Rule following) 100 100 100 100 100 100 100 100 100 100 100%
0-shot (Rule following) 100 100 100 100 100 100 100 100 100 100 100%
paragraphs: 100.00%

sentences
0-shot (Rule following) 100 100 100 100 100 100 100 100 100 100 100%
0-shot (Rule following) 100 100 100 100 100 100 100 100 100 100 100%
0-shot (Rule following) 100 100 100 100 100 100 100 100 100 100 100%
0-shot (Rule following) 100 100 100 100 100 100 98 54 2 0 75%
0-shot (Rule following) 98 98 54 0 0 0 0 0 0 0 25%
sentences: 80.06%

words
0-shot (Rule following) 100 100 100 100 100 100 100 100 100 100 100%
0-shot (Rule following) 100 100 100 100 98 98 98 92 92 77 96%
0-shot (Rule following) 54 27 2 2 0 0 0 0 0 0 8%
0-shot (Rule following) 100 100 98 98 92 77 27 9 2 0 60%
0-shot (Rule following) 92 54 54 27 0 0 0 0 0 0 23%
words: 57.44%

Overall: 75.96%
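
The closing 75.96% figure is consistent with a mean taken over all 13 scenario rows, rather than a plain mean of the three group figures. The sketch below checks that arithmetic using the group percentages and row counts listed in this section. The weighting rule is an inference from the numbers, not something the page states, and the function name is made up for this example.

```python
# Group percentage and number of scenario rows, as listed in this section.
GROUPS = {
    "paragraphs": (100.00, 3),
    "sentences": (80.06, 5),
    "words": (57.44, 5),
}


def row_weighted_mean(groups: dict[str, tuple[float, int]]) -> float:
    """Average the group scores, weighting each by its number of rows."""
    total_rows = sum(rows for _, rows in groups.values())
    weighted_sum = sum(score * rows for score, rows in groups.values())
    return weighted_sum / total_rows


def simple_mean(groups: dict[str, tuple[float, int]]) -> float:
    """Unweighted mean of the group scores, shown for contrast."""
    return sum(score for score, _ in groups.values()) / len(groups)


if __name__ == "__main__":
    print(round(row_weighted_mean(GROUPS), 2))  # 75.96
    print(round(simple_mean(GROUPS), 2))        # 79.17
```

Only the row-weighted figure reproduces the reported value, which suggests that groups with more scenarios count for more in a section's score.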