google/gemini-2.5-flash

Gemini 2.5 Flash (Reasoning)

Release Date

Jun 17th, 2025

Context Size

1M tokens

Reasoning

Yes

Benchmark Cost

$5.73

Speed

198.2 tok/s

Categories

Creative Writing: 76.3%
Tooling: 100.0%
Language: 86.1%
Utility: 82.2%
Reasoning: 93.8%
Text Editing: 98.1%
Rule Following: 60.0%
Hallucination: 95.6%

Subcategories

[Chart: scores on a 20% to 100% axis for the subcategories AI-isms, Prose Variety, Dialogue, Purple Prose, Mechanical Style, Clichés, XML, Comprehension, Generation, Word Counting, Sentence Counting, Paragraph Counting, Structural Counting, Data Extraction, Deduction, Attention, Transformation, Preservation, Structural Integrity, Constraint Adherence, False Positives, Content Invention, Output Corruption]

Bad Writing Habits

Detects common prose quality anti-patterns in AI-generated creative writing, including passive voice, past progressive overuse, weak dialogue tags, filter words, purple prose, cliches, AI-ism words/adverbs/names, and more.
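
As a rough illustration of how one such anti-pattern check could work, the Python sketch below counts filter words in a passage. The word list and the function name `count_hits` are assumptions made for this example; the benchmark's actual detectors, word lists, and scoring are not described here.

```python
import re

# Hypothetical word list for illustration only; the benchmark's real
# lists are not given in the source.
FILTER_WORDS = {"saw", "felt", "heard", "noticed", "realized", "seemed"}


def count_hits(text: str, vocabulary: set[str]) -> int:
    """Count how many words in `text` appear in `vocabulary` (case-insensitive)."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(1 for word in words if word in vocabulary)


passage = "She felt the cold and noticed the door. He saw nothing."
hits = count_hits(passage, FILTER_WORDS)
```

A real detector would likely need more than a word list (for example, passive voice usually requires grammatical analysis), so this only conveys the general idea.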

Scenario   #1   #2   #3   #4   #5   Total
Detailed Writing Rules
           76   72   72   72   72   73%
           83   79   78   76   73   78%
           87   82   80   79   76   81%
           84   80   78   76   71   78%
           82   80   78   77   75   78%
           85   83   80   78   78   81%
Detailed Writing Rules: 77.98%
genre
           75   71   68   68   66   70%
           78   76   74   67   64   72%
           82   75   72   70   69   74%
           83   81   80   79   69   78%
           77   75   73   71   70   73%
           85   83   81   79   77   81%
genre: 74.62%
Novelcrafter Default Prompt
           75   75   73   72   71   73%
           81   80   80   75   73   78%
           78   75   73   70   69   73%
           85   81   80   77   76   80%
           75   74   73   73   72   74%
           89   86   85   83   71   83%
Novelcrafter Default Prompt: 76.65%
Overall: 76.42%

Codex Extraction

Evaluates a model's ability to extract structured codex entries (characters, locations, objects, lore) from prose passages and return them as well-formed XML.
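
To make the "well-formed XML" requirement concrete, here is a minimal Python sketch that checks whether a response parses and pulls out its entries. The element and attribute names (`codex`, `entry`, `type`, `name`) are hypothetical; the schema the benchmark actually expects is not shown here.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if `xml_text` parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


def parse_entries(xml_text: str) -> list[dict[str, str]]:
    """Extract entries from a hypothetical <codex><entry type=...> layout."""
    root = ET.fromstring(xml_text)
    return [
        {"type": entry.get("type", ""), "name": entry.findtext("name", default="")}
        for entry in root.iter("entry")
    ]


# Assumed example output; not taken from the benchmark.
sample = """<codex>
  <entry type="character"><name>Mara</name></entry>
  <entry type="location"><name>Eastport</name></entry>
</codex>"""
entries = parse_entries(sample)
```

An unclosed tag, such as `<codex><entry></codex>`, fails the `is_well_formed` check, which is the kind of structural error this benchmark would presumably penalize.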

Scenario   #1   #2   #3   #4   #5   Total
           96   96   96   94   94   95%
           94   94   93   91   90   92%
           97   97   96   88   69   89%
          100   94   91   86   83   91%
Overall: 91.93%

Text Replacement

Tests deterministic text transformations: renaming characters/locations, expanding contractions, tense rewriting, POV shifts, gender swaps, combined transformations, and word avoidance. Scored by checking each expected change independently.
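
Scoring each expected change independently might look like the Python sketch below. It is an assumption about the general approach, not the benchmark's actual scorer: the `ExpectedChange` type, the substring checks, and the sample data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ExpectedChange:
    """One expected edit: `old` should be gone and `new` should be present."""
    old: str
    new: str


def score_replacements(output: str, changes: list[ExpectedChange]) -> float:
    """Return the percentage of expected changes satisfied by `output`."""
    if not changes:
        return 100.0
    passed = sum(
        1 for change in changes
        if change.new in output and change.old not in output
    )
    return 100.0 * passed / len(changes)


# Hypothetical task: rename a character and a town, expand a contraction.
changes = [
    ExpectedChange(old="Anna", new="Mara"),
    ExpectedChange(old="Westport", new="Eastport"),
    ExpectedChange(old="didn't", new="did not"),
]
# The town rename was missed, so two of three changes pass.
score = score_replacements("Mara walked to Westport. She did not stop.", changes)
```

Because every change is checked on its own, one missed rename lowers the score proportionally instead of failing the whole scenario.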

Scenario   #1   #2   #3   #4   #5   #6   #7   Total
Generic Prompt
          100  100  100  100  100  100  100   100%
          100  100  100  100  100  100  100   100%
           99   99   98   98   98   98   98   98%
          100  100  100  100  100  100  100   100%
          100  100  100  100  100  100  100   100%
           99   99   99   99   99   99   95   99%
           96   95   95   95   94   94   93   94%
          100  100  100  100  100  100  100   100%
           96   96   93   89   89   81   81   89%
Generic Prompt: 97.85%
Specific Prompt
          100  100  100  100  100  100  100   100%
          100  100  100  100  100  100  100   100%
           99   99   97   97   93   90   89   95%
          100  100  100  100  100  100  100   100%
          100  100  100  100  100  100  100   100%
          100  100  100  100  100  100  100   100%
           98   98   97   97   97   96   94   97%
          100  100  100  100  100  100  100   100%
          100  100  100  100  100  100  100   100%
Specific Prompt: 99.08%
Overall: 98.46%

Tool usage within Novelcrafter

Evaluates output messages related to tool usage within Novelcrafter.

Scenario   #1   #2   #3   #4   #5   #6   #7   #8   #9  #10   Total
          100  100  100  100  100  100  100  100  100  100   100%
Overall: 100.00%