NC Bench
A comprehensive benchmark for creative writing models.
Created by Novelcrafter.
This project is in early access and is still a work in progress.
NC Bench is a cutting-edge benchmark for creativity in LLMs, with a focus on creative writing, instruction following, utility, tooling, and language skills.
We test how well models enhance the writing process through text manipulation, idea generation, summarization, and translation.
From generating quality prose to hallucination-free extraction, NC Bench puts AI models through their paces in all aspects of writing assistance.
Chart: number of scenarios in each category. Some scenarios may appear in multiple categories.
| 86.19% | o4 Mini High |
| 83.82% | Claude Opus 4.6 |
| 83.53% | Gemini 3 Flash (Preview) |
| 86.97% | Gemini 3 Flash (Preview) |
| 83.00% | Claude 3.7 Sonnet |
| 82.10% | Claude 3.5 Sonnet |
| 92.07% | o4 Mini High |
| 90.50% | Gemini 3 Flash (Preview) |
| 87.60% | o4 Mini |
| 95.00% | Claude Sonnet 4 |
| 92.83% | Claude Opus 4.6 |
| 91.83% | Claude 3.5 Sonnet (new) |
| 100.00% | Claude 3.5 Sonnet |
| 100.00% | Claude 3.7 Sonnet |
| 100.00% | Hermes 3 70B |
| 100.00% | Gemini 3 Pro (Preview) |
| 100.00% | MoonshotAI: Kimi K2.5 |
| 99.23% | Gemini 2.5 Pro |
| 98.13% | Hermes 3 405B |
| 96.28% | DeepSeek-V2 Chat |
| 95.68% | Z.AI GLM 4.5 |
| 93.75% | GPT-4o Mini (temp=0) |
| 93.75% | Claude Sonnet 4 |
| 93.75% | Z.AI GLM 4.6 |