NC Bench
A comprehensive benchmark for creative writing models.
Created by Novelcrafter.
This project is in early access and is still a work in progress.
NC Bench is a cutting-edge benchmark for creativity in LLMs, with a focus on creative writing, instruction following, utility, tooling, and language skills.
We test how well models enhance the writing process through text manipulation, idea generation, summarization, and translation.
From generating quality prose to hallucination-free extraction, NC Bench puts AI models through their paces in all aspects of writing assistance.
Chart: number of scenarios in each category. Some scenarios may appear in multiple categories.
| Score | Model |
| --- | --- |
| 91.77% | GPT-5 Mini |
| 90.73% | Qwen 3.5 397B A17B |
| 89.90% | MoonshotAI: Kimi K2.5 |

| Score | Model |
| --- | --- |
| 83.95% | GPT-5 |
| 81.05% | GPT-5 Mini |
| 78.16% | o4 Mini High |

| Score | Model |
| --- | --- |
| 93.82% | GPT-5 |
| 93.20% | GPT-5 Mini |
| 93.09% | o4 Mini High |

| Score | Model |
| --- | --- |
| 95.67% | Claude Sonnet 4 |
| 92.91% | Qwen 3.5 397B A17B |
| 92.88% | Claude Opus 4.6 |

| Score | Model |
| --- | --- |
| 100.00% | Claude 3.5 Sonnet |
| 100.00% | Claude 3.7 Sonnet |
| 100.00% | Hermes 3 70B |

| Score | Model |
| --- | --- |
| 99.01% | Gemini 2.5 Pro |
| 98.34% | MoonshotAI: Kimi K2.5 |
| 98.18% | Gemini 3 Pro (Preview) |

| Score | Model |
| --- | --- |
| 98.13% | Hermes 3 405B |
| 96.28% | DeepSeek-V2 Chat |
| 95.68% | Z.AI GLM 4.5 |

| Score | Model |
| --- | --- |
| 95.32% | Gemini 2.5 Pro |
| 95.08% | Claude Sonnet 4 |
| 93.55% | Z.AI GLM 4.6 |