NC Bench
A comprehensive benchmark for creative writing models.
Created by Novelcrafter.
This project is in early access and is still a work in progress.
NC Bench is a cutting-edge benchmark for creativity in LLMs, with a focus on creative writing, instruction following, utility, tooling, and language skills.
We test how well models enhance the writing process through text manipulation, idea generation, summarization, and translation.
From generating quality prose to hallucination-free extraction, NC Bench puts AI models through their paces in all aspects of writing assistance.
Chart: number of scenarios in each category. Some scenarios may appear in multiple categories.
| Score | Model |
| --- | --- |
| 88.77% | MoonshotAI: Kimi K2.5 |
| 88.14% | Claude Opus 4.6 |
| 86.94% | o4 Mini High |

| Score | Model |
| --- | --- |
| 78.16% | o4 Mini High |
| 71.27% | Claude Opus 4.6 |
| 68.57% | Claude Opus 4.5 |

| Score | Model |
| --- | --- |
| 92.07% | o4 Mini High |
| 87.60% | o4 Mini |
| 87.44% | MoonshotAI: Kimi K2.5 |

| Score | Model |
| --- | --- |
| 95.00% | Claude Sonnet 4 |
| 92.83% | Claude Opus 4.6 |
| 91.83% | Claude 3.5 Sonnet (new) |

| Score | Model |
| --- | --- |
| 100.00% | Claude 3.5 Sonnet |
| 100.00% | Claude 3.7 Sonnet |
| 100.00% | Gemini Flash 1.5 |
| 100.00% | Gemini 3 Pro (Preview) |
| 100.00% | MoonshotAI: Kimi K2.5 |
| 99.23% | Gemini 2.5 Pro |
| 98.13% | Hermes 3 405B |
| 96.28% | DeepSeek-V2 Chat |
| 95.68% | Z.AI GLM 4.5 |
| 94.38% | Phi-3 Mini 128k |
| 93.75% | GPT-4o Mini (temp=0) |
| 93.75% | Claude Sonnet 4 |