Categories

NC Bench evaluates models across 8 categories and 23 subcategories.

Category Distribution

Number of scenarios in each category. A scenario may belong to more than one category.

Tooling (13)
Creative Writing (18)
Language (9)
Utility (32)
Reasoning (20)
Text Editing (18)
Rule Following (12)
Hallucination (28)
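
Because a scenario can belong to several categories, the per-category counts above need not sum to the number of distinct scenarios. The following is a minimal Python sketch of how such a tally could be computed. The scenario IDs, their tags, and the `category_distribution` helper are hypothetical and for illustration only; they are not NC Bench data or code.

```python
from collections import Counter

# Hypothetical scenarios, each tagged with one or more categories.
# IDs and tags are made up for illustration; they are not NC Bench data.
scenarios = {
    "scenario-a": ["Tooling", "Rule Following"],
    "scenario-b": ["Creative Writing"],
    "scenario-c": ["Reasoning", "Hallucination"],
}


def category_distribution(scenarios):
    """Count how many scenarios carry each category tag."""
    counts = Counter()
    for categories in scenarios.values():
        # set() ensures a scenario counts once per category,
        # even if a tag is accidentally listed twice.
        counts.update(set(categories))
    return counts


distribution = category_distribution(scenarios)

# Total tag assignments can exceed the number of distinct scenarios.
total_tags = sum(distribution.values())
distinct_scenarios = len(scenarios)
```

In this toy data, three distinct scenarios produce five category assignments, which is why a per-category distribution should be read as membership counts and not as a partition of the scenario set.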

Creative Writing

18 scenarios · 6 subcategories

Subcategories

81.52% AI-isms
67.93% Prose Variety
77.75% Dialogue
88.35% Purple Prose
86.16% Mechanical Style
78.22% Clichés

Tooling

13 scenarios · 1 subcategory

Subcategories

96.04% XML

Language

9 scenarios · 2 subcategories

Subcategories

83.58% Comprehension
87.20% Generation

Utility

32 scenarios · 5 subcategories

Reasoning

20 scenarios · 2 subcategories

Text Editing

18 scenarios · 3 subcategories

Subcategories

83.94% Transformation
93.68% Preservation
98.32% Structural Integrity

Rule Following

12 scenarios · 1 subcategory

Hallucination

28 scenarios · 3 subcategories