GPT-5.2 - NC Bench

openai/gpt-5.2

GPT-5.2

via OpenRouter

Release Date: Dec 10th, 2025
Parameters: –
Context Size: 400k
Benchmark Cost: $6.16
Speed: 46.4 tok/s

Creative writing: 57.89%
Rule following: 88.88%
Utility: 85.55%
Mathematics: 100.00%
Tooling: 90.50%
Language: 80.93%
Logic: 92.35%

Codex Violation Detection

Detects factual inconsistencies between a story bible and prose passages. The model must output structured XML identifying each violation with paragraph number and substring.
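
As a rough illustration of what "structured XML identifying each violation with paragraph number and substring" could look like in practice, the sketch below parses one possible response. The element names (`violations`, `violation`, `paragraph`, `substring`) and the sample text are hypothetical; the page does not show the benchmark's actual schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical model output. The tag names are assumptions made for
# illustration; the benchmark's real schema is not shown on this page.
SAMPLE = """
<violations>
  <violation>
    <paragraph>2</paragraph>
    <substring>her green eyes</substring>
  </violation>
</violations>
"""


def parse_violations(xml_text):
    """Return a list of (paragraph_number, substring) pairs."""
    root = ET.fromstring(xml_text.strip())
    results = []
    for node in root.iter("violation"):
        paragraph = int(node.findtext("paragraph"))
        substring = node.findtext("substring")
        results.append((paragraph, substring))
    return results


print(parse_violations(SAMPLE))  # [(2, 'her green eyes')]
```

A grader along these lines could then compare each pair against the expected violations for the passage.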

Scenario	Run 1	Run 2	Run 3	Run 4	Run 5	Run 6	Run 7	Run 8	Run 9	Run 10	Total
matrix
Large codex (40 entries), long passage (1,019 words) 0-shot Tooling, Utility, Logic, Rule following	94%	94%	93%	93%	92%	92%	91%	91%	90%	90%	92%
Large codex (40 entries), short passage (165 words) 0-shot Tooling, Utility, Logic, Rule following	97%	97%	96%	95%	95%	95%	93%	93%	93%	91%	95%
Small codex (7 entries), long passage (734 words) 0-shot Tooling, Utility, Logic, Rule following	95%	90%	90%	90%	90%	90%	90%	90%	90%	90%	91%
Small codex (7 entries), short passage (165 words) 0-shot Tooling, Utility, Logic, Rule following	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%
tiers
5 codex entries 0-shot Tooling, Utility, Logic, Rule following	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%
10 codex entries 0-shot Tooling, Utility, Logic, Rule following	100%	100%	100%	100%	100%	100%	100%	100%	100%	92%	99%
20 codex entries 0-shot Tooling, Utility, Logic, Rule following	97%	97%	96%	96%	96%	96%	92%	92%	92%	92%	95%
40 codex entries 0-shot Tooling, Utility, Logic, Rule following	100%	100%	100%	95%	95%	95%	95%	95%	95%	91%	96%
Section score: 95.94%

Data extraction

Extract key details from a given block of text.

Scenario	Run 1	Run 2	Run 3	Run 4	Run 5	Run 6	Run 7	Run 8	Run 9	Run 10	Total
All valid emails 0-shot Utility	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%
Contextual pronoun 0-shot Utility	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%
Fruits excluding citrus 0-shot Utility	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%
Future event time 0-shot Utility, Logic	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%
Guess the pet 0-shot Utility, Logic	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%
Highest-rated movie 0-shot Utility	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%
Indirect birth year 0-shot Utility, Mathematics, Logic	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%
What instrument does Lucy play? 0-shot Utility, Logic	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%
What's the color of the car? 0-shot Utility, Logic	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%
What's the correct time? 0-shot Utility, Logic	100%	0%	0%	0%	0%	0%	0%	0%	0%	0%	10%
Who's the sister? 0-shot Utility, Logic	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%
Who's the tallest? 0-shot Utility, Logic	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%
Section score: 92.50%
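
The 92.50% figure for this section is consistent with an unweighted mean of the twelve scenario totals in the table. The page does not state the aggregation formula, so the sketch below is only a check under that assumption; `section_score` is a name introduced here for illustration.

```python
# Scenario totals for the Data extraction section, copied from the
# "Total" column of the table above (in row order).
totals = [100, 100, 100, 100, 100, 100, 100, 100, 100, 10, 100, 100]


def section_score(scenario_totals):
    """Unweighted mean of per-scenario totals (assumed aggregation)."""
    return sum(scenario_totals) / len(scenario_totals)


print(f"{section_score(totals):.2f}%")  # 92.50%
```

Sections whose totals are shown as rounded percentages may differ from the published figure in the second decimal place, since the page presumably averages unrounded values.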

Dialogue tags

Various tasks related to dialogue tags in text.
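
The scenarios below ask for a target share of dialogue in the generated text. The page does not say how that share is measured; one plausible approach, sketched here as an assumption, is the fraction of words that fall inside double quotation marks. The function name `dialogue_ratio` is hypothetical and the sketch only handles straight quotes.

```python
import re


def dialogue_ratio(text):
    """Fraction of whitespace-separated words that sit inside "..." spans.

    Assumed metric for illustration; only straight double quotes are handled.
    """
    total = len(text.split())
    if total == 0:
        return 0.0
    quoted_spans = re.findall(r'"([^"]*)"', text)
    spoken = sum(len(span.split()) for span in quoted_spans)
    return spoken / total


sample = '"Come in," she said. "It is cold."'
print(round(dialogue_ratio(sample), 3))  # 0.714
```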

Scenario	Run 1	Run 2	Run 3	Run 4	Run 5	Run 6	Run 7	Run 8	Run 9	Run 10	Total
dialogue-200
Write 200 words with 10% dialogue 0-shot Creative writing, Rule following	100%	100%	100%	100%	100%	100%	100%	100%	99%	98%	100%
Write 200 words with 50% dialogue 0-shot Creative writing, Rule following	100%	100%	100%	100%	100%	100%	99%	82%	71%	50%	90%
Write 200 words with 90% dialogue 0-shot Creative writing, Rule following	99%	99%	98%	98%	98%	93%	89%	83%	68%	68%	89%
dialogue-500
Write 500 words with 30% dialogue 0-shot Creative writing, Rule following	45%	44%	17%	0%	0%	0%	0%	0%	0%	0%	11%
Write 500 words with 50% dialogue 0-shot Creative writing, Rule following	99%	37%	17%	2%	0%	0%	0%	0%	0%	0%	16%
Write 500 words with 70% dialogue 0-shot Creative writing, Rule following	46%	45%	43%	41%	27%	5%	3%	2%	0%	0%	21%
Ungrouped
Write unattributed dialogue 0-shot Creative writing, Rule following	100%	100%	100%	100%	100%	100%	100%	61%	14%	14%	79%
Section score: 57.89%

Language Comprehension

Does the model understand more than just English?

Scenario	Run 1	Run 2	Run 3	Run 4	Run 5	Total
Asking for directions (Dutch) 0-shot Language	100%	100%	100%	100%	100%	100%
Asking for directions (German) 0-shot Language	100%	100%	100%	100%	0%	80%
Friend got new kittens (German) 0-shot Language	100%	0%	0%	0%	0%	20%
Friend got new kittens (Tagalog) 0-shot Language	100%	100%	100%	100%	100%	100%
Section score: 75.00%

Language Writing

Can the model generate text in different languages?

Scenario	Run 1	Run 2	Run 3	Run 4	Run 5	Total
Character dialogue (French) in a story 0-shot Language	93%	91%	88%	88%	81%	88%
Character dialogue (German) in a story 0-shot Language	86%	85%	84%	82%	77%	83%
Character dialogue (Hindi) in a story 0-shot Language	100%	96%	96%	88%	51%	86%
Character dialogue (Italian) in a story 0-shot Language	91%	90%	88%	87%	84%	88%
Character dialogue (Spanish) in a story 0-shot Language	86%	84%	82%	81%	80%	83%
Section score: 85.67%

N-Length Sentences

Write sentences with exactly N words each.
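
A check for this kind of task can be sketched as follows. The sentence splitting (on `.`, `!`, `?`) and word counting (whitespace-separated tokens) are assumptions; the benchmark's own tokenization rules are not shown on this page, and both function names are hypothetical.

```python
import re


def sentence_word_counts(text):
    """Split on sentence-ending punctuation and count words per sentence."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [len(p.split()) for p in parts if p.strip()]


def fraction_with_n_words(text, n):
    """Share of sentences that contain exactly n words."""
    counts = sentence_word_counts(text)
    if not counts:
        return 0.0
    return sum(1 for c in counts if c == n) / len(counts)


sample = "The cat sat very still. Rain fell on the roof. Nobody came home."
print(sentence_word_counts(sample))  # [5, 5, 3]
```

With this sample, two of the three sentences meet a five-word target, so `fraction_with_n_words(sample, 5)` gives two thirds.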

Scenario	Run 1	Run 2	Run 3	Run 4	Run 5	Run 6	Run 7	Run 8	Run 9	Run 10	Total
Write sentences with 5 words each 0-shot Rule following	100%	100%	100%	100%	100%	100%	100%	100%	100%	97%	100%
Write sentences with 10 words each 0-shot Rule following	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%
Write sentences with 20 words each 0-shot Rule following	100%	100%	100%	100%	100%	100%	100%	100%	100%	95%	100%
Section score: 99.73%

Novel outline

Answer questions about the outline of a novel supplied in various formats.

Scenario	Run 1	Run 2	Run 3	Run 4	Run 5	Run 6	Run 7	Run 8	Run 9	Run 10	Total
outline-count
Count acts 0-shot Tooling, Utility	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%
Count chapters 0-shot Tooling, Utility	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%
Count scenes 0-shot Tooling, Utility	100%	100%	100%	100%	100%	100%	100%	0%	0%	0%	70%
pov-count
Count point of views for Jack and Olivia 0-shot Tooling, Utility	100%	100%	0%	0%	0%	0%	0%	0%	0%	0%	20%
Count point of views for Jack Harper 0-shot Tooling, Utility	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%
Count point of views for Olivia 0-shot Tooling, Utility	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%
Section score: 81.67%

Tool usage within Novelcrafter

Output messages related to tool usage within Novelcrafter.

Scenario	Run 1	Run 2	Run 3	Run 4	Run 5	Run 6	Run 7	Run 8	Run 9	Run 10	Total
Create alternate prose sections 0-shot Tooling, Utility	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%
Section score: 100.00%

Voice/dialogue sheets

Extract dialogue from given text as voice sheets.

Scenario	Run 1	Run 2	Run 3	Run 4	Run 5	Run 6	Run 7	Run 8	Run 9	Run 10	Total
Multiple speakers 0-shot Utility	100%	0%	0%	0%	0%	0%	0%	0%	0%	0%	10%
Simple 0-shot Utility	100%	100%	100%	100%	0%	0%	0%	0%	0%	0%	40%
Simple (1-shot) 1-shot Utility	100%	100%	100%	100%	100%	100%	100%	100%	0%	0%	80%
Simple (5-shot) Few-shot Utility	100%	100%	100%	100%	100%	100%	100%	100%	100%	0%	90%
Unattributed dialogue 0-shot Utility	100%	100%	100%	100%	100%	0%	0%	0%	0%	0%	50%
Section score: 54.00%

Write N of X

Write exactly N words/sentences/paragraphs...

Scenario	Run 1	Run 2	Run 3	Run 4	Run 5	Run 6	Run 7	Run 8	Run 9	Run 10	Total
paragraphs
1 paragraph summary 0-shot Rule following	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%
3 paragraph summary 0-shot Rule following	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%
5 paragraph summary 0-shot Rule following	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%
sentences
1 sentence summary 0-shot Rule following	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%
3 sentence summary 0-shot Rule following	100%	100%	100%	100%	100%	100%	100%	100%	98%	98%	100%
10 sentence summary 0-shot Rule following	100%	100%	100%	100%	100%	100%	100%	100%	100%	98%	100%
20 sentence summary 0-shot Rule following	100%	100%	100%	100%	100%	100%	98%	98%	98%	98%	99%
50 sentence summary 0-shot Rule following	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%
words
10 word summary 0-shot Rule following	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%
20 word summary 0-shot Rule following	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%	100%
50 word summary 0-shot Rule following	100%	100%	100%	100%	100%	100%	100%	100%	98%	98%	100%
100 word summary 0-shot Rule following	100%	100%	100%	98%	98%	98%	98%	98%	98%	92%	98%
200 word summary 0-shot Rule following	100%	100%	98%	98%	98%	98%	92%	92%	77%	9%	86%
Section score: 98.70%