NC Bench
Run: 2026-02-25-01KJ95M36HD7E3VYR9NSQVXYZ7
Run Details
Model: Gemini 2.5 Flash (Reasoning)
Test: Bad Writing Habits
Scenario: Mystery: examining a crime scene
Duration: 16.6s
Cost: $0.008319
Provider: Google AI Studio
Evaluation summary
Score | Evaluator
100.00% | Adverbs in dialogue tags
88.02% | AI-ism adverb frequency
100.00% | AI-ism character names
100.00% | AI-ism location names
32.11% | AI-ism word frequency
100.00% | Cliché density
100.00% | Emotion telling (show vs. tell)
100.00% | Filter word density
100.00% | Gibberish response detection
100.00% | Markdown formatting overuse
100.00% | Missing dialogue indicators (quotation marks)
66.67% | Name drop frequency
70.63% | Narrator intent-glossing
100.00% | "Not X but Y" pattern overuse
100.00% | Overuse of "that" (subordinate clause padding)
100.00% | Paragraph length variance
96.91% | Passive voice overuse
100.00% | Past progressive (was/were + -ing) overuse
100.00% | Em-dash & semicolon overuse
94.00% | Purple prose (modifier overload)
100.00% | Repeated phrase echo
100.00% | Sentence length variance
62.93% | Sentence opener variety
0.00% | Adverb-first sentence starts
83.29% | Pronoun-first sentence starts
48.61% | Subject-first sentence starts
0.00% | Subordinate conjunction sentence starts
100.00% | Technical jargon density
93.75% | Useless dialogue additions
100.00% | Dialogue tag variety (said vs. fancy)
84.5637% | Overall
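The overall figure appears consistent with an unweighted mean of the 30 evaluator scores listed in the summary. The page does not state the aggregation rule, so the check below is an assumption, not the benchmark's documented method. The small residual is in line with each displayed score having been rounded to two decimals.

```python
from statistics import fmean

# Evaluator scores as displayed in the summary, in page order (percent).
scores = [
    100.00, 88.02, 100.00, 100.00, 32.11, 100.00, 100.00, 100.00, 100.00, 100.00,
    100.00, 66.67, 70.63, 100.00, 100.00, 100.00, 96.91, 100.00, 100.00, 94.00,
    100.00, 100.00, 62.93, 0.00, 83.29, 48.61, 0.00, 100.00, 93.75, 100.00,
]

# Assumption: the overall score is a plain average with equal weights.
overall_estimate = fmean(scores)
reported_overall = 84.5637

print(f"estimated {overall_estimate:.4f} vs reported {reported_overall}")
```

If the benchmark weighted evaluators differently, the estimate would drift further from the reported value than it does here.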
Model input
System
You are an expert fiction writer. Always keep the following rules in mind: - Write in past tense and use General English spelling, grammar, and colloquialisms/slang. - Write in active voice - Always follow the "show, don't tell" principle. - Avoid adverbs and cliches and overused/commonly used phrases. Aim for fresh and original descriptions. - Convey events and story through dialogue. - Mix short, punchy sentences with long, descriptive ones. Drop fill words to add variety. - Skip "he/she said said" dialogue tags and convey people's actions or face expressions through their speech - Avoid mushy dialog and descriptions, have dialogue always continue the action, never stall or add unnecessary fluff. Vary the descriptions to not repeat yourself. - Put dialogue on its own paragraph to separate scene and action. - Reduce indicators of uncertainty like "trying" or "maybe" When writing text: - NEVER conclude the scene on your own, follow the beat instructions very closely. - NEVER end with foreshadowing. - NEVER write further than what I prompt you with. - AVOID imagining possible endings, NEVER deviate from the instructions. - STOP EARLY if the continuation contains what was required in the instructions. You do not need to fill out the full amount of words possible.
User/Human
Take into account the following glossary of characters/locations/items/lore... when writing your response: <codex> <character id="4416" name="Detective Harlow Quinn" aliases="Quinn, Detective Quinn, Harlow" pronouns="she/her" species="Human" storyRole="Antagonist"> <eyeColor>brown</eyeColor> <hair>closely cropped salt-and-pepper</hair> <height>5'9"</height> <age>41</age> <jawline>sharp jaw</jawline> <watch>worn leather watch on her left wrist</watch> <bearing>military precision</bearing> <description>A Metropolitan Police detective with 18 years of decorated service who suspects the clique is involved in criminal activity.</description> <backstory>She lost her partner, DS Morris, during a case three years ago under unexplained circumstances that had supernatural origins she doesn't yet understand.</backstory> </character> <character id="4414" name="Eva Kowalski" aliases="Eva, Evie" pronouns="she/her" species="Human" storyRole="Best Friend"> <eyeColor>green</eyeColor> <hair>curly red</hair> <height>5'4"</height> <age>26</age> <glasses>round glasses</glasses> <satchel>worn leather satchel full of books</satchel> <freckles>freckled complexion</freckles> <nervousHabit>tucking hair behind her left ear</nervousHabit> <description>Aurora's childhood friend and an occult researcher.</description> <backstory>Moved to London two years before Aurora and works as a research assistant at the British Museum's restricted archives. 
She holds a master's degree in Ancient History from Oxford.</backstory> </character> <location id="856" name="The Veil Market" aliases="Veil Market, the Market, underground market" realm="Earth"> <currentLocation>abandoned Tube station beneath Camden</currentLocation> <entryRequirement>bone token</entryRequirement> <schedule>moves locations every full moon</schedule> <description>A hidden supernatural black market that sells enchanted goods, banned alchemical substances, and information.</description> </location> <item id="2004" name="Veil Compass" aliases="the Compass, shadow compass" type="Tool" origin="The Veil Market"> <casingMaterial>small brass compass</casingMaterial> <faceMarkings>face etched with protective sigils</faceMarkings> <needle>points toward the nearest supernatural rift or portal</needle> <patina>casing has a patina of verdigris</patina> <creator>crafted by a Shade artisan</creator> <description>A compass attuned to supernatural energy.</description> </item> </codex> Continue the story using the following instructions: <instructions> Write a fully fleshed out scene of at least 1000 words in the mystery genre. A detective arrives at a crime scene in an unusual location. Something is wrong — the evidence doesn't add up. A colleague is present, offering a different interpretation. Through careful observation and deduction, the detective begins to see what others have missed. Use the codex entries above as your world reference for characters and setting. Only include the prose of the scene without any markdown or commentary. </instructions>
Model response
Messages
Meta
AI
The Tube’s low rumble vibrated through the cracked concrete beneath Harlow’s boots, a phantom tremor from the active lines above. She navigated the gloom, her torch beam cutting through the dust-mote-thick air, past a skeletal turnstile, then down a derelict escalator coated in a century of grime. The air hung heavy with the scent of damp earth, metallic decay, and something acrid, almost floral, that clung to the back of her throat.
“Quinn, you made it.”
DS Miller stood beside yellow crime scene tape, his broad shoulders filling the narrow tunnel. His breath plumed in the cold, his face tight. Beyond him, the platform stretched into darkness, its advertising panels long gone, exposing pitted brickwork. A single portable floodlight cast harsh shadows, illuminating a macabre tableau on the tracks.
“Give me a briefing, Miller,” Harlow’s voice, a low rasp, echoed back. She stepped under the tape, her worn leather watch glinting on her left wrist.
“Vic’s name is Alistair Vance, known for petty theft, low-level dealing. Found him face down, wedged between the third rail and the platform edge.” Miller gestured towards the body. “Looks like a street fight, probably went south. Stab wounds, blunt force trauma to the head. Standard fare for this part of town, especially in these… tourist traps.” He kicked at a discarded, intricately carved bone.
Harlow didn't acknowledge the kick. Her gaze swept the expanse, taking in the scene. The chill in this abandoned station felt deeper than the London winter. Makeshift stalls, their canvas flaps ripped, slumped against the tunnel walls. Scattered rubbish – unusual trinkets, dried herbs, iridescent powders – spoke of a recent, chaotic exodus. This was the Veil Market, all right. Another full moon, another hidden bazaar. And now, a body.
She moved towards Vance, her gait precise, a military kind of rhythm. A uniformed officer knelt by the body, carefully bagging evidence. The floodlight illuminated Vance’s still form, his long, dark coat torn, revealing a tattered shirt underneath. Dark, congealed blood stained the grey ballast. Harlow knelt, ignoring the damp chill seeping through her trousers.
“Hands, Miller,” Harlow said, her eyes fixed on the body.
“No defensive wounds, if that's what you're asking,” Miller offered, reading her mind. “But he’s been in fights before. Vance usually ran, or he took a beating. He wasn’t one for putting up much of a struggle.”
Harlow traced a line with her gloved finger along the track bed. Her brown eyes narrowed. A faint, almost imperceptible shimmer coated the rail beside Vance. She angled her head, catching the light.
“What’s the initial cause of death?”
“Multiple stab wounds to the torso, sharp object. Possibly a knife. ME thinks he bled out fast. The head trauma looks post-mortem, or happened concurrently.” Miller paused. “Looks like he didn’t know what hit him. Or who.”
She stood, sweeping her gaze along the immediate area around Vance. The ballast around the body appeared pristine. No displaced gravel. No scuff marks from heavy boots. Not a single stone had shifted. She walked around the body, her focus minute, searching for scuffs, for signs of struggle beyond the wounds on Vance’s body. There were none. Vance might have taken a beating, but something was wrong here. The scene told a story of violence, yet lacked its usual punctuation.
“The ME is pretty clear on the weapon. A knife, a blade of some sort,” Miller reiterated, rubbing his hands together. “We're looking for a common street blade, probably.”
Harlow approached the wall directly opposite the body. Near eye level, where a struggle might have left a mark, the brickwork bore three parallel gashes. They ran vertically, deep gouges in the aged mortar, as if something with powerful claws scraped against it. But the edges, under her careful inspection, were too clean. Too sharp. Not jagged or splintered like stone against metal or bone.
“These marks,” Harlow pointed with her chin. “What do you make of these?”
Miller peered closer, tilting his head. “Probably from Vance getting pushed into the wall, maybe a buckle on his jacket. Or the attacker, trying to get leverage.” He shrugged, dismissing it. “It’s a rough-and-tumble spot. Nothing unusual.”
She ran a finger over the deepest groove. The edges felt polished. She smelled it, a faint trace of that acrid-floral scent that still lingered. She moved back toward Vance, her eyes tracking an invisible path. Vance’s fingers were splayed, resting lightly on the gravel, palms up. His coat was ripped, true, but the tears looked almost surgical, clean-edged, not ragged from a sustained fight. There was barely any debris caught in the fabric.
“No evidence of a robbery?”
“Wallet was still on him, small amount of cash. Watch, cheap one, still on his wrist. Nothing valuable missing. Unless they took something less… tangible.” Miller’s voice held a trace of sarcasm, a subtle jab at the weirdness of the former market.
“No, they took something tangible,” Harlow mused, her gaze fixed on Vance’s left hand. His index finger bore a single, thin band of white. Not skin, but something skeletal. A bone token. It was common currency among Veil Market patrons, an admission pass, a ward against certain predatory elements. But this token seemed different. It was fractured, splintered precisely down the middle, as if snapped by immense, focused force.
She remembered DS Morris, the unexplainable circumstances of his death. The way the scene hadn’t added up, not with conventional explanations. The way her gut had screamed something else. Here, that same unease bloomed.
“Miller, look at this,” Harlow called out, kneeling again by Vance’s hand. She pointed to the fractured bone token. “The split. It’s too clean. And the hand itself. He wasn’t clutching it. It’s just… there.”
Miller knelt beside her, squinting. “Bone token. Means he was here for the market. Someone probably grabbed his hand, twisted for a quick robbery, splintered the token by accident.”
“He has no defensive wounds. No bruises on his wrists or forearms from a grapple. His hands aren't clenched. There are no skin abrasions or tears around the token. And if someone grabbed his hand, how does the token break so cleanly down the middle, lengthwise, without damaging the finger beneath or twisting his hand?” Harlow’s voice remained level, but the question hung sharp in the air.
She picked up a small, glittering shard of what looked like obsidian, a few inches from Vance’s head. It reflected the floodlight with an unnatural gleam. Not rock. Not glass. It felt unnervingly light in her gloved hand. She remembered the acrid-floral smell, stronger now close to the body, almost like burnt sugar.
“This,” Harlow said, holding out the shard. “Did CSU bag this?”
Miller shrugged. “Probably. Looks like debris from the market. Could be anything.”
“No,” Harlow contradicted, her gaze moving from the shard to the deep, precise gashes on Vance’s coat. “Look at the tears in his jacket. And then look at the gashes on the wall. The depth, the width, the edges.” She held the shard up. “This isn’t a mugging, Miller. This wasn’t a street fight that got out of hand. These wounds, this broken token, these marks, all tell a different story.”
She put the shard back down, her fingers brushing the cold, damp ballast. “Vance wasn’t killed with a knife. He was killed by something with claws. Something that could exert immense, controlled pressure. Something that broke bone cleanly, without a struggle. Something that left no scuff marks, no displaced gravel, because it didn’t fight him. It just… tore him apart.”
created_at
"2026-02-25T01:11:41.017Z"
2/25/2026, 1:11:41 AM
model
"google/gemini-2.5-flash"
app_id
182717
external_user
(null)
streamed
true
cancelled
false
latency
1267
moderation_latency
(null)
generation_time
16547
tokens_prompt
1149
tokens_completion
2969
native_tokens_prompt
1073
native_tokens_completion
3199
native_tokens_completion_images
0
native_tokens_reasoning
1472
native_tokens_cached
0
num_media_prompt
(null)
num_input_audio_prompt
(null)
num_media_completion
0
num_search_results
(null)
origin
"https://ncbench.com/"
is_byok
false
finish_reason
"stop"
native_finish_reason
"STOP"
usage
0.0083194
router
(null)
provider_responses
0
endpoint_id
"16c01e26-9b91-4d22-b487-ed35634b017d"
is_byok
false
latency
1267
model_permaslug
"google/gemini-2.5-flash"
provider_name
"Google AI Studio"
status
200
api_type
"completions"
id
"gen-1771981901-ezrnAfQ9oPucFHKCfiNM"
upstream_id
(null)
total_cost
0.0083194
cache_discount
(null)
upstream_inference_cost
0
provider_name
"Google AI Studio"
Evaluation details
Result
Evaluator
Details
Meta Data
100.00%
Adverbs in dialogue tags
Target: ≤10% dialogue tags with adverbs
totalTags
16
adverbTagCount
1
adverbTags
0
"a low rasp echoed back [back]"
dialogueSentences
30
tagDensity
0.533
leniency
1
rawRatio
0.063
effectiveRatio
0.063
88.02%
AI-ism adverb frequency
Target: <2% AI-ism adverbs (58 tracked)
wordCount
1252
totalAiIsmAdverbs
3
found
0
adverb
"carefully"
count
1
1
adverb
"lightly"
count
1
2
adverb
"precisely"
count
1
highlights
0
"carefully"
1
"lightly"
2
"precisely"
100.00%
AI-ism character names
Target: 0 AI-default names (17 tracked, −20% each)
codexExemptions
(empty)
found
(empty)
100.00%
AI-ism location names
Target: 0 AI-default location names (33 tracked, −20% each)
codexExemptions
(empty)
found
(empty)
32.11%
AI-ism word frequency
Target: <2% AI-ism words (290 tracked)
wordCount
1252
totalAiIsms
17
found
0
word
"vibrated"
count
1
1
word
"navigated"
count
1
2
word
"gloom"
count
1
3
word
"macabre"
count
1
4
word
"echoed"
count
1
5
word
"glinting"
count
1
6
word
"standard"
count
1
7
word
"chill"
count
2
8
word
"chaotic"
count
1
9
word
"traced"
count
1
10
word
"pristine"
count
1
11
word
"tangible"
count
2
12
word
"fractured"
count
2
13
word
"grapple"
count
1
highlights
0
"vibrated"
1
"navigated"
2
"gloom"
3
"macabre"
4
"echoed"
5
"glinting"
6
"standard"
7
"chill"
8
"chaotic"
9
"traced"
10
"pristine"
11
"tangible"
12
"fractured"
13
"grapple"
100.00%
Cliché density
Target: ≤1 cliche(s) per 800-word window
totalCliches
1
maxInWindow
1
found
0
label
"eyes widened/narrowed"
count
1
highlights
0
"eyes narrowed"
100.00%
Emotion telling (show vs. tell)
Target: ≤3% sentences with emotion telling
emotionTells
0
narrationSentences
84
matches
(empty)
100.00%
Filter word density
Target: ≤3% sentences with filter/hedge words
filterCount
1
hedgeCount
0
narrationSentences
84
filterMatches
0
"watch"
hedgeMatches
(empty)
100.00%
Gibberish response detection
Target: ≤1% gibberish-like sentences (hard fail if a sentence exceeds 800 words)
analyzedSentences
98
gibberishSentences
0
adjustedGibberishSentences
0
longSentenceCount
0
runOnParagraphCount
0
giantParagraphCount
0
wordSaladCount
0
repetitionLoopCount
0
controlTokenCount
0
maxSentenceWordsSeen
67
ratio
0
matches
(empty)
100.00%
Markdown formatting overuse
Target: ≤5% words in markdown formatting
markdownSpans
0
markdownWords
0
totalWords
1254
ratio
0
matches
(empty)
100.00%
Missing dialogue indicators (quotation marks)
Target: ≤10% speech attributions without quotation marks
totalAttributions
7
unquotedAttributions
0
matches
(empty)
66.67%
Name drop frequency
Target: ≤1.0 per-name mentions per 100 words
totalMentions
41
wordCount
814
uniqueNames
8
maxNameDensity
1.6
worstName
"Harlow"
maxWindowNameDensity
3
worstWindowName
"Vance"
discoveredNames
Tube
1
Harlow
13
Miller
9
London
1
Veil
2
Market
2
Vance
12
Morris
1
persons
0
"Tube"
1
"Harlow"
2
"Miller"
3
"Vance"
4
"Morris"
places
0
"London"
globalScore
0.701
windowScore
0.667
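The reported maxNameDensity of 1.6 for "Harlow" can be reproduced as mentions per 100 words, using the person counts and the 814-word count shown for this evaluator. This is a rough sketch under that reading. How the evaluator derives globalScore and windowScore is not shown on the page and is not modelled here.

```python
# Mention counts for the names the evaluator classified as persons.
mentions = {"Tube": 1, "Harlow": 13, "Miller": 9, "Vance": 12, "Morris": 1}
word_count = 814  # wordCount reported for this evaluator

# Assumption: density means mentions per 100 words over the whole text.
density = {name: count / word_count * 100 for name, count in mentions.items()}
worst_name = max(density, key=density.get)

print(worst_name, f"{density[worst_name]:.1f}")
```

Under this reading both "Harlow" and "Vance" exceed the 1.0 target, which fits the reduced score for this evaluator.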
70.63%
Narrator intent-glossing
Target: ≤2% narration sentences with intent-glossing patterns
analyzedSentences
63
glossingSentenceCount
2
matches
0
"appeared pristine"
1
"looked like obsidian, a few inches from V"
100.00%
"Not X but Y" pattern overuse
Target: ≤1 "not X but Y" per 1000 words
totalMatches
1
per1kWords
0.797
wordCount
1254
matches
0
"Not skin, but something skeletal"
100.00%
Overuse of "that" (subordinate clause padding)
Target: ≤2% sentences with "that" clauses
thatCount
0
totalSentences
98
matches
(empty)
100.00%
Paragraph length variance
Target: CV ≥0.5 for paragraph word counts
totalParagraphs
30
mean
41.8
std
23.69
cv
0.567
sampleLengths
0
72
1
4
2
53
3
26
4
65
5
70
6
55
7
10
8
37
9
33
10
6
11
37
12
80
13
29
14
65
15
13
16
37
17
74
18
5
19
42
20
69
21
34
22
35
23
29
24
67
25
53
26
11
27
12
28
71
29
60
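The mean, std, and cv values above can be recomputed from the 30 listed paragraph lengths. The sketch below assumes the coefficient of variation is the population standard deviation divided by the mean. That choice is inferred from the numbers (it reproduces the reported 23.69), not stated on the page.

```python
from statistics import fmean, pstdev

# Paragraph word counts copied from sampleLengths, in order.
paragraph_lengths = [
    72, 4, 53, 26, 65, 70, 55, 10, 37, 33,
    6, 37, 80, 29, 65, 13, 37, 74, 5, 42,
    69, 34, 35, 29, 67, 53, 11, 12, 71, 60,
]

mean = fmean(paragraph_lengths)
std = pstdev(paragraph_lengths)  # population std (divides by n, not n - 1)
cv = std / mean                  # coefficient of variation

meets_target = cv >= 0.5  # target stated for this evaluator
print(f"mean={mean:.1f} std={std:.2f} cv={cv:.3f} ok={meets_target}")
```

A sample standard deviation (dividing by n - 1) would give a slightly larger std, so the population form is the better match for the reported figures.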
96.91%
Passive voice overuse
Target: ≤2% passive sentences
passiveCount
2
totalSentences
84
matches
0
"were splayed"
1
"was ripped"
100.00%
Past progressive (was/were + -ing) overuse
Target: ≤2% past progressive verbs
pastProgressiveCount
0
totalVerbs
133
matches
(empty)
100.00%
Em-dash & semicolon overuse
Target: ≤2% sentences with em-dashes/semicolons
emDashCount
2
semicolonCount
0
flaggedSentences
1
totalSentences
98
ratio
0.01
matches
0
"Scattered rubbish – unusual trinkets, dried herbs, iridescent powders – spoke of a recent, chaotic exodus."
94.00%
Purple prose (modifier overload)
Target: <4% adverbs, <2% -ly adverbs, no adj stacking
wordCount
817
adjectiveStacks
1
stackExamples
0
"against certain predatory elements."
adverbCount
23
adverbRatio
0.028151774785801713
lyAdverbCount
8
lyAdverbRatio
0.009791921664626682
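Both ratios above follow directly from the reported counts divided by the 817-word count. The short check below recomputes them and compares them with the two numeric targets. The variable names are illustrative, and the penalty applied for adjective stacking is not modelled because the page does not describe it.

```python
word_count = 817
adverb_count = 23
ly_adverb_count = 8

adverb_ratio = adverb_count / word_count
ly_adverb_ratio = ly_adverb_count / word_count

# Targets stated for this evaluator: <4% adverbs and <2% -ly adverbs.
within_ratio_targets = adverb_ratio < 0.04 and ly_adverb_ratio < 0.02

print(f"{adverb_ratio:.4f} {ly_adverb_ratio:.4f} {within_ratio_targets}")
```

Since both ratios sit inside their targets, the score below 100% presumably comes from the single adjective stack, though that is an inference.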
100.00%
Repeated phrase echo
Target: ≤20% sentences with echoes (window: 2)
totalSentences
98
echoCount
0
echoWords
(empty)
100.00%
Sentence length variance
Target: CV ≥0.4 for sentence word counts
totalSentences
98
mean
12.8
std
10.02
cv
0.783
sampleLengths
0
20
1
27
2
25
3
4
4
15
5
9
6
15
7
14
8
12
9
14
10
29
11
36
12
5
13
9
14
12
15
11
16
16
17
7
18
6
19
4
20
12
21
10
22
16
23
7
24
10
25
10
26
13
27
24
28
12
29
4
30
10
31
7
32
6
33
27
34
10
35
11
36
7
37
3
38
6
39
6
40
21
41
3
42
11
43
12
44
21
45
8
46
8
47
17
48
18
49
10
62.93%
Sentence opener variety
Target: ≥60% unique sentence openers
consecutiveRepeats
5
diversityRatio
0.40816326530612246
totalSentences
98
uniqueOpeners
40
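The diversityRatio above equals uniqueOpeners divided by totalSentences (40 / 98). The sketch below shows one way such a ratio could be computed. Treating the lowercased first word as the "opener" is an assumption, since the page does not say how the evaluator defines it, and `opener_diversity` is a hypothetical helper name.

```python
def opener_diversity(sentences: list[str]) -> float:
    """Share of distinct first words among the given sentences."""
    openers = [s.split()[0].lower() for s in sentences if s.split()]
    if not openers:
        return 0.0
    return len(set(openers)) / len(openers)

# Four sentences taken from the response; three distinct openers.
sample = ["She stood.", "She walked around the body.", "There were none.", "Not rock."]
sample_ratio = opener_diversity(sample)

# Recomputing the reported ratio from the two reported counts.
reported_ratio = 40 / 98
meets_target = reported_ratio >= 0.60  # target stated for this evaluator

print(sample_ratio, f"{reported_ratio:.4f}", meets_target)
```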
0.00%
Adverb-first sentence starts
Target: ≥3% sentences starting with an adverb
adverbCount
0
totalSentences
79
matches
(empty)
ratio
0
83.29%
Pronoun-first sentence starts
Target: ≤30% sentences starting with a pronoun
pronounCount
27
totalSentences
79
matches
0
"She navigated the gloom, her"
1
"His breath plumed in the"
2
"She stepped under the tape,"
3
"He kicked at a discarded,"
4
"Her gaze swept the expanse,"
5
"She moved towards Vance, her"
6
"Her brown eyes narrowed."
7
"She angled her head, catching"
8
"She stood, sweeping her gaze"
9
"She walked around the body,"
10
"They ran vertically, deep gouges"
11
"He shrugged, dismissing it"
12
"She ran a finger over"
13
"She smelled it, a faint"
14
"She moved back toward Vance,"
15
"His coat was ripped, true,"
16
"His index finger bore a"
17
"It was common currency among"
18
"It was fractured, splintered precisely"
19
"She remembered DS Morris, the"
ratio
0.342
48.61%
Subject-first sentence starts
Target: ≤72% sentences starting with a subject
subjectCount
65
totalSentences
79
matches
0
"The Tube’s low rumble vibrated"
1
"She navigated the gloom, her"
2
"The air hung heavy with"
3
"DS Miller stood beside yellow"
4
"His breath plumed in the"
5
"A single portable floodlight cast"
6
"Harlow’s voice, a low rasp,"
7
"She stepped under the tape,"
8
"Miller gestured towards the body"
9
"He kicked at a discarded,"
10
"Harlow didn't acknowledge the kick."
11
"Her gaze swept the expanse,"
12
"The chill in this abandoned"
13
"This was the Veil Market,"
14
"Another full moon, another hidden"
15
"She moved towards Vance, her"
16
"A uniformed officer knelt by"
17
"The floodlight illuminated Vance’s still"
18
"Dark, congealed blood stained the"
19
"Harlow knelt, ignoring the damp"
ratio
0.823
0.00%
Subordinate conjunction sentence starts
Target: ≥2% sentences starting with a subordinating conjunction
subConjCount
0
totalSentences
79
matches
(empty)
ratio
0
100.00%
Technical jargon density
Target: ≤6% sentences with technical-jargon patterns
analyzedSentences
36
technicalSentenceCount
2
matches
0
"They ran vertically, deep gouges in the aged mortar, as if something with powerful claws scraped against it."
1
"It was fractured, splintered precisely down the middle, as if snapped by immense, focused force."
93.75%
Useless dialogue additions
Target: ≤5% dialogue tags with trailing filler fragments
totalTags
16
uselessAdditionCount
1
matches
0
"Harlow’s voice remained, but the question hung sharp in the air"
100.00%
Dialogue tag variety (said vs. fancy)
Target: ≤10% fancy dialogue tags
totalTags
3
fancyCount
1
fancyTags
0
"Harlow called out (call out)"
dialogueSentences
30
tagDensity
0.1
leniency
0.2
rawRatio
0.333
effectiveRatio
0.067
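The figures reported here, and those reported for the adverbs-in-dialogue-tags evaluator, are consistent with tagDensity = totalTags / dialogueSentences, rawRatio = flagged tags / totalTags, and effectiveRatio = rawRatio × leniency. The sketch below checks that reading against both records. How leniency itself is derived is not shown on the page, so it is passed in as given, and `tag_ratios` is a hypothetical helper.

```python
def tag_ratios(total_tags: int, flagged_tags: int,
               dialogue_sentences: int, leniency: float) -> tuple[float, float, float]:
    """Return (tag_density, raw_ratio, effective_ratio) for one evaluator."""
    tag_density = total_tags / dialogue_sentences
    raw_ratio = flagged_tags / total_tags
    effective_ratio = raw_ratio * leniency  # assumed scaling, inferred from the data
    return tag_density, raw_ratio, effective_ratio

# Dialogue tag variety: 3 tags, 1 fancy, 30 dialogue sentences, leniency 0.2.
fancy = tag_ratios(3, 1, 30, 0.2)
# Adverbs in dialogue tags: 16 tags, 1 with an adverb, leniency 1.
adverb = tag_ratios(16, 1, 30, 1.0)

print([f"{v:.3f}" for v in fancy], [f"{v:.3f}" for v in adverb])
```

With leniency 0.2 the fancy-tag ratio falls from about 0.333 to about 0.067, which is under the 10% target and fits the 100% score.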
84.5637%