NC Bench
Run: 2026-02-17-01KHMRBF8Z8TSR39SNJFRFWHZG
Run Details
Model: GPT-4.1
Test: Bad Writing Habits
Scenario: Mystery: examining a crime scene
Duration: 46.8s
Cost: $0.017624
Provider: OpenAI
Model input
System
You are an expert novelist with a gift for vivid, immersive prose. You write in tight third-person limited point of view with a strong narrative voice. Your prose is clean, precise, and engaging — you favor strong verbs, concrete details, and natural-sounding dialogue. You will be given a story bible (codex) containing canonical details about characters, locations, items, and lore. Use these details to ground your writing in the established world. Stay consistent with the codex. Write only the prose of the scene. Do not include titles, chapter headings, markdown formatting, or any commentary before or after the scene.
User/Human
<codex> <character id="4416" name="Detective Harlow Quinn" aliases="Quinn, Detective Quinn, Harlow" pronouns="she/her" species="Human" storyRole="Antagonist"> <eyeColor>brown</eyeColor> <hair>closely cropped salt-and-pepper</hair> <height>5'9"</height> <age>41</age> <jawline>sharp jaw</jawline> <watch>worn leather watch on her left wrist</watch> <bearing>military precision</bearing> <description>A Metropolitan Police detective with 18 years of decorated service who suspects the clique is involved in criminal activity.</description> <backstory>She lost her partner, DS Morris, during a case three years ago under unexplained circumstances that had supernatural origins she doesn't yet understand.</backstory> </character> <character id="4414" name="Eva Kowalski" aliases="Eva, Evie" pronouns="she/her" species="Human" storyRole="Best Friend"> <eyeColor>green</eyeColor> <hair>curly red</hair> <height>5'4"</height> <age>26</age> <glasses>round glasses</glasses> <satchel>worn leather satchel full of books</satchel> <freckles>freckled complexion</freckles> <nervousHabit>tucking hair behind her left ear</nervousHabit> <description>Aurora's childhood friend and an occult researcher.</description> <backstory>Moved to London two years before Aurora and works as a research assistant at the British Museum's restricted archives. 
She holds a master's degree in Ancient History from Oxford.</backstory> </character> <location id="856" name="The Veil Market" aliases="Veil Market, the Market, underground market" realm="Earth"> <currentLocation>abandoned Tube station beneath Camden</currentLocation> <entryRequirement>bone token</entryRequirement> <schedule>moves locations every full moon</schedule> <description>A hidden supernatural black market that sells enchanted goods, banned alchemical substances, and information.</description> </location> <item id="2004" name="Veil Compass" aliases="the Compass, shadow compass" type="Tool" origin="The Veil Market"> <casingMaterial>small brass compass</casingMaterial> <faceMarkings>face etched with protective sigils</faceMarkings> <needle>points toward the nearest supernatural rift or portal</needle> <patina>casing has a patina of verdigris</patina> <creator>crafted by a Shade artisan</creator> <description>A compass attuned to supernatural energy.</description> </item> </codex> Write a fully fleshed out scene of at least 1000 words in the mystery genre. A detective arrives at a crime scene in an unusual location. Something is wrong — the evidence doesn't add up. A colleague is present, offering a different interpretation. Through careful observation and deduction, the detective begins to see what others have missed. Use the codex entries above as your world reference for characters and setting. Only include the prose of the scene without any markdown or commentary.
Model response
AI
Detective Harlow Quinn ducked through the low arch of a rusted service tunnel, boot soles rasping over concrete wreathed in condensation. She paused and exhaled. Down here, beneath Camden, even London’s relentless drizzle couldn’t reach—the fog was something else, a chemical miasma the city tried to forget. Harlow’s sharp jaw set, her gaze cutting through the vapor. The air teemed with the wet metallic scent of old rails and something spicier—a trace of incense, oddly out of place.

“Down here, Quinn,” called Sergeant Patel, his voice bouncing down the tunnel. He stood in the pool of yellow light thrown by a portable lamp, hunched in a government-issue parka, notepad in one gloved hand. Even from twenty feet off, he carried the nervous energy of a man who’d seen too much weirdness. He beckoned her around a battered metal gate.

The chamber yawned open before her. The abandoned Tube station had been gutted decades past, stripped to concrete ribs and old advert boards curled like scorched petals. But tonight, those walls were changed—painted with sigils, delicate and curling, drawn in something that glimmered in the torch glare. The Veil Market, she thought; she’d heard whispers about this place in the shadowed corners of the city’s rumor mill. She’d never expected its entrance to surface in her jurisdiction.

Harlow reached for the comfort of her worn leather watch and glanced down. 2:34 a.m. She stepped forward, measuring the scene with eyes as appraising and dark as fresh-turned earth. Her every movement held the muscle memory of nineteen years in service—back straight, gaze steady, analyzing.

Patel spoke in a hush. “First train operator on the maintenance route smelled something funny. Found…that.” He pointed to the body sprawling across the platform.

It was a young woman, no older than her late twenties. Blonde, expensive trainers, torn tights. There was a chill over her, the stillness not just of death, but something expectant, as though the shadows lingered to watch. Her face was pale and peaceful, almost as if she’d merely lain down for a nap, but there was a thin line of blood beneath each nostril. Her hands were curled loosely at her sides, palms up, displaying strange silver dust clinging to the skin. Harlow crouched.

“What’d you make of this?” she asked, motioning Patel closer.

He shifted, squinting. “Looks like a bad reaction to something—drug deal gone wrong.” He gestured at a scattered vial near the woman’s feet. “Maybe a new street synth.”

Harlow grunted. She reached for a pen and, careful not to disturb the evidence, flipped the vial with its cap. The label was hand-stamped: ALK2. It was half-full of powder that shimmered—opalescent, unlikely. “You see this type of packaging before?” she asked.

Patel hesitated. “Could be a knockoff. There’s that pop-up chemist ring—we saw similar glass at that raid in Dalston.” But his words didn’t convince even himself.

Harlow surveyed the periphery. Beyond the body, the platform was divided by more of those sigils, some fresh, some half-scraped away. A narrow silver chain, etched with symbols, bisected the floor in a careful line—a perimeter, not decoration. On the far end, boxes had been stacked in makeshift booths: discarded bits of velvet, cracked trays, the detritus of commerce that shouldn’t exist above ground. Tiny bones scattered the floor—a child’s knucklebones, at a glance—lashed on strings.

She straightened, scanning the rails. “Security footage?” she asked.

Patel shook his head. “Not down here. Place’s been off the grid since seventy-six. No working CCTVs.”

“Figures.” She knelt again and lifted the woman’s wrist—gingerly, mindful of the chain. Noticed faint imprints on the skin, like pressing into damp sand—a bracelet, recently removed. Something heavier than jewelry. Harlow’s brows knit.

“Your theory, Quinn?” Patel prompted, scribbling in his pad. In his tight expression, she saw hope—he wanted a simple answer.

“Not drugs,” Harlow said. “Look at her face—there’s no sign of struggle, and these,” she gestured to the silvery powder and the faint marks on skin, “are ritual, not residue. Whoever did this knew what they were doing.” She pointed to the chain. “Containment. Like she was being kept inside—a cage.”

Patel squinted. “So…what, occult stuff? Some kind of cult gone wrong?”

Harlow nodded. “Or gone right, depending on their intentions.” She circled the body. From this new angle, the patterns on the tiles became clearer—protective wards, she realized, not invitations. A realization prickled down her spine.

Then she noticed it: a smear of verdigris green glinting under the platform’s edge—brass, untouched by the years, though this place had nothing left but rot. She nudged it with her gloved finger. A small brass compass, the casing etched with looping sigils, sat in the dust, heavy. The instrument was unfamiliar, the needle quivering even though no magnetic pull should reach down here. It twitched, always to the station’s far wall. Harlow’s breathing tightened.

Patel peered closer. “Compass? Odd place for it.”

Harlow didn’t answer. Her memory turned over details: the rumors, her partner’s disappearance, the cases gone cold for no reason that ever made sense. Supernatural, some whispered. She’d never believed in magic—but she’d learned to trust patterns. The compass, the chains, the circle—it all spoke of energies bottled beneath the city, old and angry.

“We need to sweep for more artifacts,” she said, handing Patel a sealed evidence bag. “And DNA swabs. But tell forensics to look for trace elements—silver, salt, minerals—not the usual narcotics kit.” She glanced up, eyes narrowing. “Contact the occult liaison at the Museum. Kowalski, if she’s on shift. Someone who knows what these marks mean.”

Patel’s face twitched at the name. “The redhead? Is she still doing side work for us?”

“She’s the best we’ve got.” That much, Quinn knew as gospel. “And tell the team: Only those with clearance enter from now on. No souvenirs. No risks.” Her tone had the clipped edge of a drill sergeant, and it brooked no questions.

Patel moved off, phone already in hand. The chamber receded into silence but for the soft hum of the tunnel’s living, breathing dark. Harlow’s thoughts sidestepped methodically—details before doubts. The sigils formed a language she couldn’t read, but she’d learned from experience: wards were drawn out of fear, not faith. Whatever haunted this market, someone had tried—desperately—to hold it at bay. And now, the lock had failed.

She crouched by the body once more, turning her attention to the young woman’s face. No obvious signs of violence, only those fragile, unexplained smears of blood, the silver dust. She studied the pupils: ever so faintly, not quite round. Elongated. Not human, not animal—unnatural. Anxiety crawled along her scalp.

Footsteps clicked from the direction of the tunnel mouth. Medical examiner, arriving late—sallow, out of breath. Harlow barely acknowledged him. She kept scanning the periphery as the man began his perfunctory notes.

There—something half-concealed under a pile of market flotsam. Harlow stepped over: a bone token, worn smooth, engraved with much the same symbols she’d seen on the chain. It fit neatly in her palm, light as air.

Why would a buyer or seller drop their entry token? No, she realized, this was left deliberately, a clue or a warning. Even in death, the body on the floor had been part of a transaction: the Market always took its price. Quinn’s mind tracked the logic, following the chain: containment, an artifact, an entry token, and a ritual meant to hold something other than a human.

Patel returned, Eva Kowalski reportedly en route. Harlow strode to the far wall, where the compass needle quivered most violently. The sigils here glowed faint, as if reflecting moonlight that never reached the city’s bowels. Fresh scrape marks marred the concrete, revealing a sliding panel—a concealed doorway never meant to be used by mortals. She ducked to examine the base. Scorch marks, black as pitch, radiated outward like tiny veins.

“Patel,” she called quietly. “Nobody touches this wall until I say so. And nobody stands closer than three feet. Understood?”

He nodded. “What exactly are you expecting to find, Quinn?”

Her expression was bleak. “Whatever was behind that door. And whatever made the Market run tonight, even after death.” Her voice was steel—measured, certain, refusing superstition but refusing denial all the same.

The silence returned, thick with the weight of a secret world. Harlow Quinn stood sentinel beneath Camden, her shadow sharp against the painted sigils, watching—always watching—for evidence nobody else would see, waiting for the pattern to resolve into a truth she could finally, mercilessly, name.
created_at: "2026-02-17T02:55:45.4457+00:00" (2/17/2026, 2:55:45 AM)
model: "openai/gpt-4.1-2025-04-14"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 286
moderation_latency: (null)
generation_time: 46737
tokens_prompt: 923
tokens_completion: 2346
native_tokens_prompt: 836
native_tokens_completion: 1994
native_tokens_completion_images: 0
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.017624
router: (null)
provider_responses:
  0:
    id: "resp_0df0c883f48343e1016993d88268c48196a3a747a42de2a5f4"
    status: 200
    is_byok: false
    latency: 117
    endpoint_id: "c235abe8-11cc-42d3-95ad-72f4d198287a"
    provider_name: "OpenAI"
    model_permaslug: "openai/gpt-4.1-2025-04-14"
    api_type: "completions"
id: "gen-1771296898-lcFCTSJVBQeYz8q965xR"
upstream_id: "resp_0df0c883f48343e1016993d88268c48196a3a747a42de2a5f4"
total_cost: 0.017624
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
Evaluation details
Result
Evaluator
Details
Meta Data
91.6667%
Adverbs in dialogue tags
Target: ≤10% dialogue tags with adverbs
totalTags: 8
adverbTagCount: 1
adverbTags:
  0: "she called quietly [quietly]"
86.2069%
AI-ism adverb frequency
Target: <1% AI-ism adverbs (57 tracked)
wordCount: 1450
totalAiIsmAdverbs: 2
found:
  0: adverb: "deliberately", count: 1
  1: adverb: "loosely", count: 1
highlights:
  0: "deliberately"
  1: "loosely"
80.0000%
AI-ism character names
Target: 0 AI-default names (17 tracked, −20% each)
codexExemptions: (empty)
found:
  0: "Patel"
100.0000%
AI-ism location names
Target: 0 AI-default location names (33 tracked, −20% each)
codexExemptions: (empty)
found: (empty)
31.0345%
AI-ism word frequency
Target: <1% AI-ism words (140 tracked)
wordCount: 1450
totalAiIsms: 10
found:
  0: word: "sentinel", count: 1
  1: word: "shimmered", count: 1
  2: word: "glinting", count: 1
  3: word: "quivered", count: 1
  4: word: "velvet", count: 1
  5: word: "magnetic", count: 1
  6: word: "weight", count: 1
  7: word: "footsteps", count: 1
  8: word: "etched", count: 2
highlights:
  0: "sentinel"
  1: "shimmered"
  2: "glinting"
  3: "quivered"
  4: "velvet"
  5: "magnetic"
  6: "weight"
  7: "footsteps"
  8: "etched"
100.0000%
Cliche density
Target: ≤1 cliche(s) per 800-word window
totalCliches: 0
maxInWindow: 0
found: (empty)
highlights: (empty)
100.0000%
Emotion telling (show vs. tell)
Target: ≤3% sentences with emotion telling
emotionTells: 0
narrationSentences: 109
matches: (empty)
100.0000%
Filter word density
Target: ≤12% sentences with filter/hedge words
filterCount: 1
hedgeCount: 3
narrationSentences: 109
filterMatches:
  0: "watch"
hedgeMatches:
  0: "tried to"
  1: "try to"
100.0000%
Overuse of "that" (subordinate clause padding)
Target: ≤10% sentences with "that" clauses
thatCount: 1
totalSentences: 118
matches:
  0: "got.” That much"
100.0000%
Paragraph length variance
Target: CV ≥0.5 for paragraph word counts
totalParagraphs: 34
mean: 41.5
std: 23.09
cv: 0.556
sampleLengths (0-33): [78, 61, 77, 46, 25, 85, 10, 28, 42, 26, 76, 9, 17, 34, 20, 51, 11, 35, 75, 8, 54, 56, 16, 42, 67, 50, 32, 36, 67, 70, 20, 10, 32, 45]
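The paragraph statistics above can be re-derived from the listed word counts. The sketch below is only an illustration, not the benchmark's own code: it assumes the evaluator uses the population standard deviation (which is the variant that reproduces the reported 23.09), and the names `paragraph_lengths` and `coefficient_of_variation` are hypothetical.

```python
from statistics import mean, pstdev

# Paragraph word counts copied from the sampleLengths values reported above.
paragraph_lengths = [
    78, 61, 77, 46, 25, 85, 10, 28, 42, 26, 76, 9, 17, 34, 20, 51, 11,
    35, 75, 8, 54, 56, 16, 42, 67, 50, 32, 36, 67, 70, 20, 10, 32, 45,
]


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean (assumed definition)."""
    return pstdev(values) / mean(values)


avg = mean(paragraph_lengths)
std = pstdev(paragraph_lengths)
cv = coefficient_of_variation(paragraph_lengths)

# Compare against the reported mean 41.5, std 23.09, cv 0.556.
print(round(avg, 1), round(std, 2), round(cv, 3))
print("meets target CV >= 0.5:", cv >= 0.5)
```

Under that assumption the rounded values agree with the reported mean, std, and cv, and the CV clears the 0.5 target.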
90.5199%
Passive voice overuse
Target: ≤5% passive sentences
passiveCount: 7
totalSentences: 109
matches:
  0: "been gutted"
  1: "were changed"
  2: "were curled"
  3: "was divided"
  4: "been stacked"
  5: "were drawn"
  6: "was left"
100.0000%
Past progressive (was/were + -ing) overuse
Target: ≤10% past progressive verbs
pastProgressiveCount: 0
totalVerbs: 206
matches: (empty)
100.0000%
Purple prose (modifier overload)
Target: <4% adverbs, <2% -ly adverbs, no adj stacking
wordCount: 1198
adjectiveStacks: 0
stackExamples: (empty)
adverbCount: 37
adverbRatio: 0.030884808013355594
lyAdverbCount: 18
lyAdverbRatio: 0.015025041736227046
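As a quick check, the two ratios above look like plain divisions of the reported counts by this evaluator's word count. The sketch below assumes exactly that; the variable names are hypothetical and the thresholds are taken from the stated target (<4% adverbs, <2% -ly adverbs).

```python
# Counts copied from the purple prose evaluator output above.
word_count = 1198
adverb_count = 37
ly_adverb_count = 18

# Assumed definition: each ratio is count / word_count.
adverb_ratio = adverb_count / word_count
ly_adverb_ratio = ly_adverb_count / word_count

# Thresholds from the stated target.
within_target = adverb_ratio < 0.04 and ly_adverb_ratio < 0.02

print(f"adverbs: {adverb_ratio:.4%}, -ly adverbs: {ly_adverb_ratio:.4%}")
print("within target:", within_target)
```

Both ratios fall under their thresholds, which is consistent with the 100% result for this evaluator.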
100.0000%
Repeated phrase echo
Target: ≤20% sentences with echoes (window: 2)
totalSentences: 118
echoCount: 0
echoWords: (empty)
100.0000%
Sentence length variance
Target: CV ≥0.4 for sentence word counts
totalSentences: 118
mean: 11.96
std: 7.79
cv: 0.652
sampleLengths (0-49): [21, 4, 22, 10, 21, 12, 23, 18, 8, 6, 21, 20, 20, 10, 13, 2, 15, 16, 5, 20, 11, 5, 22, 27, 18, 2, 10, 3, 20, 5, 2, 18, 5, 8, 9, 2, 24, 4, 17, 17, 26, 12, 5, 4, 4, 13, 13, 14, 4, 3]
71.7514%
Sentence opener variety
Target: ≥60% unique sentence openers
consecutiveRepeats: 2
diversityRatio: 0.4406779661016949
totalSentences: 118
uniqueOpeners: 52
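The diversityRatio above appears to be the number of unique openers divided by the total sentence count. The sketch below assumes that definition and uses hypothetical variable names. It does not try to reproduce the 71.7514% result, because the report does not state how the ratio and consecutiveRepeats are combined into a score.

```python
# Counts copied from the sentence opener evaluator output above.
unique_openers = 52
total_sentences = 118

# Assumed definition: unique openers / total sentences.
diversity_ratio = unique_openers / total_sentences

# The stated target is at least 60% unique openers.
meets_target = diversity_ratio >= 0.60

print(f"diversity ratio: {diversity_ratio:.4f}")
print("meets target:", meets_target)
```

At roughly 44%, the ratio is below the 60% target, which fits the reduced score for this evaluator.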
50.0000%
Dialogue tag variety (said vs. fancy)
Target: ≤30% fancy dialogue tags
totalTags: 8
fancyCount: 4
fancyTags:
  0: "called Sergeant Patel, his voice bouncing down the tunnel (call bounce)"
  1: "she gestured to the silvery powder and the faint marks on skin, She pointed to the chain (gesture point)"
  2: "she said, handing Patel a sealed evidence bag (say, hand)"
  3: "she called quietly (call quietly)"
88.3047%
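The closing figure of 88.3047% is not labeled, but it matches the unweighted mean of the 17 evaluator results listed above. The sketch below shows that arithmetic. Treating it as the overall score and as a simple average is an inference from the numbers, not something the report states, and the `scores` dictionary is a hypothetical structure.

```python
# Evaluator results copied from the report, in percent.
scores = {
    "Adverbs in dialogue tags": 91.6667,
    "AI-ism adverb frequency": 86.2069,
    "AI-ism character names": 80.0,
    "AI-ism location names": 100.0,
    "AI-ism word frequency": 31.0345,
    "Cliche density": 100.0,
    "Emotion telling (show vs. tell)": 100.0,
    "Filter word density": 100.0,
    'Overuse of "that"': 100.0,
    "Paragraph length variance": 100.0,
    "Passive voice overuse": 90.5199,
    "Past progressive overuse": 100.0,
    "Purple prose": 100.0,
    "Repeated phrase echo": 100.0,
    "Sentence length variance": 100.0,
    "Sentence opener variety": 71.7514,
    "Dialogue tag variety": 50.0,
}

# Assumption: every evaluator carries equal weight.
overall = sum(scores.values()) / len(scores)

print(f"{overall:.4f}%")
```

If the evaluators were weighted differently, the average would not land on the reported value, so the equal-weight reading seems the most likely one.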