NC Bench
Run: 2026-02-17-01KHMQSN5R54B08NM8DGKPXXBR
Run Details
Model: GPT-4.1
Test: Bad Writing Habits
Scenario: Mystery: examining a crime scene
Duration: 26.5s
Cost: $0.016784
Provider: OpenAI
Model input
System
You are an expert novelist with a gift for vivid, immersive prose. You write in tight third-person limited point of view with a strong narrative voice. Your prose is clean, precise, and engaging — you favor strong verbs, concrete details, and natural-sounding dialogue. You will be given a story bible (codex) containing canonical details about characters, locations, items, and lore. Use these details to ground your writing in the established world. Stay consistent with the codex. Write only the prose of the scene. Do not include titles, chapter headings, markdown formatting, or any commentary before or after the scene.
User/Human
<codex> <character id="4416" name="Detective Harlow Quinn" aliases="Quinn, Detective Quinn, Harlow" pronouns="she/her" species="Human" storyRole="Antagonist"> <eyeColor>brown</eyeColor> <hair>closely cropped salt-and-pepper</hair> <height>5'9"</height> <age>41</age> <jawline>sharp jaw</jawline> <watch>worn leather watch on her left wrist</watch> <bearing>military precision</bearing> <description>A Metropolitan Police detective with 18 years of decorated service who suspects the clique is involved in criminal activity.</description> <backstory>She lost her partner, DS Morris, during a case three years ago under unexplained circumstances that had supernatural origins she doesn't yet understand.</backstory> </character> <character id="4414" name="Eva Kowalski" aliases="Eva, Evie" pronouns="she/her" species="Human" storyRole="Best Friend"> <eyeColor>green</eyeColor> <hair>curly red</hair> <height>5'4"</height> <age>26</age> <glasses>round glasses</glasses> <satchel>worn leather satchel full of books</satchel> <freckles>freckled complexion</freckles> <nervousHabit>tucking hair behind her left ear</nervousHabit> <description>Aurora's childhood friend and an occult researcher.</description> <backstory>Moved to London two years before Aurora and works as a research assistant at the British Museum's restricted archives. 
She holds a master's degree in Ancient History from Oxford.</backstory> </character> <location id="856" name="The Veil Market" aliases="Veil Market, the Market, underground market" realm="Earth"> <currentLocation>abandoned Tube station beneath Camden</currentLocation> <entryRequirement>bone token</entryRequirement> <schedule>moves locations every full moon</schedule> <description>A hidden supernatural black market that sells enchanted goods, banned alchemical substances, and information.</description> </location> <item id="2004" name="Veil Compass" aliases="the Compass, shadow compass" type="Tool" origin="The Veil Market"> <casingMaterial>small brass compass</casingMaterial> <faceMarkings>face etched with protective sigils</faceMarkings> <needle>points toward the nearest supernatural rift or portal</needle> <patina>casing has a patina of verdigris</patina> <creator>crafted by a Shade artisan</creator> <description>A compass attuned to supernatural energy.</description> </item> </codex> Write a fully fleshed out scene of at least 1000 words in the mystery genre. A detective arrives at a crime scene in an unusual location. Something is wrong — the evidence doesn't add up. A colleague is present, offering a different interpretation. Through careful observation and deduction, the detective begins to see what others have missed. Use the codex entries above as your world reference for characters and setting. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Detective Harlow Quinn ducked beneath a ribbon of scarlet police tape and stepped onto the cracked platform tiles, boots echoing in the cavernous hush. The old Tube platform sprawled out before her in eerie abandonment—the kind of place where time had gathered dust, and the city above had long ago forgotten what pulsed below. Fluorescent lanterns dangled from jury-rigged scaffolding, casting pale islands of light across the gloom. The air tasted faintly of oil, old stone, and the lingering spark of something else—a scent more felt than smelled, prickling sharp along the back of her tongue.

PC Mallory stood near the edge of the platform, notebook clutched tight, her eyes darting everywhere but at the mess at her feet. "Morning, Detective Quinn," she said, voice brittle. "You, uh, took the service lift?"

"Stairs," Harlow replied, scrubbing fatigue from her eyes. She rolled her left sleeve just enough to check her worn leather watch: not even dawn. Shifts like these had grown familiar over the years, but the strangeness never softened. "What have we got?"

"Unidentified, female. Found by maintenance staff on the first check of the day," Mallory said, guiding her closer. "No ID. No signs of forced entry topside."

A sheet of plastic covered the body, corners weighted by old train parts. Blood had wept across the tiles in a slow, careful arc, as if poured rather than spilled. The metallic tang gnawed at the back of Quinn's throat.

Harlow crouched, the movement crisp, deliberate, her eyes narrowing at the pattern beneath the plastic. "Let's have a look."

Mallory pulled back the sheet. The woman was young, maybe late twenties or thirties, limbs composed in unsettling order. Curls of red hair escaped from her scalp—already sticky with blood, freckles on a pale cheek spotted like iron in snow. Harlow fixed on her hands, palms up, fingers relaxed. No defensive wounds.

"Face look familiar?" Mallory asked. "I've never seen her."

Quinn said nothing. She leaned closer, noting the thin, precise line drawn across the victim’s throat—too shallow for the arterial spray that should have painted the tiles. Her uniform’s cuff brushed something near the body: a tiny brass object, mottled green with age. She glanced at it, then swept her gloved hand over the area.

"Bag that," she ordered quietly, nodding to the evidence technician. The object—small, flickering in the lamplight—looked like an old compass, etched with unfamiliar markings.

"Robbery gone bad?" Mallory hazarded. "This place draws all sorts, and we found her bag nearby, empty."

"Maybe," Harlow said, but her tone was flat. She stood again, rolling her shoulders, scanning past the rails into the dark that stretched beyond the station’s edge.

Mallory pressed on, as if trying to fill the hush with reason. "We’ve seen rough sleepers down here, taggers—maybe she surprised someone desperate. Platform’s a blind spot for CCTV. Could have come down anytime after midnight."

Harlow swept her torch along the platform, tracking grime and scattered footprints. The cold clung to everything; she could see her breath in the narrow beam. But the victim’s clothes looked too clean for someone who’d crawled through the city’s guts. Expensive boots, laces tight, a dark skirt, the hem unsullied by the filth caked around the rails. A stub of a train ticket in one pocket—outbound, date still fresh.

She tucked two fingers under the edge of the woman’s lapel and lifted, careful. There: a mark on the throat. No, not a bruise. A fine sigil, barely perceptible—a shallow etching that caught the light as she shifted her angle. Instinct prickled down Harlow’s spine. She swallowed, throat dry.

"Something on her neck," she murmured. "Get the forensic specialist, now."

Mallory frowned, bending in for a look. "What is that? Tattoo?"

"Not a tattoo. Etched, maybe with a scalpel."

"Ritual?" The word dropped like a pebble into water—ripples moving underneath the ordinary tension.

Behind them, the evidence tech held up the brass compass. "Detective, you want this logged separately? Never seen one like it."

Quinn took it, weighing it in her gloved palm. The casing was warm, somehow, despite the chill—a patina of verdigris coloring the edges. She flicked her thumb over its lid; it popped open. Inside, the needle spun jerkily, refusing to settle on north. The face was etched all over with sigils, sharp and protective—a language that wasn’t for mortals.

Mallory peered over her shoulder. "Weird old trinket?"

"Maybe," Quinn said again. But her mind was racing ahead, piecing what didn’t fit. No ID, yet the woman hadn’t been stripped or hastily robbed. Her eyes swept the perimeter: a faint line of powder—bone white—traced just beyond the victim, in an uneven circle. Ritual again. The compass, the sigil, the powder. A tableau staged, not a crime of chance.

Mallory shifted from foot to foot. "Detective? You think this was... what, some cult thing?" She tried to flatten the tremor in her tone, but years on patrol hadn’t prepared her for anything like this.

Harlow let her gaze travel from the circle, to the body, to the old compass still in her hand. She traced her memory, trying to dredge up anything from briefings, old cases, the strange late-night files she dug through alone when the job pressed past reason. There were stories—always stories—about the city’s underbelly folding in on itself. About markets that moved with the moon, places you found when the world was thin.

"Too careful for a mugging," she said finally. "Too public for a gang hit. Ritual, maybe, or message. But what kind?"

Mallory’s jaw tightened. "Superintendent isn’t going to like that answer."

Quinn ignored her, brushing a hand over the white powder. It clung electric to the glove. Bone-dust, probably—not chalk. She’d seen it once before, on a case that went wrong, one that pulled her partner Morris away. The echo of his absence was sudden, biting.

She rose, holding the compass up so its face caught the pale light. The needle twitched, jittery, as if seeking something in the dark.

"Get the rest of the platform cordoned off. Forensics, now. And see what you can dig up about ritual circles and missing persons in Camden the last forty-eight hours. I want names—and bodies—if there are any others."

Mallory dipped her head and left, relief masked as briskness.

Harlow paced slow arcs, letting her mind hunt through details. No signs of a struggle. The hair neat, blood flow coaxed not spilled, the ritual marks. The brass compass—a tool, not sentiment; the way it vibrated so faintly, it might have been alive. An artifact? Or a key?

She circled back to the edge of the platform. The air was different there—heavier, as if she’d stepped into the mouth of a thunderstorm. She knelt, running her torch over the rails. At the junction where dark oil met stone, the grime swirled inward, as if tugged by a pressure point.

Looking past the vestiges of ordinary violence, Harlow saw a pattern. The powder was continuous, but broken deliberately—right where the compass had lain. As if someone had placed the object to anchor the circle, then lifted it before they left. The circle, the sigil, the strange artifact: every element pointed somewhere beyond the reach of ordinary crime.

She straightened, boots crunching gravel. Above her, the station’s archways yawned into shadow. With practiced calm, she pressed her comm. "This wasn’t random," she said. "Our victim came here for a reason, and she wasn’t alone."

Nor was she, Harlow realized. The comm crackled in the hush, distant sirens keening somewhere far above—mundane reminders of a world that refused to look under its own feet. Harlow closed her fist around the compass, watching as the needle quivered, uncertain, as if pointing down a path only she could see.

Something moved in the darkness—a trick of the light, or the barest shift in air. She thought of Morris, lost in some rift she’d never understood, and the way these cases always wrote their own rules.

She inhaled, slow and steady. The city above held its breath. Down here, beneath Camden and the world’s attention, there were crimes that didn’t belong to the living. And Harlow Quinn, jaw set, shoulders squared, was ready to ask the questions no one else dared form aloud.
created_at: "2026-02-17T02:45:41.781926+00:00" (2/17/2026, 2:45:41 AM)
model: "openai/gpt-4.1-2025-04-14"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 412
moderation_latency: (null)
generation_time: 26396
tokens_prompt: 923
tokens_completion: 2165
native_tokens_prompt: 836
native_tokens_completion: 1889
native_tokens_completion_images: 0
native_tokens_reasoning: 0
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.016784
router: (null)
provider_responses:
  0:
    id: "resp_09a527e628f447fe016993d63aa670819387b7650e209da386"
    status: 200
    is_byok: false
    latency: 150
    endpoint_id: "c235abe8-11cc-42d3-95ad-72f4d198287a"
    provider_name: "OpenAI"
    model_permaslug: "openai/gpt-4.1-2025-04-14"
api_type: "completions"
id: "gen-1771296314-Capmd7YRkLmzvmqXfqac"
upstream_id: "resp_09a527e628f447fe016993d63aa670819387b7650e209da386"
total_cost: 0.016784
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
Evaluation details
Result | Evaluator | Details | Meta Data
33.3333% | Adverbs in dialogue tags | Target: ≤10% dialogue tags with adverbs
  totalTags: 10
  adverbTagCount: 3
  adverbTags:
    0: "she ordered quietly, nodding to the evidence technician [quietly]"
    1: "Quinn said again [again]"
    2: "she said finally [finally]"
92.8520% | AI-ism adverb frequency | Target: <1% AI-ism adverbs (57 tracked)
  wordCount: 1399
  totalAiIsmAdverbs: 1
  found:
    0: adverb "deliberately", count 1
  highlights:
    0: "deliberately"
100.0000% | AI-ism character names | Target: 0 AI-default names (17 tracked, −20% each)
  codexExemptions: (empty)
  found: (empty)
100.0000% | AI-ism location names | Target: 0 AI-default location names (33 tracked, −20% each)
  codexExemptions: (empty)
  found: (empty)
42.8163% | AI-ism word frequency | Target: <1% AI-ism words (140 tracked)
  wordCount: 1399
  totalAiIsms: 8
  found:
    0: word "flicked", count 1
    1: word "quivered", count 1
    2: word "echo", count 1
    3: word "echoing", count 1
    4: word "etched", count 3
    5: word "etching", count 1
  highlights:
    0: "flicked"
    1: "quivered"
    2: "echo"
    3: "echoing"
    4: "etched"
    5: "etching"
100.0000% | Cliche density | Target: ≤1 cliche(s) per 800-word window
  totalCliches: 0
  maxInWindow: 0
  found: (empty)
  highlights: (empty)
100.0000% | Emotion telling (show vs. tell) | Target: ≤3% sentences with emotion telling
  emotionTells: 0
  narrationSentences: 101
  matches: (empty)
100.0000% | Filter word density | Target: ≤12% sentences with filter/hedge words
  filterCount: 1
  hedgeCount: 1
  narrationSentences: 101
  filterMatches:
    0: "watch"
  hedgeMatches:
    0: "tried to"
100.0000% | Overuse of "that" (subordinate clause padding) | Target: ≤10% sentences with "that" clauses
  thatCount: 2
  totalSentences: 117
  matches:
    0: "is that? Tattoo"
    1: "like that answer"
100.0000% | Paragraph length variance | Target: CV ≥0.5 for paragraph word counts
  totalParagraphs: 38
  mean: 36.08
  std: 20.66
  cv: 0.573
  sampleLengths: 96, 36, 42, 26, 40, 19, 52, 9, 55, 24, 17, 27, 36, 70, 49, 11, 11, 8, 14, 21, 59, 8, 60, 35, 72, 21, 10, 45, 24, 37, 10, 48, 51, 57, 36, 52, 36, 47
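The paragraph length figures above can be re-derived from the listed sample lengths. The following is a minimal sketch, not the benchmark's own code: it assumes the evaluator divides the population standard deviation by the mean, an assumption that happens to reproduce the reported mean, std, and cv. The function name `coefficient_of_variation` is hypothetical.

```python
from statistics import mean, pstdev

# Paragraph word counts copied from the sampleLengths field of this run.
paragraph_lengths = [
    96, 36, 42, 26, 40, 19, 52, 9, 55, 24, 17, 27, 36, 70, 49, 11, 11, 8, 14,
    21, 59, 8, 60, 35, 72, 21, 10, 45, 24, 37, 10, 48, 51, 57, 36, 52, 36, 47,
]


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean (assumed formula)."""
    return pstdev(values) / mean(values)


cv = coefficient_of_variation(paragraph_lengths)
print(f"mean={mean(paragraph_lengths):.2f} "
      f"std={pstdev(paragraph_lengths):.2f} cv={cv:.3f}")
print("meets target (CV >= 0.5):", cv >= 0.5)
```

Using the sample standard deviation (`statistics.stdev`) instead would give a slightly larger std than the 20.66 reported here, which is why the population form is assumed.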
100.0000% | Passive voice overuse | Target: ≤5% passive sentences
  passiveCount: 1
  totalSentences: 101
  matches:
    0: "was etched"
100.0000% | Past progressive (was/were + -ing) overuse | Target: ≤10% past progressive verbs
  pastProgressiveCount: 1
  totalVerbs: 200
  matches:
    0: "was racing"
94.0000% | Purple prose (modifier overload) | Target: <4% adverbs, <2% -ly adverbs, no adj stacking
  wordCount: 1185
  adjectiveStacks: 1
  stackExamples:
    0: "far above—mundane reminders"
  adverbCount: 41
  adverbRatio: 0.03459915611814346
  lyAdverbCount: 11
  lyAdverbRatio: 0.009282700421940928
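The two ratios in this entry are plain count-over-word-count divisions, and both fall under their stated targets. The sketch below only illustrates that arithmetic. The variable names are hypothetical, and it makes no attempt to reproduce the 94% score, whose formula is not given on this page.

```python
# Counts copied from the purple prose entry of this run.
word_count = 1185
adverb_count = 41
ly_adverb_count = 11
adjective_stacks = 1

# Ratios as simple proportions of the word count.
adverb_ratio = adverb_count / word_count
ly_adverb_ratio = ly_adverb_count / word_count

# Thresholds taken from the stated target: <4% adverbs, <2% -ly adverbs.
meets_adverb_target = adverb_ratio < 0.04
meets_ly_target = ly_adverb_ratio < 0.02
has_no_adjective_stacks = adjective_stacks == 0

print(f"adverb ratio: {adverb_ratio:.4%} (under 4%: {meets_adverb_target})")
print(f"-ly adverb ratio: {ly_adverb_ratio:.4%} (under 2%: {meets_ly_target})")
print("no adjective stacking:", has_no_adjective_stacks)
```

On these numbers the single adjective stack is the only one of the three stated criteria that is missed.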
100.0000% | Repeated phrase echo | Target: ≤20% sentences with echoes (window: 2)
  totalSentences: 117
  echoCount: 0
  echoWords: (empty)
100.0000% | Sentence length variance | Target: CV ≥0.4 for sentence word counts
  totalSentences: 117
  mean: 11.72
  std: 6.88
  cv: 0.587
  sampleLengths: 24, 30, 14, 28, 23, 7, 6, 8, 16, 14, 4, 18, 8, 13, 17, 10, 15, 4, 5, 14, 21, 9, 3, 5, 4, 3, 24, 16, 12, 10, 14, 5, 12, 8, 19, 12, 24, 12, 14, 15, 17, 12, 14, 6, 4, 16, 5, 4, 6, 5
74.3590% | Sentence opener variety | Target: ≥60% unique sentence openers
  consecutiveRepeats: 3
  diversityRatio: 0.46153846153846156
  totalSentences: 117
  uniqueOpeners: 54
0.0000% | Dialogue tag variety (said vs. fancy) | Target: ≤30% fancy dialogue tags
  totalTags: 10
  fancyCount: 7
  fancyTags:
    0: "Harlow replied, scrubbing fatigue from her eyes (reply, scrub)"
    1: "Mallory said, guiding her closer (say, guide)"
    2: "she ordered quietly, nodding to the evidence technician (order quietly, nod)"
    3: "Harlow said, but her tone was flat (say, be)"
    4: "she murmured (murmur)"
    5: "Quinn said again (say again)"
    6: "she said finally (say finally)"
84.5506%