NC Bench
Run: 2026-02-17-01KHMR9TET3A364QGT4Y8Y89JS
Run Details
Model: o4 Mini High
Test: Bad Writing Habits
Scenario: Mystery: examining a crime scene
Duration: 38.0s
Cost: $0.026240
Provider: OpenAI
Model input
System
You are an expert novelist with a gift for vivid, immersive prose. You write in tight third-person limited point of view with a strong narrative voice. Your prose is clean, precise, and engaging — you favor strong verbs, concrete details, and natural-sounding dialogue. You will be given a story bible (codex) containing canonical details about characters, locations, items, and lore. Use these details to ground your writing in the established world. Stay consistent with the codex. Write only the prose of the scene. Do not include titles, chapter headings, markdown formatting, or any commentary before or after the scene.
User/Human
<codex> <character id="4416" name="Detective Harlow Quinn" aliases="Quinn, Detective Quinn, Harlow" pronouns="she/her" species="Human" storyRole="Antagonist"> <eyeColor>brown</eyeColor> <hair>closely cropped salt-and-pepper</hair> <height>5'9"</height> <age>41</age> <jawline>sharp jaw</jawline> <watch>worn leather watch on her left wrist</watch> <bearing>military precision</bearing> <description>A Metropolitan Police detective with 18 years of decorated service who suspects the clique is involved in criminal activity.</description> <backstory>She lost her partner, DS Morris, during a case three years ago under unexplained circumstances that had supernatural origins she doesn't yet understand.</backstory> </character> <character id="4414" name="Eva Kowalski" aliases="Eva, Evie" pronouns="she/her" species="Human" storyRole="Best Friend"> <eyeColor>green</eyeColor> <hair>curly red</hair> <height>5'4"</height> <age>26</age> <glasses>round glasses</glasses> <satchel>worn leather satchel full of books</satchel> <freckles>freckled complexion</freckles> <nervousHabit>tucking hair behind her left ear</nervousHabit> <description>Aurora's childhood friend and an occult researcher.</description> <backstory>Moved to London two years before Aurora and works as a research assistant at the British Museum's restricted archives. 
She holds a master's degree in Ancient History from Oxford.</backstory> </character> <location id="856" name="The Veil Market" aliases="Veil Market, the Market, underground market" realm="Earth"> <currentLocation>abandoned Tube station beneath Camden</currentLocation> <entryRequirement>bone token</entryRequirement> <schedule>moves locations every full moon</schedule> <description>A hidden supernatural black market that sells enchanted goods, banned alchemical substances, and information.</description> </location> <item id="2004" name="Veil Compass" aliases="the Compass, shadow compass" type="Tool" origin="The Veil Market"> <casingMaterial>small brass compass</casingMaterial> <faceMarkings>face etched with protective sigils</faceMarkings> <needle>points toward the nearest supernatural rift or portal</needle> <patina>casing has a patina of verdigris</patina> <creator>crafted by a Shade artisan</creator> <description>A compass attuned to supernatural energy.</description> </item> </codex> Write a fully fleshed out scene of at least 1000 words in the mystery genre. A detective arrives at a crime scene in an unusual location. Something is wrong — the evidence doesn't add up. A colleague is present, offering a different interpretation. Through careful observation and deduction, the detective begins to see what others have missed. Use the codex entries above as your world reference for characters and setting. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Detective Harlow Quinn stepped off the platform into the hush of the abandoned Camden station. The stench of damp rot and iron-laced air wrapped around her like a shroud. She raised her worn leather watch to the meager glow of a flickering bulb: 06:12. Broken tiles crunched beneath her boots. Low light spilled from the tunnel’s yawning mouth, cobwebs swaying in a phantom draft.

Under a halo of police tape, the victim lay splayed on cracked concrete. His skin was ashen, veins knotted like black cords beneath the surface. Scorch marks charred his shirt in intricate, symmetrical patterns. No soot clung to the fabric; the burns had radiated from within.

Eva Kowalski crouched close, her round glasses catching stray beams from her phone’s torch. Freckles danced across her pale cheeks as she traced arcane lines drawn in chalk around the body. She tucked a curly red lock behind her left ear, a nervous habit Quinn already knew. “These sigils… they shift when I move,” Eva whispered. “They’re non-Euclidean.”

Quinn knelt beside her, gaze flicking from chalk to corpse. She inhaled slowly, tasting copper on the stale air. “Meth lab gone wrong?” she asked, voice edged with skepticism.

Eva shook her head. “Chemical burns don’t leave perfect hexagrams or pentagrams. And look—” She’shed her torch across the victim’s chest. The scorch pattern resembled a fractal, each arm radiating into finer branching lines. “It’s ritual. But he didn’t finish the rite.”

From her lapel pocket, Quinn drew the Veil Compass. The small brass case bore a verdigris patina, its face etched with protective sigils. She flicked it open. The needle quivered, then steadied, pointing not toward the station exit but back into the service tunnel they had just passed. She snapped the lid shut and slipped it into her jacket. “Forensics will chalk it up to an accidental fire. But we both know nature doesn’t burn geometry this precise.”

Eva’s eyes widened. “You brought it?”

“I suspected I’d need it.” Quinn rose, boots echoing on tile. She circled the chalk ring, eyes scanning. Near the platform edge, she spotted a chipped bone token—ivory, the size of a thumb—clenched in the victim’s right hand. She pried it free. Its surface was scored with minute tooth marks. “Bone token,” she said. “Entry fee for the Veil Market.”

Eva leaned in. “So he was a buyer. Or a courier. The Market moves every full moon to hidden stations. This stop only appears during the shift.”

Quinn studied the token. “He tried to gain entry here. But something went wrong.” Rainwater dribbled from a broken vent, pooling at her heel. She flicked the token against her palm, smearing her glove with faint crimson. “Look for more tokens. He wasn’t alone.”

Her sharp jaw set. She noted the station’s old sign, half-obscured by graffiti: Camden High Street. The place hadn’t hosted passengers in decades. Tonight, though, it served another purpose. Dark deals, forbidden alchemy, whispered lore.

Eva tapped the chalk lines. “If this was a ritual to open a portal, he must have triggered something before he died.”

Quinn’s eyes narrowed. “We’ve seen portals before, but this…” She bent once more to inspect the burns. “He was incinerated from the inside out. That’s how rifts spit out energy.”

A distant clatter echoed behind them. Quinn rose, gloved hand resting on her hip. “Service tunnel,” she whispered.

They followed the echo into a brick-walled corridor. Quinn’s flashlight beam danced over peeling paint and rat droppings. A plank of timber lay warped against a heavy steel door. Deep gouges scarred its surface. In the dim light, she could make out runes etched into the wood, their cuts too fresh to be accidental.

Quinn forced the door. It groaned on rusted hinges, revealing a low-ceilinged service chamber. The air here felt warmer, charged. Dust motes swirled as she swept her torch across the floor grate at its center. A charred fragment of bone, a twisted length of copper wire half-melted, lay scattered among ash.

Eva’s breath came quick. “This grate—if you open it—”

“May lead to whatever he tried to summon.” Quinn crouched, pressing a gloved finger against the metal bars. She sensed a faint pulse, like a thrum of distant drums. The compass in her jacket vibrated, as though eager to point the way. She straightened. “We shut it down now, or he meant this place to be a breach.”

Eva swallowed. “We need specialists. SWAT, hazmat, occult containment—”

Quinn’s eyes flashed. “Try explaining blood runes to hazmat. No. We decant this ourselves, covertly. If the Eclipse Clique got wind of a police operation, they’d disappear before we arrived.”

Eva hesitated, torch light trembling. “You think they set him up?”

Quinn nodded. “This token, these glyphs—they’re staged. He wasn’t a rift mage; he was bait. They wanted someone to open it for them.”

Overhead, the platform lights flickered to life. Floodlights pierced the tunnel’s gloom. Female voices called out. “Detective Quinn? We’re in position—”

Quinn rose, dust motes drifting off her jacket. She slipped the compass into Eva’s satchel, zipping it around the books. “Keep it hidden until I call. No scans, no logs.”

Eva slid her satchel free and clutched it. Her glasses fogged in the sudden chill. “You’re sure?”

Quinn nodded once, then turned toward the service door. “File this as a structural collapse when you talk to them. Inquest says accident, electrical fault. Understood?”

Eva exhaled, tucking another curly lock behind her ear. “Understood.”

Quinn stepped back onto the platform as forensic officers streamed in, their boot heels clacking a staccato rhythm. Cameras flashed. They knelt by the body, bagged evidence, took statements. A uniformed sergeant saluted. “Morning, Detective.”

She inclined her head. “I want a full sweep, 50-meter radius. No unauthorized personnel. And check for any missing person reports matching our victim.”

The sergeant tapped his radio. “Yes, ma’am.”

Quinn moved out of the crime scene perimeter. Rain still spattered the broken roof above. She surveyed the station—the cracked walls, the empty tracks, the faint hum that seemed to rise from the deepest tunnel. Her watch ticked on her wrist, worn leather stiff with age. 06:42.

Eva joined her, satchel slung over one shoulder. “They’ll seal the service tunnel tonight. For good.”

Quinn looked at her friend—hair tousled, cheeks pale but determined. “Then we’ll know where they rerouted.” She glanced at the satchel, as if sensing the compass inside. “Three days until the next full moon. The Veil Market moves then. If they open that breach, this station will vanish.”

Eva swallowed. “What do we do?”

Quinn’s brown eyes glinted. She pulled her leather watch strap tighter. “We follow the Market. We find the Clique. And we bring down every rift they try to open.” She paused, voice low. “I’m not losing another partner to this.”

The abandoned station breathed around them—tiles whispering, echoes drifting. A stray breeze flicked a piece of paper across the platform. Quinn bent and picked it up: a tattered union Jack, torn in half. She crushed it in her hand.

“We start now,” she said.

And beneath the rails, in the dark chamber, something stirred. The faint heartbeat of a hidden world pulsed, waiting.
created_at: "2026-02-17T02:54:42.814836+00:00" (2/17/2026, 2:54:42 AM)
model: "openai/o4-mini-high-2025-04-16"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 6303
moderation_latency: (null)
generation_time: 37354
tokens_prompt: 923
tokens_completion: 2863
native_tokens_prompt: 835
native_tokens_completion: 5755
native_tokens_completion_images: 0
native_tokens_reasoning: 4032
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.0262405
router: (null)
provider_responses, entry 0:
id: "resp_030a94deebfb4d5a016993d84c552081958fb4ad46395af13e"
status: 200
is_byok: false
latency: 137
endpoint_id: "60020533-2fb2-4aa1-9454-181029fd52de"
provider_name: "OpenAI"
model_permaslug: "openai/o4-mini-high-2025-04-16"
api_type: "completions"
id: "gen-1771296844-gIHkgPeSYnjHULgsULmp"
upstream_id: "resp_030a94deebfb4d5a016993d84c552081958fb4ad46395af13e"
total_cost: 0.0262405
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
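As a rough cross-check of the cost figure, the sketch below recomputes it from the native token counts in the metadata. The per-token rates (1.10 and 4.40 US dollars per million input and output tokens) are an assumption, not something this page states; with those rates the result happens to agree with the reported `usage` of 0.0262405.

```python
# Token counts copied from the run metadata.
NATIVE_TOKENS_PROMPT = 835
NATIVE_TOKENS_COMPLETION = 5755

# ASSUMED rates in USD per one million tokens; not stated on this page.
ASSUMED_INPUT_RATE = 1.10
ASSUMED_OUTPUT_RATE = 4.40


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the request cost in USD from token counts and assumed rates."""
    input_cost = prompt_tokens * ASSUMED_INPUT_RATE / 1_000_000
    output_cost = completion_tokens * ASSUMED_OUTPUT_RATE / 1_000_000
    return input_cost + output_cost


estimated_cost = estimate_cost(NATIVE_TOKENS_PROMPT, NATIVE_TOKENS_COMPLETION)
print(f"{estimated_cost:.7f}")
```

If the provider's actual rates differ, only the two `ASSUMED_*` constants need to change.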
Evaluation details
Result
Evaluator
Details
Meta Data
100.0000%
Adverbs in dialogue tags
Target: ≤10% dialogue tags with adverbs
totalTags: 5
adverbTagCount: 0
adverbTags: (empty)
91.7219%
AI-ism adverb frequency
Target: <1% AI-ism adverbs (57 tracked)
wordCount: 1208
totalAiIsmAdverbs: 1
found: "slowly" (count: 1)
highlights: "slowly"
100.0000%
AI-ism character names
Target: 0 AI-default names (17 tracked, −20% each)
codexExemptions: (empty)
found: (empty)
100.0000%
AI-ism location names
Target: 0 AI-default location names (33 tracked, −20% each)
codexExemptions: (empty)
found: (empty)
0.6623%
AI-ism word frequency
Target: <1% AI-ism words (140 tracked)
wordCount: 1208
totalAiIsms: 12
found: "flickered" (count: 1), "flicked" (count: 3), "quivered" (count: 1), "intricate" (count: 1), "echo" (count: 1), "echoed" (count: 1), "echoes" (count: 1), "echoing" (count: 1), "etched" (count: 2)
highlights: "flickered", "flicked", "quivered", "intricate", "echo", "echoed", "echoes", "echoing", "etched"
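The two frequency scores reported above (91.7219% for AI-ism adverbs and 0.6623% for AI-ism words) are both consistent with a penalty that falls linearly from 100% at zero hits to 0% at the 1% target. The sketch below is an inference from those two data points, not the benchmark's documented formula; `frequency_score` and its `limit` parameter are hypothetical names.

```python
def frequency_score(hits: int, word_count: int, limit: float = 0.01) -> float:
    """Percentage score: 100 at zero hits, falling linearly to 0 at `limit`.

    ASSUMPTION: this linear form is inferred from the reported numbers,
    not taken from the benchmark's own code.
    """
    fraction = hits / word_count
    return max(0.0, 1.0 - fraction / limit) * 100.0


# Counts copied from the evaluation metadata (wordCount is 1208 in both).
adverb_score = frequency_score(hits=1, word_count=1208)
word_score = frequency_score(hits=12, word_count=1208)

print(f"AI-ism adverb frequency: {adverb_score:.4f}%")
print(f"AI-ism word frequency:   {word_score:.4f}%")
```

Under this reading, 12 tracked words in 1208 (about 0.99%) sits just under the 1% target, which is why the score is close to zero rather than a failure.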
66.6667%
Cliche density
Target: ≤1 cliche(s) per 800-word window
totalCliches: 2
maxInWindow: 2
found: "eyes widened/narrowed" (count: 2)
highlights: "eyes widened", "eyes narrowed"
100.0000%
Emotion telling (show vs. tell)
Target: ≤3% sentences with emotion telling
emotionTells: 0
narrationSentences: 99
matches: (empty)
100.0000%
Filter word density
Target: ≤12% sentences with filter/hedge words
filterCount: 3
hedgeCount: 1
narrationSentences: 99
filterMatches: "watch"
hedgeMatches: "seemed to"
100.0000%
Overuse of "that" (subordinate clause padding)
Target: ≤10% sentences with "that" clauses
thatCount: 1
totalSentences: 126
matches: "open that breach"
100.0000%
Paragraph length variance
Target: CV ≥0.5 for paragraph word counts
totalParagraphs: 38
mean: 31.42
std: 18.6
cv: 0.592
sampleLengths: 64, 46, 58, 29, 42, 78, 6, 60, 27, 44, 35, 22, 30, 18, 54, 51, 9, 58, 9, 30, 11, 23, 21, 30, 17, 26, 10, 35, 24, 7, 47, 16, 48, 6, 40, 39, 5, 19
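The coefficient of variation (CV) used for this check is the standard deviation divided by the mean. A minimal sketch that recomputes it from the 38 paragraph word counts listed above follows; it assumes a population standard deviation (`statistics.pstdev`), which is the variant that reproduces the reported std of 18.6.

```python
from statistics import mean, pstdev

# Paragraph word counts copied from sampleLengths in the evaluation metadata.
paragraph_lengths = [
    64, 46, 58, 29, 42, 78, 6, 60, 27, 44, 35, 22, 30, 18, 54, 51, 9, 58, 9,
    30, 11, 23, 21, 30, 17, 26, 10, 35, 24, 7, 47, 16, 48, 6, 40, 39, 5, 19,
]

avg = mean(paragraph_lengths)
# ASSUMPTION: population (not sample) standard deviation.
spread = pstdev(paragraph_lengths)
cv = spread / avg

print(f"mean={avg:.2f} std={spread:.1f} cv={cv:.3f}")
print("meets target (CV >= 0.5):", cv >= 0.5)
```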
100.0000%
Passive voice overuse
Target: ≤5% passive sentences
passiveCount: 1
totalSentences: 99
matches: "was scored"
100.0000%
Past progressive (was/were + -ing) overuse
Target: ≤10% past progressive verbs
pastProgressiveCount: 0
totalVerbs: 163
matches: (empty)
100.0000%
Purple prose (modifier overload)
Target: <4% adverbs, <2% -ly adverbs, no adj stacking
wordCount: 845
adjectiveStacks: 0
stackExamples: (empty)
adverbCount: 17
adverbRatio: 0.020118343195266272
lyAdverbCount: 3
lyAdverbRatio: 0.0035502958579881655
100.0000%
Repeated phrase echo
Target: ≤20% sentences with echoes (window: 2)
totalSentences: 126
echoCount: 0
echoWords: (empty)
100.0000%
Sentence length variance
Target: CV ≥0.4 for sentence word counts
totalSentences: 126
mean: 9.48
std: 5.75
cv: 0.607
sampleLengths: 15, 14, 15, 6, 14, 13, 12, 9, 12, 14, 17, 16, 9, 2, 10, 9, 10, 4, 17, 13, 8, 9, 14, 4, 21, 11, 19, 3, 3, 11, 7, 20, 4, 8, 4, 6, 3, 24, 4, 20, 13, 7, 4, 12, 7, 6, 6, 5, 17, 3
73.3333%
Sentence opener variety
Target: ≥60% unique sentence openers
consecutiveRepeats: 0
diversityRatio: 0.44
totalSentences: 125
uniqueOpeners: 55
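The 73.3333% opener-variety score is consistent with dividing the observed diversity ratio by the 60% target and capping the result at 100%. This is an inference from a single data point rather than the benchmark's documented rule; `opener_variety_score` is a hypothetical name.

```python
def opener_variety_score(unique_openers: int, total_sentences: int,
                         target: float = 0.6) -> tuple[float, float]:
    """Return (diversity ratio, percentage score).

    ASSUMPTION: score = ratio / target, capped at 100%. Inferred from the
    reported values, not taken from the benchmark's own code.
    """
    ratio = unique_openers / total_sentences
    score = min(1.0, ratio / target) * 100.0
    return ratio, score


# Counts copied from the evaluation metadata.
diversity_ratio, opener_score = opener_variety_score(55, 125)
print(f"diversityRatio={diversity_ratio:.2f} score={opener_score:.4f}%")
```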
75.0000%
Dialogue tag variety (said vs. fancy)
Target: ≤30% fancy dialogue tags
totalTags: 5
fancyCount: 2
fancyTags: "Eva whispered (whisper)", "she whispered (whisper)"
88.6697%
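The closing figure of 88.6697% is consistent with the unweighted mean of the 17 evaluator scores listed above. The sketch below checks that arithmetic; treating the figure as a simple average is an inference, since the page does not label it.

```python
from statistics import mean

# Evaluator scores (percent) copied from the evaluation details, in page order.
evaluator_scores = {
    "Adverbs in dialogue tags": 100.0,
    "AI-ism adverb frequency": 91.7219,
    "AI-ism character names": 100.0,
    "AI-ism location names": 100.0,
    "AI-ism word frequency": 0.6623,
    "Cliche density": 66.6667,
    "Emotion telling": 100.0,
    "Filter word density": 100.0,
    'Overuse of "that"': 100.0,
    "Paragraph length variance": 100.0,
    "Passive voice overuse": 100.0,
    "Past progressive overuse": 100.0,
    "Purple prose": 100.0,
    "Repeated phrase echo": 100.0,
    "Sentence length variance": 100.0,
    "Sentence opener variety": 73.3333,
    "Dialogue tag variety": 75.0,
}

# ASSUMPTION: every evaluator carries equal weight.
overall = mean(evaluator_scores.values())
print(f"{overall:.4f}%")
```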