NC Bench
Run: 2026-02-17-01KHMR1CVPNNXD7A2TE7BD2KB5
Run Details
Model: o4 Mini
Test: Bad Writing Habits
Scenario: Mystery: examining a crime scene
Duration: 19.0s
Cost: $0.013194
Provider: OpenAI
Model input
System
You are an expert novelist with a gift for vivid, immersive prose. You write in tight third-person limited point of view with a strong narrative voice. Your prose is clean, precise, and engaging — you favor strong verbs, concrete details, and natural-sounding dialogue. You will be given a story bible (codex) containing canonical details about characters, locations, items, and lore. Use these details to ground your writing in the established world. Stay consistent with the codex. Write only the prose of the scene. Do not include titles, chapter headings, markdown formatting, or any commentary before or after the scene.
User/Human
<codex> <character id="4416" name="Detective Harlow Quinn" aliases="Quinn, Detective Quinn, Harlow" pronouns="she/her" species="Human" storyRole="Antagonist"> <eyeColor>brown</eyeColor> <hair>closely cropped salt-and-pepper</hair> <height>5'9"</height> <age>41</age> <jawline>sharp jaw</jawline> <watch>worn leather watch on her left wrist</watch> <bearing>military precision</bearing> <description>A Metropolitan Police detective with 18 years of decorated service who suspects the clique is involved in criminal activity.</description> <backstory>She lost her partner, DS Morris, during a case three years ago under unexplained circumstances that had supernatural origins she doesn't yet understand.</backstory> </character> <character id="4414" name="Eva Kowalski" aliases="Eva, Evie" pronouns="she/her" species="Human" storyRole="Best Friend"> <eyeColor>green</eyeColor> <hair>curly red</hair> <height>5'4"</height> <age>26</age> <glasses>round glasses</glasses> <satchel>worn leather satchel full of books</satchel> <freckles>freckled complexion</freckles> <nervousHabit>tucking hair behind her left ear</nervousHabit> <description>Aurora's childhood friend and an occult researcher.</description> <backstory>Moved to London two years before Aurora and works as a research assistant at the British Museum's restricted archives. 
She holds a master's degree in Ancient History from Oxford.</backstory> </character> <location id="856" name="The Veil Market" aliases="Veil Market, the Market, underground market" realm="Earth"> <currentLocation>abandoned Tube station beneath Camden</currentLocation> <entryRequirement>bone token</entryRequirement> <schedule>moves locations every full moon</schedule> <description>A hidden supernatural black market that sells enchanted goods, banned alchemical substances, and information.</description> </location> <item id="2004" name="Veil Compass" aliases="the Compass, shadow compass" type="Tool" origin="The Veil Market"> <casingMaterial>small brass compass</casingMaterial> <faceMarkings>face etched with protective sigils</faceMarkings> <needle>points toward the nearest supernatural rift or portal</needle> <patina>casing has a patina of verdigris</patina> <creator>crafted by a Shade artisan</creator> <description>A compass attuned to supernatural energy.</description> </item> </codex> Write a fully fleshed out scene of at least 1000 words in the mystery genre. A detective arrives at a crime scene in an unusual location. Something is wrong — the evidence doesn't add up. A colleague is present, offering a different interpretation. Through careful observation and deduction, the detective begins to see what others have missed. Use the codex entries above as your world reference for characters and setting. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
Detective Harlow Quinn’s boots clicked against slick tile as she climbed down the graffiti-streaked steps into the abandoned Tube station. Her leather watch, worn soft at the edges, caught the glow of her torchbeam. The walls dripped with condensation, and the air tasted of damp earth and something metallic she couldn’t name. She paused at the bottom, shoulders coiled, scanning the cavernous platform where yellow tape fluttered in an unseen breeze.
“Morning, Quinn.” Eva Kowalski stepped from behind a tilted sign advertising a defunct line. Her round glasses caught the beam. Red curls spilled over her shoulders as she tucked a lock behind her freckled ear—always that nervous gesture. She cradled her leather satchel, the strap worn to pale threads. “I’m glad you’re here. This—” she nodded at the far end of the platform, “—is beyond anything I’ve seen in the archives.”
Quinn nodded once, crisp. “Let’s see it.” She walked past two forensics officers crouched beside a body, their lamps dancing across a chalk outline stenciled on grimy tile. The corpse lay face-up, arms splayed at impossible angles. The skin was pale, almost translucent, veins like dark threads beneath. A circular pattern of burned bone fragments framed the head, charred to white ash. Even from a distance, Quinn tasted the wrongness.
DS Harper cleared his throat. “Victim’s male, mid-thirties. No ID on him. No wounds, no blood—apart from what looks like—” He flicked at a droplet pooling at the base of the skull. “—a small laceration. Cause of death: still undetermined.”
Quinn knelt, razor-sharp jaw set. The eyelids were flicked back, exposing glassy whites. Each fingernail was filed to a needle point. On the victim’s chest lay a tarnished brass object, half-buried in ash. She reached for it, gloves snapping on. Eva hovered, clasping her satchel straps.
Quinn lifted the compass. The casing’s patina was deep green in the torchlight. The needle quivered, then spun in a wide circle before stabbing toward the tunnel entrance. She turned it over. Protective sigils etched into the back—hand-drawn, wobbly. Not professional.
“Veil Compass,” Eva whispered. “They’re sold at the Market. Attuned to supernatural rifts.” She swallowed. “If it was real—”
“Real?” Quinn cut her off, her voice low. She studied the victim’s features: the faint indentation where a bone token might have sat against skin. No token. “No bone token, Eva. How did he get through the Market’s gates?”
Eva’s eyes darted to the circle of ash. “Maybe he didn’t. Maybe someone staged a ritual here, mimicking the Market’s rites. Or a cretin stole a token from—“
“A stolen token still has to be carved right, activated. You get one at the Market, it’s keyed to your phalanx bone.” Quinn’s gaze swept the platform: broken tiles, streaks of mud, footprints leading to an intact section of wall. Beneath faded paint, she spotted a half-erased sigil scrawled in charcoal. “Look at this. The genuine Market sigils are precise, uniform. This is amateur hour.”
DS Harper approached, tapping his tablet. “Witness stated they heard chanting around midnight. Saw a man in a skull mask drop this compass and run. Says the chanting sounded… Latin?”
Latin. Quinn's jaw tightened. She knew Latin—Morris had taught her. She recalled the emergency call from a month ago: echoes in an old church. Could the same force be here? But the compass should have locked onto any surge. Instead it spun.
She stood. The station lights had long since collapsed; their torches cast stilted shadows. The compass needle pivoted again, pointing now toward a bricked-up service tunnel. Cold air spilled from the seams. Eva’s breath caught. “That can’t be open. That tunnel’s supposed to be sealed.”
Quinn rose, tapping the compass against her palm. “Supernatural or not, someone’s using these symbols to mislead. Maybe they wanted us chasing portals while the real murderer slipped away.” She pocketed the compass. “Harper, get me digital scans of this station’s original blueprints. Every sealed wall, every maintenance crawlspace. I want footprints dusted. I want trace fiber analysis on that white ash.”
Harper nodded, already typing. Eva lingered at Quinn’s side. “You think it’s just a cover-up?” she asked, voice small.
Quinn inhaled, squinting at the circle of ash. “No one performs a midnight skull-ritual without a purpose. But real or staged, the body was killed elsewhere. Cured, cooked, brought here. No blood. No struggle.” She crouched, running a finger through the ash. It felt wrong: too brittle, like bone shattered then re-burned. She flicked her torch at the grout lines. In one corner, the ash gathered thicker, as if the killer brushed against the wall. “Look at the toe scuffs. These aren’t ritual steps—they’re someone slipping, catching weight, stepping back.”
Eva crouched beside her, satchel sliding. She unzipped a pocket, withdrew a slim notebook. “Harvard Papers mention similar ash from a disused chapel in Oxford—white bones after exorcisms. They say it’s just calcium carbonate from burned marrow.”
Quinn kept her eyes on the floor. “So why keep the ash? Ritual or no, the murderer needed it.” She stood, brushing dust from her pants. The compass was heavy in her pocket. She pulled it back out. The needle now quivered toward the far end of the platform, where the old signal box door sagged on rusted hinges. “And my bet is they came in here, over there, staged the scene, then left through that service tunnel, compass in hand to guide them to the next site.”
Eva’s lips parted. “But if the compass works—points to a real rift—this could lead us to a breach in the city’s ley lines.”
Quinn shook her head. “Or it points to another staged crime scene. The closer you get to a real rift, the stronger its pull. This thing would lock solid if it felt an opening. It’s too loose. Someone just wants us digging in the dark.” She dropped the compass back in her pocket. “Let’s check that sealed tunnel.”
They strode toward the tunnel’s grated threshold. Weathered bricks lined its mouth. Quinn ran her glove over the mortar—fresh chips. “They forced it,” she said. “Look, mortar dust settled on the platform.” She flicked a finger at a smear of black residue. “They used a blowtorch or an acid etch. Clean break.”
Eva knelt, adjusting her glasses. “And these scorch marks suggest a single-direction arc. Whoever did it came this way.” She traced a curved line with a trembling finger. “From the left.”
Quinn studied the marks. “Left—from the signal box.” She moved back toward the door, tugged at the edge. It groaned. Beneath the door, a scrap of fabric poked through. She tugged it free. Dark wool, patterned. “This matches the uniform lining we found on the victim’s collar tear. Someone left a piece of their coat behind.”
Harper joined them, blueprint printouts in hand. “Here—service hatch leads to ventilation shafts that connect to the disused platform three stops north. But that hatch should be welded shut.”
Quinn glanced at Eva, whose eyes had gone wide. “The Market moves every full moon,” Quinn murmured. “Tonight’s the full moon. If someone wants to smuggle goods from the Veil Market, they open this tunnel—bring contraband in or out unnoticed.” She felt the old tug at her jawline, a memory of Morris’s last case: forbidden shipments, whispers of shadows. “And if the compass points to their next meet-up, they’re hiding something down there.”
Eva stood, breath dancing in her lungs. “It’s a smuggling operation, not a séance. The body’s a message: don’t cross us.” She glanced at the chalk outline. “Ritual-style warning.”
Quinn nodded, her shoulders unclenching for the first time. “Exactly. This is street justice with occult trappings.” She yanked her radio free. “Control, Quinn. I want a forensic team on that signal box, another on the platform. Then secure that service hatch. I’m going in with one uniform. Keep it tight. Let’s see what’s behind that door.”
The tunnel’s darkness swallowed them as they dropped below the platform. Quinn flicked on her torch. The beam caught the compass peeking from her pocket. The needle spun once more, and then, suddenly, stilled—pointing down the narrow shaft. Harlow Quinn pressed a gloved hand to the brick, fingers brushing a smudged sigil. It was crude, misaligned. The sort of mark a smuggler might carve to reassure a trafficker, not a proper Shade artisan.
Quinn exhaled, shouldering her jacket. “Let’s follow it. But no illusions—this compass is a blunt instrument in crooked hands. We go in looking for human killers, not spirits.”
Eva gave her a small, determined nod. “Human killers with hellish methods.”
Quinn cast a final glance at the platform above, where amber torchlight met darkness. “Time to show them we’re not afraid of ghosts—or the ones who wield them.” With that, she led the way into the tunnel, torch held high, mind already racing toward the next clue.
created_at: "2026-02-17T02:49:47.583702+00:00" (2/17/2026, 2:49:47 AM)
model: "openai/o4-mini-2025-04-16"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 5129
moderation_latency: (null)
generation_time: 18826
tokens_prompt: 923
tokens_completion: 2718
native_tokens_prompt: 835
native_tokens_completion: 2790
native_tokens_completion_images: 0
native_tokens_reasoning: 704
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "completed"
usage: 0.0131945
router: (null)
provider_responses:
0:
id: "resp_0ac3b394e3ca0541016993d73847988196bd86d59d062927a5"
status: 200
is_byok: false
latency: 97
endpoint_id: "bd121898-b27c-4e2c-bc92-278627465a54"
provider_name: "OpenAI"
model_permaslug: "openai/o4-mini-2025-04-16"
api_type: "completions"
id: "gen-1771296568-k1wkh5BGvDD2QBvGp5A2"
upstream_id: "resp_0ac3b394e3ca0541016993d73847988196bd86d59d062927a5"
total_cost: 0.0131945
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "OpenAI"
Evaluation details
Result
Evaluator
Details
Meta Data
100.0000%
Adverbs in dialogue tags
Target: ≤10% dialogue tags with adverbs
totalTags: 5
adverbTagCount: 0
adverbTags: (empty)
93.3155%
AI-ism adverb frequency
Target: <1% AI-ism adverbs (57 tracked)
wordCount: 1496
totalAiIsmAdverbs: 1
found: "suddenly" (count 1)
highlights: "suddenly"
100.0000%
AI-ism character names
Target: 0 AI-default names (17 tracked, −20% each)
codexExemptions: (empty)
found: (empty)
100.0000%
AI-ism location names
Target: 0 AI-default location names (33 tracked, −20% each)
codexExemptions: (empty)
found: (empty)
6.4171%
AI-ism word frequency
Target: <1% AI-ism words (140 tracked)
wordCount: 1496
totalAiIsms: 14
found: "flicked" (count 5), "quivered" (count 2), "fluttered" (count 1), "weight" (count 1), "dancing" (count 2), "echoes" (count 1), "etch" (count 1), "etched" (count 1)
highlights: "flicked", "quivered", "fluttered", "weight", "dancing", "echoes", "etch", "etched"
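The two frequency-based scores in this report (93.3155% for AI-ism adverbs, 6.4171% for AI-ism words) are both consistent with a simple linear penalty against the 1% target. The sketch below is an inference from the reported numbers only, not NC Bench's documented scoring rule; the function name `frequency_score` and the `threshold` parameter are hypothetical.

```python
def frequency_score(hits: int, word_count: int, threshold: float = 0.01) -> float:
    """Assumed rule: score falls linearly from 1.0 to 0.0 as the hit
    frequency rises from 0 to the threshold (1% here), then stays at 0."""
    frequency = hits / word_count
    return max(0.0, 1.0 - frequency / threshold)


# Counts taken from the report: 1 tracked adverb and 14 tracked words
# in a 1496-word response.
adverb_score = frequency_score(hits=1, word_count=1496)
word_score = frequency_score(hits=14, word_count=1496)

print(f"AI-ism adverb frequency: {adverb_score:.4%}")
print(f"AI-ism word frequency:   {word_score:.4%}")
```

Under this assumed rule, 14 hits in 1496 words is a frequency of about 0.94%, just under the 1% target, which would explain why the score is close to zero even though the target is nominally met.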
100.0000%
Cliche density
Target: ≤1 cliche(s) per 800-word window
totalCliches: 0
maxInWindow: 0
found: (empty)
highlights: (empty)
100.0000%
Emotion telling (show vs. tell)
Target: ≤3% sentences with emotion telling
emotionTells: 0
narrationSentences: 103
matches: (empty)
100.0000%
Filter word density
Target: ≤12% sentences with filter/hedge words
filterCount: 1
hedgeCount: 0
narrationSentences: 103
filterMatches: "watch"
hedgeMatches: (empty)
100.0000%
Overuse of "that" (subordinate clause padding)
Target: ≤10% sentences with "that" clauses
thatCount: 1
totalSentences: 124
matches: "secure that service"
80.4248%
Paragraph length variance
Target: CV ≥0.5 for paragraph word counts
totalParagraphs: 31
mean: 47.45
std: 20.47
cv: 0.431
sampleLengths: 71, 71, 70, 40, 46, 41, 19, 39, 28, 65, 30, 42, 45, 62, 19, 90, 37, 88, 23, 58, 52, 31, 56, 29, 73, 29, 57, 73, 28, 12, 47
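The reported mean, std, and cv for paragraph length can be reproduced from the 31 paragraph word counts listed above. The sketch below assumes the evaluator uses the population standard deviation (dividing by n, not n - 1); that is an inference from the fact that it reproduces the reported figures, not something the report states.

```python
from statistics import mean, pstdev

# Paragraph word counts as listed under sampleLengths in the report.
paragraph_lengths = [
    71, 71, 70, 40, 46, 41, 19, 39, 28, 65, 30, 42, 45, 62, 19, 90,
    37, 88, 23, 58, 52, 31, 56, 29, 73, 29, 57, 73, 28, 12, 47,
]

mu = mean(paragraph_lengths)        # average paragraph length in words
sigma = pstdev(paragraph_lengths)   # population standard deviation (assumed)
cv = sigma / mu                     # coefficient of variation

print(f"mean={mu:.2f} std={sigma:.2f} cv={cv:.3f}")
```

A cv of about 0.431 falls short of the ≥0.5 target, which is consistent with this evaluator scoring below 100%.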
100.0000%
Passive voice overuse
Target: ≤5% passive sentences
passiveCount: 2
totalSentences: 103
matches: "were flicked", "was filed"
100.0000%
Past progressive (was/were + -ing) overuse
Target: ≤10% past progressive verbs
pastProgressiveCount: 0
totalVerbs: 176
matches: (empty)
100.0000%
Purple prose (modifier overload)
Target: <4% adverbs, <2% -ly adverbs, no adj stacking
wordCount: 890
adjectiveStacks: 0
stackExamples: (empty)
adverbCount: 21
adverbRatio: 0.02359550561797753
lyAdverbCount: 2
lyAdverbRatio: 0.0022471910112359553
100.0000%
Repeated phrase echo
Target: ≤20% sentences with echoes (window: 2)
totalSentences: 124
echoCount: 0
echoWords: (empty)
100.0000%
Sentence length variance
Target: CV ≥0.4 for sentence word counts
totalSentences: 124
mean: 11.86
std: 9
cv: 0.759
sampleLengths: 20, 14, 18, 19, 14, 6, 18, 11, 22, 4, 24, 9, 11, 14, 8, 5, 27, 8, 5, 8, 8, 12, 7, 6, 4, 9, 15, 4, 7, 2, 4, 11, 4, 8, 17, 2, 12, 8, 20, 40, 11, 14, 6, 24, 1, 3, 6, 14, 6, 9
68.2796%
Sentence opener variety
Target: ≥60% unique sentence openers
consecutiveRepeats: 7
diversityRatio: 0.4435483870967742
totalSentences: 124
uniqueOpeners: 55
25.0000%
Dialogue tag variety (said vs. fancy)
Target: ≤30% fancy dialogue tags
totalTags: 5
fancyCount: 3
fancyTags: "she nodded at the far end of the platform (nod)", "Eva whispered (whisper)", "Quinn murmured (murmur)"
86.6728%