NC Bench
Run: 2026-03-11-01KKFC6FHA8SH9C56H7B300211
Run Details
Model: Stealth: Hunter Alpha
Test: Dialogue tags
Scenario: Write 500 words with 30% dialogue
Duration: 23.1s
Cost: $0.000000
Provider: Stealth
Evaluation summary
Score | Evaluator
0.00% | Matches word count
0.00% | Dialogue to Total Word Ratio
0.0000% | Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 30% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The blueprint of the "Rainbow Castle Fun Zone" lay spread across the card table, held down at each corner by a half-eaten cheese puff. Molly, her nose inches from the paper, traced a path with a glittery pen she'd stolen from a stationary store last week. Bill was by the window, practicing his "non-threatening party clown" smile in the reflection of a toaster. "The objective is clear," Molly said, not looking up. "Approximately forty-seven gaudily-wrapped presents, ranging from board games to battery-powered noise-makers. Our extraction window is short. Cake-cutting begins at 15:00, providing maximal chaos." Bill turned, his smile dropping into a look of genuine concern. "Forty-seven? That's a lot of legos, Molly. My van's suspension is still recovering from the pet store heist. All those hamster wheels had a deceptive heft." "Focus, Bill. We're not stealing the hamsters. This is a targeted acquisition of pre-looted goods. The children have already done the hard work of identifying the valuable loot. We just need to… facilitate a transfer." "Right. A transfer." Bill wandered over, picking up a cheese puff and examining it like a jewel. "So, what's my role? Am I the distraction? I could be 'Wobbles the Wonder Clown.' I've been working on my patter. Want to hear?" "God, no," Molly said, finally looking up. Her eyes were bright with tactical fervor. "Your role is logistical support and interior acquisition. You will enter via the rear service corridor, disguised as a health inspector." "A health inspector? At a kids' party?" "The perfect cover. No one questions authority. You will have a clipboard. You will look sternly at the bouncy castle's air pressure. While the adults are preoccupied with imagined violations, you will begin consolidating the presents from the backyard play area into the pre-positioned laundry bins." "Laundry bins. Classy." "Meanwhile," she continued, ignoring him, "I will be operating as 'Brenda, the surprisingly fun aunt.' 
My job is to create a diversion during the critical cake phase. A well-timed, 'Ooh, is that a clown over by the fence?' should draw the children, and their parents, toward the front of the property. That's your cue to move the bins to the alley." Bill squinted at the blueprint. "What about the big one? The one shaped like a rocket ship. That's gotta be the main score." Molly nodded solemnly. "The 'Galactic Explorer Playset.' Intelligence suggests it contains over 200 pieces. That is our crown jewel. You will handle its extraction personally. Under no circumstances are you to shake it." "Noted. No shaking." Bill paused. "What if I get cornered? What's my exit line?" "You are a health inspector, Bill. You don't make small talk. You simply nod curtly and say, 'I'll be filing my report.' Then you walk, briskly, to the alley. The getaway driver—" "—is my cousin Lenny, in the donut truck. Got it." "Lenny is to remain with the engine running. The donut smell will also help mask the scent of plastic and fresh cardboard, should any patrol cars get nosy." Bill finally smiled, a real one this time. "Molly, this is either genius or the dumbest thing we've ever done." "Those categories are not mutually exclusive, Bill. Now, let's go over the wrapping paper. We need to avoid the noisy kind." She held up a sample, crinkling it experimentally. The sound was deafening in the small room. "See? Total operational failure." "Right," Bill said, grabbing another cheese puff. "Quiet paper. Loud health inspector. Got it. When do we start?" Molly checked her watch, a stolen item with a cracked face. "The party commences in two hours. We roll in thirty minutes. Try not to eat all the props, Bill. We may need to bribe a clown."
created_at: "2026-03-11T21:17:43.351Z" (3/11/2026, 9:17:43 PM)
model: "openrouter/hunter-alpha"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 2448
moderation_latency: (null)
generation_time: 22918
tokens_prompt: 77
tokens_completion: 1190
native_tokens_prompt: 2111
native_tokens_completion: 1056
native_tokens_completion_images: (null)
native_tokens_reasoning: 228
native_tokens_cached: 1984
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0
router: (null)
provider_responses: (null)
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1773263863-Q76MkHpn5sX8vCuN1aRz"
upstream_id: "gen-1773263863-Q76MkHpn5sX8vCuN1aRz"
total_cost: 0
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Stealth"
Evaluation details
Result | Evaluator | Details | Meta Data
0.00% | Matches word count | n/a | neededClean: false, words: 617
0.00% | Dialogue to Total Word Ratio | Ratio: 73.80%, Deviation: 43.80% | neededClean: false, wordsTotal: 626, wordsDialogue: 462
0.0000% | Overall
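The reported ratio and deviation can be reproduced from the metadata with simple arithmetic. The Python sketch below is illustrative only: the function names are hypothetical, and the page does not document the evaluator's tokenization or how it turns a deviation into a score. It assumes words are whitespace-separated tokens, dialogue is any text between straight double quotes, and deviation is the absolute difference from the 30% target. The two evaluators report different totals (617 and 626 words), which suggests they count words differently, so this sketch may not match either exactly on the raw text.

```python
import re


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (an assumed definition of 'word')."""
    return len(text.split())


def dialogue_words(text: str) -> int:
    """Count words inside straight double quotes (assumed dialogue rule).

    Typographic quotes are not handled in this sketch.
    """
    spans = re.findall(r'"([^"]*)"', text)
    return sum(count_words(span) for span in spans)


def ratio_and_deviation(words_dialogue: int, words_total: int,
                        target: float) -> tuple[float, float]:
    """Return the dialogue ratio and its absolute deviation from the target."""
    ratio = words_dialogue / words_total
    return ratio, abs(ratio - target)


# Values taken from the Meta Data column; 0.30 is the 30% target in the scenario.
ratio, deviation = ratio_and_deviation(462, 626, 0.30)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")
# prints "Ratio: 73.80%, Deviation: 43.80%"
```

With 462 dialogue words out of 626, the response is roughly 74% dialogue against a 30% target, and at 617 words it also overshoots the requested 500, which is consistent with both evaluators scoring 0.00%.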