NC Bench
Run: 2026-03-11-01KKFJK2CBG72HQB3HA1F2SHCB
Run Details
Model: Stealth: Healer Alpha
Test: Dialogue tags
Scenario: Write 500 words with 50% dialogue
Duration: 16.6s
Cost: $0.000000
Provider: Stealth
Evaluation summary
Score      Evaluator
0.00%      Matches word count
0.00%      Dialogue to Total Word Ratio
0.0000%    Overall
Model input
User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.
Model response
Messages
Meta
AI
The cramped back room of Molly and Bill’s rented basement smelled of stale coffee and desperation. A single bulb dangled overhead, casting sharp shadows across the walls, which were plastered with children’s birthday party flyers swiped from community boards. Molly stood before a makeshift corkboard, her finger tracing a crudely drawn map of the “Sunshine Play & Cake” venue. Bill sat slumped in a creaking chair, squinting at the diagram as if it were written in ancient runes.

“Okay, listen,” Molly began, her voice a low, conspiratorial whisper. “The target is the birthday bash for one Timmy Foster, age six, at the community center tomorrow at two p.m. The haul? All the presents. Wrapped, likely containing toys of significant resale value.”

Bill leaned forward, knocking over an empty soda can. “Whoa, presents? Like, for kids? Isn’t that kind of low, Molly? Stealing from little Timmy? He probably saved up his allowance for a whole week to buy those.”

Molly sighed, pinning a photo of a smiling clown to the board. “First, Timmy didn’t buy them. His parents and relatives did. Second, our financial situation is precarious. Third, this is a heist. It’s what we do. The moral calculus of a six-year-old’s birthday party is not our concern.”

“Right, right. Moral calculus.” Bill nodded, scratching his chin. “So, the security? It’s a party, right? Balloons, maybe a piñata? That’s our biggest threat? A blindfolded kid swinging a stick?”

“The security,” Molly said, tapping a diagram of the center’s layout, “includes two harried parents, a hired teenager in a clown suit named ‘Giggles,’ and approximately fifteen over-sugared children. The presents are staged on a table near the entrance. We need to extract them without triggering the alarm, which in this case, is likely the collective scream of the partygoers.”

“How do we get in?” Bill asked, eyeing a picture of a cartoon bear mascot. “We can’t just walk in.”

“Precisely. I’ve procured uniforms.” Molly reached under the table and pulled out two sets of bright yellow overalls with a friendly badger emblem stitched on the chest. “We pose as the new entertainers. ‘Buzzy and Bubbles, the Balloon Animals Duo.’ You, Bill, are Buzzy. I am Bubbles. Our mission: create a diversion.”

Bill held up the overalls, his face wrinkling with doubt. “These look like they were made for toddlers. And I’m Buzzy? Why do I have to be Buzzy? Buzzy sounds like a bee who’s lost his sting. I want to be Bubbles. Bubbles has more… gravitas.”

“Because you’re the one who gets stage fright and might accidentally squeak,” Molly replied flatly. “The plan is simple. At the height of the party, I’ll start making balloon animals—poorly, on purpose. This will attract a swarm of kids. You, meanwhile, will approach the present table under the guise of ‘clearing away trash.’ The wrapping paper we’ve collected from other bins will serve as cover. We stuff the gifts into our oversized clown pants.”

“Clown pants? We don’t have clown pants. We have overalls.”

“Improvisation, Bill. The overalls have deep pockets. We’ll use the gift bags themselves as secondary containers. It’s a double-layer concealment strategy.”

Bill’s eyes widened. “That’s… actually not terrible. But what if Giggles the clown notices? He’s supposed to be professional, right? A rival in the entertainment field?”

Molly smirked. “Giggles is seventeen, perpetually on his phone, and more interested in a TikTok trend than policing the present table. He’s a non-factor. Our biggest risk is Timmy himself. He has a keen eye for patterns and might recognize that Buzzy and Bubbles are suspiciously focused on the loot.”

“So we distract Timmy too? With a balloon dog?”

“Exactly. I’ll craft him a ‘super special birthday poodle’ that will consume his attention for approximately four minutes. That’s our window.”

Bill stood up, the chair groaning in protest. “Alright. I’m in. Buzzy and Bubbles. It’s so ridiculous, it might just work.”

“Comedy is the best cover for crime,” Molly said, a rare smile touching her lips. “Now, let’s practice our balloon-twisting. And for heaven’s sake, don’t pop it this time.”

As Bill fumbled with a long, skinny balloon, it let out a sharp hiss and exploded in his hands. “Oops.”

Molly just shook her head, but the corner of her mouth twitched upward. Tomorrow, the Sunshine Play & Cake center would never know what hit it—or rather, what Buzzy and Bubbles smoothly pilfered while everyone was laughing.
created_at: "2026-03-11T23:09:27.319Z" (3/11/2026, 11:09:27 PM)
model: "openrouter/healer-alpha"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 763
moderation_latency: (null)
generation_time: 16505
tokens_prompt: 77
tokens_completion: 1991
native_tokens_prompt: 2147
native_tokens_completion: 1682
native_tokens_completion_images: (null)
native_tokens_reasoning: 775
native_tokens_cached: 2092
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0
router: (null)
provider_responses: (null)
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1773270567-3BcYsrKtszV5UYbT86Nm"
upstream_id: "gen-1773270567-3BcYsrKtszV5UYbT86Nm"
total_cost: 0
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Stealth"
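
The metadata above can be turned into a couple of derived figures. The following is a minimal sketch, not part of the benchmark itself. It assumes that generation_time is in milliseconds (plausible given the reported 16.6s duration, but not stated on the page) and that native_tokens_cached counts prompt tokens served from cache. The function names are hypothetical, and whether native_tokens_completion includes the reasoning tokens is not specified here.

```python
# Values copied from the run metadata above.
meta = {
    "generation_time": 16505,          # assumed to be milliseconds
    "native_tokens_prompt": 2147,
    "native_tokens_completion": 1682,
    "native_tokens_cached": 2092,
}


def tokens_per_second(tokens: int, elapsed_ms: int) -> float:
    """Completion throughput, assuming elapsed_ms is in milliseconds."""
    if elapsed_ms <= 0:
        raise ValueError("elapsed_ms must be positive")
    return tokens / (elapsed_ms / 1000)


def cached_fraction(cached: int, prompt: int) -> float:
    """Share of prompt tokens reported as cached."""
    if prompt <= 0:
        raise ValueError("prompt must be positive")
    return cached / prompt


throughput = tokens_per_second(
    meta["native_tokens_completion"], meta["generation_time"]
)
cache_share = cached_fraction(
    meta["native_tokens_cached"], meta["native_tokens_prompt"]
)

print(f"throughput: {throughput:.1f} tokens/s")
print(f"cached prompt share: {cache_share:.1%}")
```

Under these assumptions the run works out to roughly 102 native completion tokens per second, with about 97% of the native prompt tokens reported as cached.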
Evaluation details
Result: 0.00%
Evaluator: Matches word count
Details: n/a
Meta Data: neededClean: false, words: 735

Result: 0.00%
Evaluator: Dialogue to Total Word Ratio
Details: Ratio: 67.48%, Deviation: 17.48%
Meta Data: neededClean: false, wordsTotal: 741, wordsDialogue: 500

Overall: 0.0000%
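
The reported ratio and deviation follow directly from the word counts in the metadata. The sketch below reproduces that arithmetic and adds a rough dialogue counter. It is an illustration only: the 0.5 target is taken from the scenario ("50% dialogue"), while the counting rule (words inside curly double quotes, split on whitespace) and all function names are assumptions, since the page does not describe how the evaluator tokenizes text or how deviation maps to a score.

```python
import re

# Hypothetical rule: anything between curly double quotes counts as dialogue.
DIALOGUE_SPAN = re.compile(r"“([^”]*)”")


def count_words(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())


def count_dialogue_words(text: str) -> int:
    """Count words inside curly double quotes (an assumed definition)."""
    return sum(count_words(span) for span in DIALOGUE_SPAN.findall(text))


def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Dialogue words as a fraction of all words."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return words_dialogue / words_total


def deviation_from_target(ratio: float, target: float = 0.5) -> float:
    """Absolute distance between the measured ratio and the target."""
    return abs(ratio - target)


# Counts reported by the evaluator for this run.
ratio = dialogue_ratio(words_dialogue=500, words_total=741)
deviation = deviation_from_target(ratio)
print(f"Ratio: {ratio:.2%}, Deviation: {deviation:.2%}")

# Tiny usage example for the hypothetical counter.
sample = "“Okay, listen,” Molly began. “We go at two.”"
print(count_dialogue_words(sample), "of", count_words(sample), "words are dialogue")
```

A quote-based rule like this would also count quoted names such as “Sunshine Play & Cake” as dialogue, so its totals need not match the evaluator's. The two evaluators also report different totals (735 and 741 words), which suggests they count words differently.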