Run: 2026-03-11-01KKFG0ANZ2XM8BXBVP0JGTGFA

Run Details

Duration: 8.2s
Cost: $0.000000
Provider: Stealth

Evaluation summary

Score      Evaluator
0.00%      Matches word count
95.68%     Dialogue to Total Word Ratio
47.8415%   Overall

Model input

User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.

Model response

AI
Molly pressed her nose against the cold glass of the patio door. "Okay. The loot is concentrated in three zones: the main pile by the bouncy castle, the secondary cache near the juice fountain, and the overflow by the clown." Bill, adjusting his tiny ski mask, peered over her shoulder. "The clown. Always the clown. Creepy." "Focus. Security is minimal. One parent is passed out on a folding chair. The other two are engaged in a deeply philosophical debate about organic versus conventional fruit snacks." "So, a frontal assault?" "Don't be an idiot. We go in through the dog flap. It's a Dalmatian-sized flap, Bill. You can make it." "I have a very elegant frame," he protested. Molly unrolled a crinkled napkin with a crayon schematic. "Extraction is key. I've identified a pneumatic tube system—it's the ball pit. We load the packages into the blue chute, pull the lever, and the contents are delivered directly to our van's sunroof." Bill stared. "You want to steal presents... via ball pit?" "It's inspired. Now, about the cake as a diversionary explosive..." "Wait, what? No. Absolutely not. That's a buttercream line I won't cross." "Fine. We'll just use the karaoke machine. No parent can resist the call of 'Baby Shark' at maximum volume. It'll buy us twelve minutes." "Twelve minutes to raid a party for a giant teddy bear and three board games with missing pieces." "The *principle*, Bill. The *principle*."
created_at: "2026-03-11T22:24:16.076Z" (3/11/2026, 10:24:16 PM)
model: "openrouter/healer-alpha"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 3059
moderation_latency: (null)
generation_time: 8039
tokens_prompt: 77
tokens_completion: 601
native_tokens_prompt: 2147
native_tokens_completion: 515
native_tokens_completion_images: (null)
native_tokens_reasoning: 224
native_tokens_cached: 4
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0
router: (null)
provider_responses: (null)
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1773267856-s6Vkr7PvIkD3IuRE9Blk"
upstream_id: "gen-1773267856-s6Vkr7PvIkD3IuRE9Blk"
total_cost: 0
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Stealth"

Evaluation details

Result     Evaluator                      Details                            Meta Data
0.00%      Matches word count             n/a                                neededClean: false, words: 239
95.68%     Dialogue to Total Word Ratio   Ratio: 85.42%, Deviation: 4.58%    neededClean: false, wordsTotal: 240, wordsDialogue: 205
47.8415%   Overall
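
The reported figures can be sanity-checked with a few lines of arithmetic. The sketch below is not the evaluator's code, and the function names are hypothetical. It assumes that the ratio is dialogue words divided by total words, that the deviation is the absolute distance from the 90% target stated in the prompt, and that the overall score is the plain mean of the two evaluator scores. The report does not show how the 95.68% score is derived from the 4.58% deviation, so that score is taken as given rather than recomputed.

```python
def dialogue_ratio_stats(words_dialogue, words_total, target_pct):
    """Return (ratio, deviation) in percent.

    Assumption: ratio = dialogue words / total words, and deviation is the
    absolute gap between that ratio and the target percentage.
    """
    ratio = 100.0 * words_dialogue / words_total
    deviation = abs(target_pct - ratio)
    return ratio, deviation


def mean_score(scores):
    """Assumption: the overall score is the unweighted mean of the evaluator scores."""
    return sum(scores) / len(scores)


# Values copied from the evaluation details in this report.
ratio, deviation = dialogue_ratio_stats(words_dialogue=205, words_total=240, target_pct=90.0)
print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")

# Scores as displayed in the report (0.00% and 95.68%).
overall = mean_score([0.00, 95.68])
print(f"Overall: {overall:.2f}%")
```

Under these assumptions the ratio and deviation reproduce the reported 85.42% and 4.58%, and the mean of the displayed scores agrees with the reported 47.8415% up to the rounding of the 95.68% figure.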