Run: 2026-02-24-01KJ8VZZGE1490ET5Z7ASNQRM8

Run Details

Model: Aion 2.0
Duration: 28.3s
Cost: $0.001862
Provider: AionLabs

Evaluation summary

Score     Evaluator
95.99%    Matches word count
0.00%     Dialogue to Total Word Ratio
47.9934%  Overall

Model input

User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 50% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.

Model response

AI
The back of the van smelled of old fast food and desperation. Molly, her face illuminated by the glow of a laptop balanced on a milk crate, pointed a laser pen at a schematic taped to the wall. Bill sat on an upturned bucket, nervously chewing on a licorice whip.
“This is the target,” Molly said, her voice low and serious. “Six-year-old Timothy Henderson’s birthday party. Saturday, 2 PM to 5 PM. The presents will be stacked on a table here, adjacent to the cake station.”
Bill squinted. “What’s the security like?”
“Tight. You’ve got two primary sentries: the mother, Karen, and the father, Dave. Karen is hyper-vigilant, likely armed with wet wipes. Dave will be manning the grill, a distraction we can use. Secondary threats include a possibly observant grandmother, and a hired entertainer named ‘Jingles the Clown.’ Intel suggests he’s mostly focused on balloon animals.”
“A clown,” Bill groaned. “I hate clowns. They’re unpredictable.”
“Which is why we go in during the piñata frenzy. Maximum chaos. Our window is three minutes, starting when the first blindfolded child swings. You create a diversion by ‘accidentally’ setting off the car alarm on the silver Prius. I’ll move in, empty the table, and exit through the side gate.”
Bill nodded, then frowned. “What’s the loot profile?”
Molly clicked to a new image. “Mixed bag. Confirmed sightings of a ‘Super Spy Kid’ watch, a junior chemistry set, and several gift cards of undetermined value. The big score is rumored to be a ‘Zoomer Turbo X’ remote-controlled car. High-end, limited edition.”
“A toy car? Molly, we usually go for jewels.”
“Jewels are guarded by lasers and ex-SAS. This is guarded by a man wearing a ‘Kiss the Chef’ apron and a clown making a poodle out of latex. It’s a soft target, Bill. A cakewalk.”
“I don’t like cake,” Bill muttered. “What about extraction?”
“Standard. We load the goods into the party bags we’ve modified with false bottoms. We walk out looking like disgruntled parents. The van is parked two blocks away, near the dog park.”
Bill shifted on his bucket. “It feels… wrong. Stealing from kids.”
Molly fixed him with a steely gaze. “Every heist feels wrong until you’re spending the money. These are not ‘kids,’ Bill. They are unwitting asset holders. And their assets are poorly secured. We’re not monsters. We’ll leave the cake.”
“What about the piñata candy?”
“That’s fair game if you can grab it during the exfil. But stay focused on the primary objective.” She closed the laptop. “We go in soft, quiet, and fast. We are professionals. This is what we do.”
Bill sighed, looking at the schematic of the suburban backyard. “I just hope the clown isn’t one of those ones that makes sarcastic comments.”
“If he does,” Molly said, slapping his shoulder, “we’ll steal his oversized shoes, too. Now, let’s run through the Prius alarm sequence again. You’re still hitting the panic button instead of the unlock.”
created_at: "2026-02-24T22:23:24.693Z" (2/24/2026, 10:23:24 PM)
model: "aion-labs/aion-2.0-20260223"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 2487
moderation_latency: (null)
generation_time: 28228
tokens_prompt: 77
tokens_completion: 1216
native_tokens_prompt: 170
native_tokens_completion: 1079
native_tokens_completion_images: (null)
native_tokens_reasoning: 401
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.0018624
router: (null)
provider_responses:
  0:
    endpoint_id: "1e93b0bb-eb0f-49bc-8fda-947105413b86"
    id: "402e7f63-ab7d-45b5-85e5-757ecc4d01ad"
    is_byok: false
    latency: 221
    model_permaslug: "aion-labs/aion-2.0-20260223"
    provider_name: "AionLabs"
    status: 200
api_type: "completions"
id: "gen-1771971804-9IogKNqQT9aubiia9JCc"
upstream_id: "402e7f63-ab7d-45b5-85e5-757ecc4d01ad"
total_cost: 0.0018624
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "AionLabs"

Evaluation details

Result    Evaluator                     Details                           Meta Data
95.99%    Matches word count            n/a                               neededClean: false, words: 492
0.00%     Dialogue to Total Word Ratio  Ratio: 78.99%, Deviation: 28.99%  neededClean: false, wordsTotal: 495, wordsDialogue: 391
47.9934%  Overall
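
The reported figures can be cross-checked with a little arithmetic. The sketch below is not the evaluator's own code. It takes the word counts from the meta data above, takes the 50% target from the prompt, and assumes that the overall score is the unweighted mean of the two evaluator scores. That assumption is an inference from the numbers, not something the page states. How the 95.99% word count score and the 0.00% ratio score are derived is not shown on the page, so they are used as given.

```python
# Values copied from the evaluation details above.
words_total = 495
words_dialogue = 391
target_ratio = 0.50  # from the prompt: "50% words of dialogue"

# Share of dialogue words and its distance from the requested share.
ratio = words_dialogue / words_total
deviation = abs(ratio - target_ratio)

# Reported per-evaluator scores, rounded as shown on the page.
word_count_score = 95.99
dialogue_score = 0.00

# Assumption: overall = unweighted mean of the two evaluator scores.
overall = (word_count_score + dialogue_score) / 2

print(f"ratio:     {ratio:.2%}")
print(f"deviation: {deviation:.2%}")
print(f"overall:   {overall:.3f}%")
```

The ratio and deviation reproduce the reported 78.99% and 28.99%. The computed mean differs from the reported 47.9934% only in the third decimal place, which is consistent with the word count score having been rounded to 95.99% for display.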