Run: 2026-02-20-01KHY9KV2EPKSH9RS2ND343J4T

Run Details

Duration: 42.2s
Cost: $0.003443
Provider: Seed

Evaluation summary

Score      Evaluator                      Details Link
86.38%     Matches word count             (details)
0.37%      Dialogue to Total Word Ratio   (details)
43.3771%   Overall

Model input

User/Human
You are a author of comedic heist books. Please write a 500 word scene that has 70% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.

Model response

AI
Molly slams a crumpled glittery party invitation on the diner booth’s sticky Formica, straw from her root beer dangling from one corner of her smirk. Bill runs a finger over the invitation’s sparkly edge, nose wrinkling at the flecks sticking to his skin. “You see the fine print? The birthday kid’s a tech billionaire’s nephew—presents aren’t just sticker books.” “I heard the Lego Millennium Falcon’s in there. The 7,541-piece one.” “Exactly. But the catch? The party’s got a ‘clown security detail.’” “Clowns? Not the honking-horn, squirting-flower kind?” “Worse—ex-circus bouncers in greasepaint. They can bench-press a pony.” “So what’s the play? Disguises?” “I was thinking magician’s assistant. You can be the magician—terrible one, so everyone’s distracted by your flubbed tricks.” “Terrible magician? I can’t even pull a rabbit from a hat without it biting me.” “Perfect. They’ll write you off as a hack. I’ll sneak to the present table while you’re fumbling a saw trick.” “What about the pinata? I heard it’s got a hidden camera.” “Already checked—its lens is covered in a layer of lollipop goo. Useless.” “And the toy chest? I saw a photo—locked with a plastic combination lock shaped like a dinosaur.” “Dinosaur lock? Kid’s obsessed with T-Rexes. Combination’s his birthday: 0317. Got it from the invitation’s back.” “Wait—what if a kid spots you?” “I’ll say I’m rearranging presents for the ‘grand reveal.’ Kids love grand reveals.” “What if the clown bouncer catches you?” “I’ve got a pocket full of whoopee cushions. Deploy one, they’ll all lose it laughing.” “Whoopee cushions? That’s your master plan?” “Works better than a taser for a kids’ party. Can’t risk scaring the kids—cops show up faster if there’s screaming.” “Fair. What about the escape?” “We’ll slip out through the backyard’s treehouse. I tied a mylar balloon rope to it last night—leads to the alley where the van’s parked.” “Mylar balloon? What if it pops?” “Indestructible. Got it from the dollar store. 
No feather trails this time—learned my lesson from the rubber chicken diamond heist.” “Don’t remind me. That chicken left a trail straight to the van.” “This time’s different. No biting rabbits, no feather messes. Just flubbed tricks and whoopee cushions.” “Wait—what if I flub the saw trick so bad I get kicked out before you can sneak the presents?” “Then I’ll blame you and take the Lego set for myself. Deal?” “Deal. But if a clown bouncer benches me into a cake, I’m hiding in the treehouse until dawn.” “Cake detour’s acceptable—just don’t get frosting on the Lego set. Kid’s billionaire uncle will notice if it’s sticky.” Molly taps her boot against the booth’s wobbly leg, pocket crinkling with whoopee cushions as she grabs a fry from Bill’s plate. Bill sighs, folding the invitation into a lopsided paper airplane and flicking it at a nearby salt shaker, already dreading the rabbit he’d have to borrow for his flubbed trick.
created_at: "2026-02-20T19:49:48.249Z" (2/20/2026, 7:49:48 PM)
model: "bytedance-seed/seed-1.6-20250625"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 1115
moderation_latency: (null)
generation_time: 42115
tokens_prompt: 77
tokens_completion: 1795
native_tokens_prompt: 147
native_tokens_completion: 1703
native_tokens_completion_images: (null)
native_tokens_reasoning: 941
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.00344275
router: (null)
provider_responses:
  0:
    endpoint_id: "d85c3862-4e0d-4f8d-955b-babc62f98124"
    id: "02177161698838245c07e4ff35b1f55227084b3d76e9a8084b881"
    is_byok: false
    latency: 1115
    model_permaslug: "bytedance-seed/seed-1.6-20250625"
    provider_name: "Seed"
    status: 200
api_type: "completions"
id: "gen-1771616988-3lg9reK0H6NQlZPyd9h6"
upstream_id: "02177161698838245c07e4ff35b1f55227084b3d76e9a8084b881"
total_cost: 0.00344275
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Seed"

Evaluation details

Result     Evaluator                      Details                            Meta Data
86.38%     Matches word count             n/a                                neededClean: false, words: 489
0.37%      Dialogue to Total Word Ratio   Ratio: 80.57%, Deviation: 10.57%   neededClean: false, wordsTotal: 489, wordsDialogue: 394
43.3771%   Overall
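
The ratio and deviation reported above can be reproduced from the word counts in the meta data. The following is a minimal sketch, not the benchmark's own evaluator code: the function and variable names are hypothetical, the 70% target is taken from the prompt, and treating the overall score as an unweighted mean of the two evaluator scores is an assumption inferred from the numbers. The formulas that map the raw measurements to the 86.38% and 0.37% scores are not shown in this report and are not reconstructed here.

```python
# Counts copied from the "Evaluation details" meta data in this report.
WORDS_TOTAL = 489
WORDS_DIALOGUE = 394
TARGET_RATIO_PCT = 70.0  # the prompt asks for 70% of the words to be dialogue


def dialogue_ratio_pct(words_dialogue: int, words_total: int) -> float:
    """Share of dialogue words in the whole scene, as a percentage."""
    return 100.0 * words_dialogue / words_total


ratio_pct = dialogue_ratio_pct(WORDS_DIALOGUE, WORDS_TOTAL)

# Deviation is the absolute gap between the measured ratio and the target.
deviation_pct = abs(ratio_pct - TARGET_RATIO_PCT)

# Assumption: "Overall" is the unweighted mean of the evaluator scores.
# The report does not state how the overall score is aggregated.
evaluator_scores_pct = [86.38, 0.37]
overall_estimate_pct = sum(evaluator_scores_pct) / len(evaluator_scores_pct)

print(f"ratio={ratio_pct:.2f}%")
print(f"deviation={deviation_pct:.2f}%")
print(f"overall (estimated)={overall_estimate_pct:.3f}%")
```

Rounded to two decimals, the ratio and deviation agree with the reported 80.57% and 10.57%. The estimated mean lands within a few thousandths of a percentage point of the reported 43.3771%; the small gap would be consistent with the report averaging unrounded scores, though that is not confirmed by the page.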