Run: 2026-03-11-01KKFDMREKDFDXJEKT8P9EYT37

Run Details

Duration: 44.9s
Cost: $0.003717
Provider: Seed

Evaluation summary

Score     Evaluator
35.00%    Matches word count
93.06%    Dialogue to Total Word Ratio
64.0314%  Overall

Model input

User/Human
You are a author of comedic heist books. Please write a 200 word scene that has 90% words of dialogue between two thieves named Molly and Bill who are planning to break into a children's party to steal all the presents. Only include the prose of the scene without any markdown or commentary.

Model response

AI
Molly and Bill hunkered behind a thorny hawthorn hedge at the edge of a suburban lawn, unicorn-themed party streamers flapping three feet from their heads, a toddler’s birthday playlist thumping over the fence. “Bill, if you so much as hum the first line of Baby Shark one more time, I’m leaving you here to get tackled by a pack of seven-year-olds hopped up on pixy stix.” “C’mon, it relaxes me! What’s the big emergency? I thought the plan was solid.” “The plan required you to order a 20-topping pizza to drop off at the front door to lure all the parents away. Why’s that greasy box in your backpack?” “The pizza place put pineapple on it! No self-respecting suburban parent would touch a pineapple pizza at their own kid’s party, what was I gonna do, waste $40?” “Your backup plan is to steal the pizza? We’re here to steal birthday presents, remember? Your kid’s Lego set isn’t gonna build itself.” “Hey, I saw a $50 Roblox gift card through the picture window, and that big Kool-Aid punch bowl? I could drink that whole thing. Also, can we grab the stuffed unicorn on the picnic table? My cat needs a new pillow.” “Your cat doesn’t need a unicorn, Bill—wait, is that the birthday clown heading this way? Run.”
created_at: "2026-03-11T21:42:59.807Z" (3/11/2026, 9:42:59 PM)
model: "bytedance-seed/seed-2.0-lite-20260309"
app_id: 182717
external_user: (null)
streamed: true
cancelled: false
latency: 763
moderation_latency: (null)
generation_time: 44836
tokens_prompt: 77
tokens_completion: 1885
native_tokens_prompt: 115
native_tokens_completion: 1844
native_tokens_completion_images: (null)
native_tokens_reasoning: 1547
native_tokens_cached: 0
num_media_prompt: (null)
num_input_audio_prompt: (null)
num_media_completion: 0
num_search_results: (null)
origin: "https://ncbench.com/"
is_byok: false
finish_reason: "stop"
native_finish_reason: "stop"
usage: 0.00371675
router: (null)
provider_responses:
  0:
    endpoint_id: "bf1a3473-7a0a-46d3-aaa5-adb8f876a697"
    id: "0217732653802443a96cb89bb0639877068f7507e34a86a45ac58"
    is_byok: false
    latency: 763
    model_permaslug: "bytedance-seed/seed-2.0-lite-20260309"
    provider_name: "Seed"
    status: 200
user_agent: "langchainjs-openai/1.0.0 ((node/v24.14.0; linux; x64))"
http_referer: (null)
api_type: "completions"
id: "gen-1773265379-uzKh7QBuAn3OxzG3wecB"
upstream_id: "0217732653802443a96cb89bb0639877068f7507e34a86a45ac58"
total_cost: 0.00371675
cache_discount: (null)
upstream_inference_cost: 0
provider_name: "Seed"
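
The logged usage figure can be sanity-checked against the native token counts. The sketch below is only a plausibility check: the per-million-token rates are assumptions that do not appear anywhere in this log, chosen because they happen to reproduce the logged value, and the function name `estimate_cost` is hypothetical.

```python
def estimate_cost(prompt_tokens: int, completion_tokens: int,
                  usd_per_m_prompt: float, usd_per_m_completion: float) -> float:
    """Estimate request cost in USD from token counts and per-million-token rates."""
    prompt_cost = prompt_tokens * usd_per_m_prompt / 1_000_000
    completion_cost = completion_tokens * usd_per_m_completion / 1_000_000
    return prompt_cost + completion_cost


# Values copied from the run metadata.
NATIVE_TOKENS_PROMPT = 115
NATIVE_TOKENS_COMPLETION = 1844
LOGGED_USAGE = 0.00371675

# ASSUMPTION: these rates are not stated in the log. They are illustrative
# values that happen to match the logged usage, not confirmed pricing.
ASSUMED_USD_PER_M_PROMPT = 0.25
ASSUMED_USD_PER_M_COMPLETION = 2.00

estimated = estimate_cost(
    NATIVE_TOKENS_PROMPT,
    NATIVE_TOKENS_COMPLETION,
    ASSUMED_USD_PER_M_PROMPT,
    ASSUMED_USD_PER_M_COMPLETION,
)
print(f"estimated: {estimated:.8f}  logged: {LOGGED_USAGE:.8f}")
```

If the assumed rates are right, the completion tokens (which include 1547 reasoning tokens) account for nearly all of the cost.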

Evaluation details

Result    Evaluator                     Details                          Meta Data
35.00%    Matches word count            n/a                              neededClean: false, words: 218
93.06%    Dialogue to Total Word Ratio  Ratio: 84.82%, Deviation: 5.18%  neededClean: false, wordsTotal: 224, wordsDialogue: 190
64.0314%  Overall
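
The ratio, deviation, and overall figures can be reproduced from the logged word counts. This is a minimal sketch, not the benchmark's actual evaluator code: the function names are hypothetical, the 90% target is taken from the prompt, and treating the overall score as the plain mean of the two evaluator scores is an inference from the numbers. How the 5.18% deviation maps to a 93.06% score, and how the 35.00% word-count score is derived, is not shown in the log and is not modeled here.

```python
def dialogue_ratio(words_dialogue: int, words_total: int) -> float:
    """Percentage of all words that fall inside dialogue."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return 100.0 * words_dialogue / words_total


def deviation_from_target(ratio_pct: float, target_pct: float) -> float:
    """Absolute distance, in percentage points, from the requested ratio."""
    return abs(target_pct - ratio_pct)


def mean_score(scores: list[float]) -> float:
    """Unweighted mean of evaluator scores (assumed aggregation rule)."""
    return sum(scores) / len(scores)


# Word counts from the evaluation details; 90% is the ratio the prompt asked for.
ratio = dialogue_ratio(words_dialogue=190, words_total=224)
deviation = deviation_from_target(ratio, target_pct=90.0)

# Evaluator scores as displayed (already rounded to two decimals).
overall = mean_score([35.00, 93.06])

print(f"Ratio: {ratio:.2f}%, Deviation: {deviation:.2f}%")  # Ratio: 84.82%, Deviation: 5.18%
print(f"Overall from rounded scores: {overall:.2f}%")       # Overall from rounded scores: 64.03%
```

The mean of the rounded scores gives 64.03%, which agrees with the reported 64.0314% to two decimals; the remaining digits presumably come from the unrounded dialogue-ratio score.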