Two issues surfaced on HOWOGE and similar sites:
1. Tiny icons/1x1 tracking pixels leaked through (e.g. image #5, 1.8 KB).
Added MIN_IMAGE_BYTES = 15_000 and MIN_IMAGE_DIMENSION = 400 (px, short
side); files below either threshold are dropped before saving.
Pillow already gives us the dimensions as part of the phash pass, so
the check adds no extra work.
2. Listings whose image URLs are opaque CDN hashes
(.../fileadmin/_processed_/2/3/xcsm_<hash>.webp.pagespeed.ic.<hash>.webp)
caused the LLM URL picker to reject every candidate, yielding 0 images
for legitimate flats. Fixes: (a) the prompt now explicitly instructs
Haiku to keep same-host /fileadmin/_processed_/-style URLs even when
the filename is unreadable, (b) if the model still returns an empty
set we fall back to the Playwright candidates without LLM filtering,
trusting the pre-filter instead of erasing the gallery.
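
Rough sketch of the two rules, for reviewers. The constants are the ones
added in this commit; the function names passes_size_filter and
select_image_urls are hypothetical and only illustrate the intended
behaviour, not the actual code paths:

```python
MIN_IMAGE_BYTES = 15_000      # from this commit
MIN_IMAGE_DIMENSION = 400     # px on the short side, from this commit


def passes_size_filter(num_bytes: int, width: int, height: int) -> bool:
    """Hypothetical helper: keep an image only if it clears both thresholds.

    width/height are assumed to be the dimensions Pillow already reported
    during the phash pass, so nothing is decoded again here.
    """
    if num_bytes < MIN_IMAGE_BYTES:
        return False
    return min(width, height) >= MIN_IMAGE_DIMENSION


def select_image_urls(candidates: list[str], picked: list[str]) -> list[str]:
    """Hypothetical helper: use the picker's choice unless it is empty.

    An empty result from the LLM picker falls back to the pre-filtered
    candidates instead of leaving the listing without a gallery.
    """
    return list(picked) if picked else list(candidates)
```

With these assumptions, a 1.8 KB icon is rejected on byte size alone, and
an empty picker result returns the full candidate list.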

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>