Link posts from thetyee.ca appear to be previewing the Cloudflare captcha challenge “Attention Required! | Cloudflare” prompt instead of actual content.
What it looks like (two examples - this seems to be a consistent problem for thetyee):
(from https://lemmy.ca/post/39053782)
(from https://lemmy.ca/post/39058782)
What it should look like (a different post):
My guess is that the rich link preview is generated in Lemmy’s backend, and Cloudflare thinks that the IP address of lemmy.ca’s host is full of bots.
No educated guess on the solution, though I’d guess that other Lemmy admins have seen this sort of thing too.
That means we’d keep running into stuff like this long term as the server’s IPs get flagged by various CDNs and whatnot. Oh well. Not much we can do unless one of us gets the work done and shepherds a PR into Lemmy mainline.
Lemmy mainline could also skip rich link preview generation if the linked page returns a non-2xx status. Have a few retries if link preview generation fails, and omit the preview entirely if retries are exhausted. I think the “Attention Required! | Cloudflare” prompt is associated with a 403 Forbidden status code. And also generate some logs of what addresses are being refused, so the Lemmy admins can reach out to the content owners and get their servers unblocked, maybe.
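To make the idea concrete, here is a rough sketch in Python of the retry-then-omit logic. This is not Lemmy's actual code (Lemmy is written in Rust), and every name in it (`build_preview`, `extract_title`, the `Fetcher` type, the `link_preview` logger) is made up for illustration. The fetcher is passed in as a function so the logic can be tried without touching the network.

```python
import logging
import re
import time
from typing import Callable, Optional, Tuple

logger = logging.getLogger("link_preview")  # hypothetical logger name

# Hypothetical fetcher type: takes a URL, returns (HTTP status, response body).
Fetcher = Callable[[str], Tuple[int, str]]


def extract_title(html: str) -> Optional[str]:
    """Pull the <title> text out of an HTML document, if present."""
    match = re.search(r"<title[^>]*>(.*?)</title>", html, re.IGNORECASE | re.DOTALL)
    return match.group(1).strip() if match else None


def build_preview(url: str, fetch: Fetcher, max_attempts: int = 3,
                  delay_seconds: float = 0.0) -> Optional[str]:
    """Return a preview title, or None if no attempt got a 2xx response."""
    for attempt in range(1, max_attempts + 1):
        status, body = fetch(url)
        if 200 <= status < 300:
            return extract_title(body)
        # Log every refusal so admins can see which sites are blocking the server.
        logger.warning("preview refused: url=%s status=%d attempt=%d/%d",
                       url, status, attempt, max_attempts)
        if attempt < max_attempts:
            time.sleep(delay_seconds)
    # Retries exhausted: omit the preview rather than show a challenge page.
    return None


if __name__ == "__main__":
    # Stand-in for a blocked site: always answers 403 with the challenge page.
    def blocked(url: str) -> Tuple[int, str]:
        return 403, "<title>Attention Required! | Cloudflare</title>"

    print(build_preview("https://thetyee.ca/", blocked))
```

With the always-403 fetcher the function returns `None`, so the post would just show up without a preview, and the log would have three warning lines naming the URL that refused us.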
I think that might introduce concerns around a client presenting modified info vs what everyone else would see. I’d imagine Facebook, Discord, and everyone else that generates link previews struggle with this problem.
Yes but people can put whatever info in the title, thumbnail, etc. so I imagine the previews are more about convenience than accuracy. I think accuracy is handled by the voting system first and moderation second. There are likely corner cases I’m not considering so yes, there likely would be concerns about that either way. I’m just thinking about it from a low-cost compromise perspective.
Yeah for sure they do and they have large pools of IPs that cost money we probably don’t want to spend. They also likely can directly talk to CDNs and IP list providers to allowlist their IPs. :D I used to work for a bit at the VPN side of a security company and the struggle for clean IPs was real. AWS solved that but it was very expensive compared to metal in smaller DCs.