It's absolutely sustainable. Just cache it. Done.
The bots scrape costly endpoints, like the entire edit history of every page on a wiki. You can't always just cache every possible generated page at once.
Of course you can. This is why people use CDNs.
Put the entire site on a CDN with a cache of 24 hours for unauthenticated users.
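At the origin, that mostly comes down to a Cache-Control header keyed off whether the visitor is logged in. A minimal sketch, assuming a Flask app (the framework and the session cookie name are stand-ins for whatever the wiki actually uses):

```python
from flask import Flask, request

app = Flask(__name__)

@app.route("/wiki/<path:page>")
def wiki_page(page):
    # Placeholder for an expensive render, e.g. a page's edit history.
    return f"rendered contents of {page}"

@app.after_request
def set_cache_headers(resp):
    # "session" is an assumed cookie name; adjust to the real auth scheme.
    if "session" in request.cookies:
        # Logged-in users may see personalized pages: don't cache shared copies.
        resp.headers["Cache-Control"] = "private, no-store"
    else:
        # Anonymous traffic: let the CDN hold the page for 24 hours.
        resp.headers["Cache-Control"] = "public, max-age=86400"
    return resp
```

The CDN does the rest: anything marked `public, max-age=86400` can be served from the edge without touching the origin again for a day.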
Cache size is limited: it typically only holds the most recently viewed pages. But these bots go through every single page on the website, even old ones that users never view. Since they send only one request per page, caching doesn't really help.
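A toy simulation of that failure mode (the cache size and page count here are made-up numbers, purely for illustration): an LRU cache sized for the popular pages gets essentially zero hits from a scraper that visits every page exactly once.

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU cache: evicts the least recently used key when full."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.store = OrderedDict()

    def get(self, key):
        if key in self.store:
            self.store.move_to_end(key)     # mark as recently used
            return True                     # cache hit
        self.store[key] = None              # simulate caching the rendered page
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)  # evict the oldest entry
        return False                        # cache miss

cache = LRUCache(capacity=1_000)
# A scraper walking 100,000 distinct pages once each: every lookup misses.
hits = sum(cache.get(f"/wiki/page-{i}") for i in range(100_000))
print(f"hit rate: {hits / 100_000:.1%}")  # 0.0%
```

Each page is cached after its one and only request, then evicted long before anyone asks for it again, so the cache absorbs none of the load.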
Cache size is definitely not an issue, especially for these companies using Cloudflare.
It is an issue for the open source projects discussed in the article.
I’m sure that if it were that simple, people would be doing it already…