Reddit locks down its public data in new content policy, says use now requires a contract

Abaixo de Cão@lemm.ee · 1 year ago

rockSlayer@lemmy.world · 1 year ago

Well, that’s part of the thing. Web scraping doesn’t get covered by policies. They could ban your IP or any accounts you have, but web scraping itself will always be acceptable. It’s why projects like NewPipe and Invidious don’t care about YouTube’s cease-and-desist letters.

werefreeatlast@lemmy.world · 1 year ago

“Oops, looks like this community hasn’t been reviewed. Log in if you still want to see the content.”

rockSlayer@lemmy.world · 1 year ago

Yeah, I’ve seen those pop-ups when trying to look something up. It sucks, but it isn’t a significant barrier to web scraping.

AeroLemming@lemm.ee · edit-2 8 months ago

deleted by creator

werefreeatlast@lemmy.world · 1 year ago

I tried that on my desktop. As long as you’re not actually logged in, you can’t see communities that are too small to have been reviewed, or that were judged too adult after a review.

AeroLemming@lemm.ee · edit-2 8 months ago

deleted by creator

AeroLemming@lemm.ee · edit-2 8 months ago

deleted by creator

folkrav · 1 year ago

Parsing absolutely comes with a lot more overhead. Many websites rely heavily on JS interactivity nowadays, so depending on the site, the HTML you get back from your HTTP request often doesn’t contain the full contents you’re looking for.
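
To illustrate the point about JS-driven pages, here is a minimal sketch using only Python’s standard library. The page markup, the `comments` id, and the `/api/comments` URL are all made up for the example and don’t describe any real site. It parses the same hypothetical page twice: once as a plain HTTP request would receive it, and once as it might look after the browser has run the script.

```python
from html.parser import HTMLParser

# Hypothetical page as returned by a plain HTTP request: the comment list
# is filled in by JavaScript later, so the container is still empty.
RAW_HTML = """
<html><body>
  <h1>Thread title</h1>
  <div id="comments"></div>
  <script>
    fetch("/api/comments").then(r => r.json()).then(render);
  </script>
</body></html>
"""

# The same hypothetical container after a browser has executed the script.
RENDERED_HTML = '<div id="comments"><p>First!</p><p>Second</p></div>'

# Tags that never get a closing tag; ignored so the depth count stays right.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class TextInId(HTMLParser):
    """Collect the text found inside the element with a given id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0  # greater than 0 while inside the target element
        self.text = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
        elif dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth and data.strip():
            self.text.append(data.strip())


def extract_text(html, target_id):
    parser = TextInId(target_id)
    parser.feed(html)
    return parser.text


print(extract_text(RAW_HTML, "comments"))       # nothing to scrape yet
print(extract_text(RENDERED_HTML, "comments"))  # content exists only after JS ran
```

In practice that gap usually means either driving a headless browser or finding the JSON endpoint the page itself calls, both of which add work compared to a documented API.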

AeroLemming@lemm.ee · edit-2 8 months ago

deleted by creator

krippix@feddit.de · 1 year ago

In what way?

HTML definitely carries more overhead than JSON if you only care about the data.
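
A rough sketch of what that overhead looks like, using only Python’s standard library. The record, the field names, and the HTML class names are invented for the example. The same three fields are encoded once as JSON and once as HTML markup, then read back.

```python
import json
from html.parser import HTMLParser

# Hypothetical comment record; field names are made up for illustration.
record = {"author": "example_user", "score": 42, "body": "Hello"}

# The same data as an API might return it...
as_json = json.dumps(record)

# ...and as it might appear embedded in page markup.
as_html = (
    '<div class="comment">'
    '<span class="author">example_user</span>'
    '<span class="score">42</span>'
    '<p class="body">Hello</p>'
    "</div>"
)


class FieldsByClass(HTMLParser):
    """Map the class attribute of each wanted element to its text."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = set(wanted)
        self.current = None  # class of the element currently open
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        self.current = dict(attrs).get("class")

    def handle_endtag(self, tag):
        self.current = None

    def handle_data(self, data):
        if self.current in self.wanted:
            self.fields[self.current] = data


def parse_html_record(html):
    parser = FieldsByClass(record.keys())
    parser.feed(html)
    return parser.fields


from_json = json.loads(as_json)          # one call, types preserved
from_html = parse_html_record(as_html)   # custom parser, everything is a string

print(len(as_json), len(as_html))
print(from_json)
print(from_html)
```

Besides the extra bytes, the HTML route needs a site-specific parser, loses the type of `score` (it comes back as a string), and breaks whenever the markup changes.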

AeroLemming@lemm.ee · edit-2 8 months ago

deleted by creator

Reddit locks down its public data in new content policy, says use now requires a contract | TechCrunch