• 0 Posts
  • 24 Comments
Joined 9 months ago
Cake day: March 28th, 2025

  • That’s true. Scraping is a gold mine, for those who don’t know. I worked for a place that crawls the internet and beyond (it also fetches some internal dumps we paid for). There is no chance a zip bomb would crash the workers, as there are strict timeouts and smell tests (even if one does get through, it will crash an ECS task at worst and we would be alerted to fix that within a short time). We were as honest as it gets though: following GDPR, honoring the robots file, no spiders or scanners allowed, only the home page to extract some insights.

    I am aware of some big name EU non-software companies very interested in keeping an eye on some key things that are only possible with scraping.
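
A minimal sketch of the kind of smell test mentioned above, assuming gzip-compressed responses: decompress in small steps and give up once the output passes a cap, so a tiny payload cannot balloon in memory. This is not the crawler described in the comment; `bounded_gunzip`, `DecompressedTooLarge`, and the 10 MiB cap are hypothetical names and values chosen for illustration.

```python
import gzip
import zlib

# Hypothetical cap; a real crawler would tune this to its page-size expectations.
MAX_DECOMPRESSED_BYTES = 10 * 1024 * 1024


class DecompressedTooLarge(Exception):
    """Raised when the decompressed output exceeds the allowed size."""


def bounded_gunzip(payload: bytes,
                   limit: int = MAX_DECOMPRESSED_BYTES,
                   chunk_size: int = 64 * 1024) -> bytes:
    """Decompress gzip data, aborting once the output grows past `limit` bytes."""
    # Adding 16 to wbits tells zlib to expect a gzip header and trailer.
    decomp = zlib.decompressobj(zlib.MAX_WBITS | 16)
    out = bytearray()
    data = payload
    while not decomp.eof:
        # max_length bounds how much output a single call may produce.
        chunk = decomp.decompress(data, chunk_size)
        if not chunk and not decomp.unconsumed_tail:
            break  # no progress possible: truncated or empty input
        out += chunk
        if len(out) > limit:
            raise DecompressedTooLarge(f"output exceeded {limit} bytes")
        # Input that was not consumed yet must be fed back in.
        data = decomp.unconsumed_tail
    return bytes(out)


if __name__ == "__main__":
    print(bounded_gunzip(gzip.compress(b"hello")))
    try:
        bounded_gunzip(gzip.compress(b"a" * 1_000_000), limit=10_000)
    except DecompressedTooLarge as exc:
        print("rejected:", exc)
```

A size cap like this only covers one failure mode; the strict timeouts mentioned in the comment would still be enforced separately, for example on the HTTP request itself or at the task level.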


  • You are not wrong. But there are things you can do to make a point. Make Reddit a 2nd class citizen and drive people to Lemmy, Mastodon and the others. Like adding posts with no comments, just a relay bot, … Make it clear.

    Same with GitHub, it’s a mirror of my Gitea instance. You can see stuff, but you have to move somewhere else to contribute and report issues. It’s not a terrible thing to use these proprietary services and yet make them 2nd class citizens.