As I was browsing Lemmy and the fediverse at large, this question kept popping into my head.

Since multimedia files have a much bigger footprint than raw text, I worry that, as time goes on, massive resources will be needed to keep up with all the data coming in.

I do wonder whether the instances have taken the cloud route and just decided to put all of it in something like AWS S3. Or maybe they use self-hosted object storage with something like MinIO?

  • Trifictional
    28 upvotes · 1 year ago

    I think this could be a ticking DoS time bomb.

    If someone manages to spam-upload massive files to the largest Lemmy instances, it could wipe out a ton of smaller ones.

    Not to mention that, scalability-wise, this seems like a nightmare… eventually the largest Lemmy instances will have petabytes of media data with hundreds of GB coming in per day, giving other instances no chance to sync with them.

    I think the system architecture needs a significant review. This won’t scale.
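
To put rough numbers on the growth concern above, here is a back-of-the-envelope sketch. The ingest rates and the instance count are made-up illustrative values, not measurements from any Lemmy instance, and 1 PB is taken as 1,000,000 GB. The function names are hypothetical.

```python
GB_PER_PB = 1_000_000  # assumption: decimal units, 1 PB = 1,000,000 GB


def days_to_reach(target_gb: float, ingest_gb_per_day: float) -> float:
    """Days until cumulative uploads reach target_gb at a constant ingest rate."""
    if ingest_gb_per_day <= 0:
        raise ValueError("ingest rate must be positive")
    return target_gb / ingest_gb_per_day


def federated_footprint_gb(media_gb: float, caching_instances: int) -> float:
    """Total storage if the origin and every caching instance each keep a full copy."""
    return media_gb * (1 + caching_instances)


if __name__ == "__main__":
    # Illustrative rates only; real instances may be far below or above these.
    for rate in (100, 500):
        days = days_to_reach(GB_PER_PB, rate)
        print(f"{rate} GB/day -> 1 PB after {days:,.0f} days (~{days / 365:.1f} years)")
```

The second function shows why duplication matters more than raw ingest: under the full-copy assumption, the network-wide footprint grows linearly with the number of instances that cache the same media.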

    • Booty@lemmy.world
      5 upvotes · 1 year ago

      I agree. It’s also a tremendous waste of resources. I’m all for redundancy (like CDNs), but this seems incredibly poorly thought out. If Lemmy (as a whole) ever scales to the size of other social media, the space requirements will start to become unreasonable.

      Why wouldn’t something like symlinks be implemented? Not saying specifically use symlinks, but there has to be a similar, better way.

      • laenurd@lemmy.lemist.de
        6 upvotes · edited · 1 year ago

        The obvious way would be to just not cache content locally and always link to the source instance. While this would concentrate the strain immensely, it would also greatly decrease the storage space used by all other instances.

        There might also be other viable alternatives such as using a CDN and having it selectively cache content which is requested often etc.

        ~~As of now, Lemmy does not support either, though.~~

        Edit: I want to clarify that I was partially wrong - Lemmy only locally caches content which is hosted on outside sites. It does (should?) not cache content that was directly uploaded to a Lemmy instance and just embeds the source media.
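
The "selectively cache content which is requested often" idea from the comment above could look roughly like the sketch below. This is not how Lemmy or any particular CDN works; `SelectiveMediaCache`, its parameters, and the `"local"`/`"origin"` return values are all hypothetical names chosen for illustration. Media is served from the source instance until it has been requested a minimum number of times, and the least recently used items are evicted once a byte budget is exceeded.

```python
from collections import Counter, OrderedDict


class SelectiveMediaCache:
    """Cache a media item locally only after it has been requested often enough."""

    def __init__(self, min_requests: int, max_bytes: int) -> None:
        self.min_requests = min_requests  # requests needed before caching locally
        self.max_bytes = max_bytes        # total local storage budget
        self.hits: Counter = Counter()    # url -> number of requests seen
        self.cached: OrderedDict = OrderedDict()  # url -> size, oldest first
        self.used_bytes = 0

    def request(self, url: str, size_bytes: int) -> str:
        """Return "local" if served from the cache, otherwise "origin"."""
        self.hits[url] += 1
        if url in self.cached:
            self.cached.move_to_end(url)  # mark as most recently used
            return "local"
        # Not cached yet: this request still goes to the source instance,
        # but popular items are stored so that later requests stay local.
        if self.hits[url] >= self.min_requests and size_bytes <= self.max_bytes:
            self._store(url, size_bytes)
        return "origin"

    def _store(self, url: str, size_bytes: int) -> None:
        # Evict least recently used items until the new one fits the budget.
        while self.used_bytes + size_bytes > self.max_bytes:
            _, evicted_size = self.cached.popitem(last=False)
            self.used_bytes -= evicted_size
        self.cached[url] = size_bytes
        self.used_bytes += size_bytes
```

With `min_requests=1` this degrades to "cache everything", and with a very high threshold it approaches "always link to the source instance", so the threshold is the knob that trades origin load against local storage.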

    • laenurd@lemmy.lemist.de
      3 upvotes · 1 year ago

      Agree. If I’m not mistaken, you can only disable the caching of sensitive (NSFW) content on your instance by disabling NSFW in general. This doesn’t go for SFW content though.

      It shouldn’t be very hard to do this for all content, though. If I find the time, I might look into implementing this.

    • AFK BRB Chocolate@lemmy.world
      3 upvotes · 1 year ago

      I feel like the developers should spend some time adding features to reduce malicious activity. They could provide settings to the admins to limit the number of things one user can do in a day, like number of images, total size of images, number of communities created, etc. Sure, someone could create multiple accounts, but it would still make it harder to attack Lemmy.
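
As a rough illustration of the per-user limits suggested above, here is a minimal sketch of a daily upload quota. It is not an existing Lemmy setting or API; `UploadLimits`, `DailyUploadQuota`, and `try_upload` are hypothetical names, and usage is kept in memory only, whereas a real instance would need persistent storage.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class UploadLimits:
    """Admin-configurable limits (hypothetical settings)."""
    max_images_per_day: int
    max_bytes_per_day: int


class DailyUploadQuota:
    """Track each user's uploads per calendar day and reject anything over the limits."""

    def __init__(self, limits: UploadLimits) -> None:
        self.limits = limits
        # (user, day) -> [image count, total bytes]
        self._usage = defaultdict(lambda: [0, 0])

    def try_upload(self, user: str, size_bytes: int, today: date) -> bool:
        """Record the upload and return True if it fits the user's quota for today."""
        count, used = self._usage[(user, today)]
        if count + 1 > self.limits.max_images_per_day:
            return False
        if used + size_bytes > self.limits.max_bytes_per_day:
            return False
        self._usage[(user, today)] = [count + 1, used + size_bytes]
        return True
```

Keying usage by `(user, day)` means quotas reset automatically at the start of each day. As the comment notes, this does not stop someone with many accounts, so it would likely need to be paired with limits per IP address or per registration.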

    • ndguardian@lemmy.studio
      1 upvote · 1 year ago

      I mentioned something akin to this possibility a couple days ago, but was told this likely wasn’t the case. I’ll have to see if I can dig up the argument for that.