There are a few different options for how to host the archives.

Here’s what /r/DataHoarder is doing with redarc.

We could import it here, put it in a separate community on this server, or host it with redarc on a subdomain. I’m open to pretty much any of them.

I’ll put a survey up for a vote once I finish that server again, but I thought a discussion would be good to have before that goes up.

Thoughts??

  • possibly a cat@lemmy.ml
    1 year ago

    So you would put it under a different, archival community? That doesn’t sound like a bad idea. We don’t know that it will last on reddit, and it opens up the content to be accessed outside of reddit. It could draw users to FC and to lemmy.

    On the other hand, starting fresh can be nice. The archive is inseparable from reddit and its influence. One example: all of the posts in the archive abide by reddit’s admin rules, so they may not reflect the opinions of the mods/maintainers/community. So this could also be an opportunity to hit reset.

    I think I lean toward uploading and hosting it.

    • ProfessionalHandJob@lemmy.beyondcombustion.net (OP)
      1 year ago

      Either a separate archive community, or straight into vaporents on BeyondCombustion, plus a bot to scrape new posts from the Reddit RSS feed and repost them here (under a bot account).

      I think it would be more engaging for Lemmy to have posts pulled in, at the beginning at least, until conversation takes place more naturally.

      Absolutely going to host it 💯. Just trying to decide exactly what format… If it works well, I’d be willing to do the same for other subreddits. I’ll be documenting the steps, but needing PostgreSQL access on the Lemmy server is a non-starter for most subreddits/people/mods, unless they have their own infrastructure too.
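A rough sketch of the "scrape new posts from the RSS feed" half of that bot idea, for illustration only. It assumes the feed is served as Atom XML and that each entry carries an `id`, a `title`, and a `link` with an `href`; the function name `new_entries` and the sample feed are made up for this example, and the step that actually reposts to Lemmy under a bot account is left out.

```python
import xml.etree.ElementTree as ET

# Atom namespace prefix; this assumes the subreddit feed is Atom, not RSS 2.0.
ATOM = "{http://www.w3.org/2005/Atom}"


def new_entries(feed_xml, seen_ids):
    """Return (id, title, link) for feed entries not already in seen_ids.

    seen_ids is updated in place, so calling this again with the same
    feed does not return the same posts twice.
    """
    root = ET.fromstring(feed_xml)
    fresh = []
    for entry in root.iter(ATOM + "entry"):
        entry_id = entry.findtext(ATOM + "id")
        if entry_id is None or entry_id in seen_ids:
            continue
        title = entry.findtext(ATOM + "title", default="")
        link_el = entry.find(ATOM + "link")
        link = link_el.get("href", "") if link_el is not None else ""
        fresh.append((entry_id, title, link))
        seen_ids.add(entry_id)
    return fresh


# Hypothetical two-entry feed standing in for a real download.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>t3_example1</id>
    <title>First post</title>
    <link href="https://www.reddit.com/r/example/comments/example1/"/>
  </entry>
  <entry>
    <id>t3_example2</id>
    <title>Second post</title>
    <link href="https://www.reddit.com/r/example/comments/example2/"/>
  </entry>
</feed>"""

if __name__ == "__main__":
    seen = set()
    for entry_id, title, link in new_entries(SAMPLE_FEED, seen):
        print(entry_id, title, link)
```

A real bot would fetch the feed on a timer and persist `seen_ids` somewhere (a file or a small table) so a restart does not repost everything.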

      • possibly a cat@lemmy.ml
        1 year ago

        I like the idea of a separate community but at the same time there is definitely value in the continuity of transitioning the sub (I assume reddit will kill it sooner or later tbh).

        I think I get an idea of the scope of the task. I agree it’s probably too intensive for most communities, but I’m sure others would be interested. What’s involved with getting the archive? Is it scraped, or something you can download as a mod? I can think of a community or two that might appreciate a new home away from their mods…

        • ProfessionalHandJob@lemmy.beyondcombustion.net (OP)
          1 year ago

          Also, if there are communities out there that you’re interested in helping do this, they have to either:

          1. have their own server

          or

          2. get direct access to the PostgreSQL database on whatever new server they want to set up a community in.

          I’m sure there are other ways to import it, like scripting something that literally re-posts everything through API calls to Lemmy, or, ugh, clicking through the web GUI lmao. Without that database access I’d consider it too much work/hassle to be practical.
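To give a sense of what the "re-post everything through API calls" route would involve, here is a minimal sketch that maps one archived submission to the request body for Lemmy's create-post endpoint (`POST /api/v3/post`). It assumes the archive stores submissions as dicts with Reddit-style fields (`title`, `selftext`, `url`, `author`, `permalink`, `is_self`); the helper name `to_lemmy_post` and the attribution line are made up for this example, and authentication and the HTTP call itself are omitted because they differ between Lemmy versions.

```python
def to_lemmy_post(submission, community_id):
    """Build a create-post request body from one archived Reddit submission.

    submission is assumed to be a dict with Reddit-style keys; any key
    may be missing, so everything is read defensively.
    """
    title = (submission.get("title") or "(untitled)").strip()
    author = submission.get("author") or "[deleted]"
    permalink = submission.get("permalink") or ""

    body_parts = []
    text = submission.get("selftext") or ""
    if text:
        body_parts.append(text)
    # Keep a pointer back to the original, since the repost comes from a bot.
    body_parts.append(
        f"Originally posted by u/{author} on Reddit: "
        f"https://www.reddit.com{permalink}"
    )

    payload = {
        "name": title,                # Lemmy calls the post title "name"
        "community_id": community_id,
        "body": "\n\n".join(body_parts),
    }
    # Only link posts get a url; self posts would just point back at Reddit.
    if not submission.get("is_self") and submission.get("url"):
        payload["url"] = submission["url"]
    return payload


if __name__ == "__main__":
    example = {
        "title": " Hello ",
        "selftext": "",
        "author": "someone",
        "permalink": "/r/example/comments/abc/hello/",
        "url": "https://example.com/x",
        "is_self": False,
    }
    print(to_lemmy_post(example, community_id=7))
```

Even with a helper like this, every repost shows up under the bot account with the current date, and comments would need a second pass, which is a big part of why this route is so much more hassle than writing to the database.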

        • ProfessionalHandJob@lemmy.beyondcombustion.net (OP)
          1 year ago

          Agree they will at some point nuke that subreddit.

          Pretty much everything has been archived that could have been, since the API goes dark in ~2 days.

          There are details of what apps were used, where to download the archives directly, links to torrents, and such in a post on /r/DataHoarder. I’m collecting links/text/guides from it over in our Gitea instance, as well as importing the projects used for this effort.

          So far, I’ve downloaded roughly 5-6 TB of archives through the links in that post, others on /r/DataHoarder, the-eye.eu, etc.

          They go back allll the way to 2005… It’s just text though, no media, unless the media was a link to Imgur or YouTube or something, in which case the links are in the posts.

          There are a couple of bots/scripts that will repost new stuff from RSS feeds going forward.

          To inject directly from those backups into a Lemmy PostgreSQL database, I was using this tool, RedditLemmyImporter, which actually looks like it was originally made by the Lemmy developer dessalines to move the r/GenZhou subreddit into lemmy.ml/lemmygrad.ml, but was forked very early to try to obscure that… So I do feel a bit dirty about that, and more so about Lemmy in general from some of the things I’ve seen from dessalines themselves.

          TBH… I think it’s important to make a fork of Lemmy itself, and to really, really comb through this code base. Not sure if this is a long-term solution if dessalines is still the head dev, which is one of the reasons I’m setting up other forums on this server as well. Lemmy and non-corp/federated social media is good, but the more I read of their direct words, actions, history, and code like this importer, the less I like the stewards of the Lemmy code base.