There are a few different options for how to host the archives.
Here’s what /r/datahoarder is doing with redarc
We could import it here, put it in a separate community on this server, or host it with redarc on a subdomain. It’s pretty much whatever.
I’ll put a survey up for a vote once I finish that server again, but I thought a discussion would be good to have before that goes up.
Thoughts??
So you would put it under a different, archival community? That doesn’t sound like a bad idea. We don’t know that it will last on reddit, and it opens up the content to be accessed outside of reddit. It could draw users to FC and to lemmy.
On the other hand, starting fresh can be nice. The archive is inseparable from reddit and their influence. One example: all of the posts in the archive abide by reddit’s admin rules, and they may not reflect the opinions of the mods/maintainers/community. So this could also be an opportunity to hit reset.
I think I lean toward uploading and hosting it.
Either a separate archive community or just right into vaporents on BeyondCombustion + a bot to scrape new posts from the Reddit RSS feed and repost them here (but under a bot account).
I think it would be more engaging for lemmy to have posts pulled in at the beginning at least, until conversation takes place more naturally.
Absolutely going to host it 💯. Just trying to decide exactly what format… If it works well I’d be willing to do the same for other subreddits. I’ll be documenting the steps, but needing PostgreSQL access on the lemmy server is a non-starter for most subreddits/people/mods, unless they have their own infrastructure too.
I like the idea of a separate community but at the same time there is definitely value in the continuity of transitioning the sub (I assume reddit will kill it sooner or later tbh).
I think I get an idea of the scope of the task. I agree it’s probably too intensive for most communities, but I’m sure others would be interested. What’s involved with getting the archive? Is it scraped, or something you can download as a mod? I can think of a community or two that might appreciate a new home away from their mods…
Also, if there are communities out there that you’re interested in helping do this they have to either
or
I’m sure there are other ways to import it, like scripting something that literally re-posted everything through API calls to Lemmy or ugh clicking through the web GUI lmao. Without that database access I’d consider it too much work/hassle to be practical.
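For anyone curious what the scripted route would involve, here is a minimal sketch of just the mapping step: turning one archived submission into the kind of payload a post-creation API call would need. The input keys (`title`, `selftext`, `url`, `author`, `permalink`, `is_self`) assume a Pushshift-style dump, and the output keys (`name`, `community_id`, `body`, `url`) and the 200-character title cap are my assumptions about what Lemmy expects, so check both against the actual archive and API docs. `to_lemmy_post` is a hypothetical helper, not part of any existing tool.

```python
def to_lemmy_post(submission: dict, community_id: int) -> dict:
    """Map one archived Reddit submission to a post payload (assumed field names)."""
    # Assumed title cap; adjust to whatever the target instance enforces.
    title = (submission.get("title") or "(untitled)")[:200]

    # Credit the original author, since everything gets reposted by one account.
    credit = f"Originally posted by u/{submission.get('author', '[deleted]')} on Reddit"
    permalink = submission.get("permalink")
    if permalink:
        credit += f": https://www.reddit.com{permalink}"

    text = submission.get("selftext") or ""
    body = f"{text}\n\n{credit}".strip()

    payload = {"name": title, "community_id": community_id, "body": body}

    # Link posts keep their outbound URL; self posts only carry text.
    if not submission.get("is_self") and submission.get("url"):
        payload["url"] = submission["url"]
    return payload
```

Each payload would still have to be sent one request at a time, with authentication and rate limiting on top, which is a big part of why this route is so much slower than a direct database import.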
Agree they will at some point nuke that subreddit.
Pretty much everything has been archived that could have been, since the API goes dark in ~2 days.
There are details of which apps were used, where to download the archives directly, and links to torrents and such in a post on /r/DataHoarder. I’m collecting links/text/guides from it over in our Gitea instance, as well as importing the projects used for this effort.
So far, there’s roughly 5-6 TB of archives I’ve downloaded through the links in that post, others on /r/DataHoarder, the-eye.eu, etc.
They go back allll the way to 2005… It’s just text though, no media. Unless the media was a link to Imgur or YouTube or something, then the links are in the posts.
There are a couple of bots/scripts that will repost new stuff moving forward from RSS feeds.
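As a rough illustration of what such a bot does at its core (not the code of any of the existing bots), here is a minimal sketch that parses an Atom-style feed and returns only the entries it has not seen before. It assumes the feed is Atom XML; `new_entries`, `SAMPLE_FEED`, and the sample IDs and URLs are made up for the example, and fetching the feed and posting to Lemmy are left out.

```python
import xml.etree.ElementTree as ET

ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Tiny stand-in feed so the sketch runs without network access.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>t3_aaa</id>
    <title>First post</title>
    <link href="https://example.com/a"/>
  </entry>
  <entry>
    <id>t3_bbb</id>
    <title>Second post</title>
    <link href="https://example.com/b"/>
  </entry>
</feed>"""


def new_entries(feed_xml: str, seen_ids: set[str]) -> list[dict]:
    """Return entries whose <id> is not in seen_ids, and record them as seen."""
    root = ET.fromstring(feed_xml)
    fresh = []
    for entry in root.findall("atom:entry", ATOM_NS):
        entry_id = entry.findtext("atom:id", default="", namespaces=ATOM_NS)
        if not entry_id or entry_id in seen_ids:
            continue  # skip malformed or already-reposted entries
        link = entry.find("atom:link", ATOM_NS)
        fresh.append({
            "id": entry_id,
            "title": entry.findtext("atom:title", default="", namespaces=ATOM_NS),
            "url": link.get("href", "") if link is not None else "",
        })
        seen_ids.add(entry_id)
    return fresh
```

A real bot would persist `seen_ids` between runs (a file or small database) so a restart does not repost everything already in the feed.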
To inject directly from those backups into a Lemmy PostgreSQL database, I was using a tool called RedditLemmyImporter. It actually looks like it was originally made by the lemmy developer dessalines to move the r/GenZhou subreddit into lemmy.ml/lemmygrad.ml, but it was forked very early to try to obscure that… so I do feel a bit dirty about using it, and more so about Lemmy in general from some of the things I’ve seen from dessalines themselves.
TBH… I think it’s important to make a fork of Lemmy itself, and to really, really comb through this code base. I’m not sure Lemmy is a long-term solution if dessalines is still the head dev, which is one of the reasons I’m setting up other forums on this server as well. Lemmy and non-corp/federated social media are good, but the more I read of their direct words, actions, history, and code like this importer, the less I like the stewards of the lemmy code base.