- cross-posted to:
- [email protected]
I have created some software that is capable of synchronising posts from Reddit to Lemmy. It’s still a little rough around the edges, but it works like this:
People can request new subreddits to be mirrored on [email protected]. A bot (open source) will monitor the threads there, and if it finds a new request for a subreddit, it will make a new community on the Lemmit server, and add it to its monitored list. It will then make periodic checks to see if any new posts (it doesn’t copy any comments) have been posted on reddit, and copy those over.
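The request-and-sync loop described above could be sketched roughly as follows. This is only an illustration, not the bot's actual code: the `Mirror` class, its method names, and the shape of the post dictionaries are all assumptions, and fetching is left to a caller-supplied function.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable


@dataclass
class Mirror:
    """Hypothetical tracker for mirrored subreddits and already-copied posts."""
    monitored: set[str] = field(default_factory=set)
    seen: set[str] = field(default_factory=set)

    def handle_request(self, subreddit: str) -> bool:
        """Add a requested subreddit to the monitored list; True if it was new."""
        name = subreddit.strip().removeprefix("/r/").removeprefix("r/").lower()
        if not name or name in self.monitored:
            return False
        self.monitored.add(name)
        return True

    def sync_once(self, fetch_new_posts: Callable[[str], Iterable[dict]]) -> list[dict]:
        """One periodic check: copy posts (never comments) not seen before."""
        copied = []
        for name in sorted(self.monitored):
            for post in fetch_new_posts(name):
                if post["id"] in self.seen:
                    continue  # already mirrored on an earlier pass
                self.seen.add(post["id"])
                copied.append({"community": name, **post})
        return copied
```

Calling `sync_once` on a timer with the same fetch function only ever yields posts that have not been copied before, which is the behaviour the description implies.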
Users can then subscribe to those communities from their own Lemmy instance, and from there federation will pick it up. Or at least, that’s the theory. At the moment, federation is not working awesomely, and that is where my lack of fediverse knowledge comes in. Maybe it needs more time, or something is not set up properly - I don’t know.
Furthermore: registrations on this server are closed. The point of this service is not to become a community on its own, but to deliver, ehh, “original” content to all the rest of the Fediverse while it’s going through a ramp-up phase. Besides, the instance is running on a pretty small VPS, and I’d rather have this thing manage itself. There is a [email protected] community for further questions about the project itself though, in case people want to discuss it further.
So ehm… Let me know what you think :)
This raises a question: if a bot creates a new community to replace subreddits, who will mod those communities to ensure that there’s no bigotry, trolls, transphobia, homophobia etc. running wild in the comments? Who will manage these?
Well, as the bot is the moderator, and as the person running the bot is responsible for it, I suppose that’s technically the answer. But it’s an excellent question if there are hundreds of communities created and people start posting comments in them. The easy workaround is for the bot to set each new community to read-only (by checking ‘only moderators can post’). But, that would be a bit unfortunate as then that limits opportunities to easily chat about it. I suppose cross-posting by someone that wants to comment on it is a solution.
It was just a matter of time before something like this showed up.
I’m sure there will be a number of people that won’t be a fan of this but I think it’s a pretty innovative way to help the chicken-and-egg problem of early adoption. (No users because there’s no content, no content because there’s no users).
Very smart to have it limited to one bot on one instance to make it easy to block (or de-federate) for those that don’t like it, but I do.
Interesting idea! I have some thoughts if you’re open to feedback:
Furthermore: registrations on this server are closed. The point of this service is not to become a community on its own, but to deliver, ehh, “original” content to all the rest of the Fediverse while it’s going through a ramp-up phase.
Have you considered moderation? These mirrored communities on lemmit.online will still be getting comments from all over the federated network, and if you’re the only user and sole moderator of every community, then it might get quite overwhelming!
Besides, the instance is running on a pretty small VPS, and I’d rather have this thing manage itself.
Just in case you’re not aware, your instance will need to be able to handle:
- Pushing out posts and comments to all other instances in the network
- Accepting comments and votes from subscribers on any instance from the network
A small VPS might not be able to handle that!
It will then make periodic checks to see if any new posts (it doesn’t copy any comments) have been posted on reddit, and copy those over.
How are you planning to deal with API limits from Reddit? Without paying, at most you’ll be able to make 6000 requests per hour, which means that you’ll only be able to get new posts from the last hour for up to 6000 subreddits. It might seem like a big number, but consider that there are (according to some old posts online) over 100,000 active subreddits.
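To make that arithmetic concrete, here is a small sketch. It assumes one request per subreddit per pass, which is a simplification, and the function name is made up for illustration.

```python
def hours_per_full_pass(subreddits: int, requests_per_hour: int,
                        requests_per_subreddit: int = 1) -> float:
    """Hours needed to check every subreddit once under a fixed hourly budget."""
    return subreddits * requests_per_subreddit / requests_per_hour


# 6000 subreddits fit exactly into one hour at 6000 requests per hour.
print(hours_per_full_pass(6_000, 6_000))               # 1.0
# 100,000 subreddits would take well over half a day per pass.
print(round(hours_per_full_pass(100_000, 6_000), 1))   # 16.7
```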
Interesting idea! I have some thoughts if you’re open to feedback:
Always!
Have you considered moderation? These mirrored communities on lemmit.online will still be getting comments from all over the federated network, and if you’re the only user and sole moderator of every community, then it might get quite overwhelming!
I have, and I hope it won’t be a problem ;) I’m a software engineer and, as mentioned above, have little interest in managing people outside of work :P If anyone wants to become a moderator, they’re free to request it.
A small VPS might not be able to handle that
We’ll see how well it does. I don’t mind spending a little money on this (few dozen €/$ per month), if it takes off. In the end though, it’s more meant as a kickstart for Lemmy content than anything else.
How are you planning to deal with API limits from Reddit?
HA! By not using the API. For starters, because someone-who-isnt-me would like to browse NSFW content. I do a bit of client-side throttling between requests, which I hope will keep me under the radar. But it’s mostly based on rss for the subreddit overview, and scraping for the individual posts.
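For readers wondering what "client-side throttling" can look like, here is a minimal sketch. It is not the bot's actual code: the `Throttle` class and the interval are assumptions chosen for illustration. It simply enforces a minimum gap between consecutive requests.

```python
import time


class Throttle:
    """Hypothetical helper that spaces out calls by at least min_interval seconds."""

    def __init__(self, min_interval: float, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock      # injectable so the logic can be tested without waiting
        self._sleep = sleep
        self._last = None        # time at which the previous request was released

    def wait(self) -> float:
        """Block until the minimum gap has passed; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self.min_interval - (now - self._last))
            if delay > 0:
                self._sleep(delay)
        self._last = now + delay
        return delay
```

A scraper would call `throttle.wait()` right before each HTTP request. Whether a fixed gap like this is enough to stay unnoticed is something only practice will show.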
In the end… we’ll just have to see how it goes.
The developer isn’t using the API. They’re scraping, according to their response to my question above. However, the moderation question is a really good point. The easy workaround for this is to set every new community to ‘only moderators can post’, which makes the content read-only.
So posts from reddit are going to flood the All feed on lemmy?
Yups. It’s all done by one bot though, so you’ll just have to block that to get rid of them.
Oh good! Thank you so much for the reply. It’s brilliant dev work on your end, and certainly nothing I’d be able to pull off as I don’t have the brain capacity, but I’m just over Reddit at this point and don’t want to watch them spiral. Thank you for making sure there’s a way to opt out!
I’m absolutely with you on that one. If anything, part of the reason I wanted this was to have some fresh content that I knew I was going to miss after July 1st. Stuff like [email protected] for example. There’s a bit too much reddit-circlejerk going on right now, even here imho.
That sounds like a quick way to get defederated from everywhere
If that’s what happens, that’s what happens. ¯\_(ツ)_/¯
I’m just here to offer a service for people who Do like it.
I kinda get what you’re saying, but not really. Why would adding more content such as this lead to an instance wanting to defederate from that instance? There’s no users from Reddit there (and thus no comments from them) to troll, spam, or break the rules. It just adds posts.
Defederation isn’t a bad thing, it’s just curation. Especially in this case where you’re not breaking any human-human connections.
I’d still prefer something along the lines of Masto’s silencing, which would get rid of instances from global feeds yet still allow follows and interactions, but upstream already has too much on their plate and I’m sure given the time something similar would be implemented.
nice work!
Cool idea. I’m not sure why, but I can’t seem to get my instance to see lemmit.online, even when searching for it.
Okay, just tried it out. Added /r/bestof and it’s working. Very cool!
I’m guessing this scrapes? Otherwise it’ll stop working when the API changes happen on July 1.
Yups. Combination of scraping and rss. With a bit of client-side throttling thrown in to stay under the radar.
Looks great so far. I like how the posts that are pulled over have both the link to the content from the original Reddit post as well as a link to the Reddit post itself.
Is there a possibility to release just an RSS bot? I’d love to have certain RSS feeds from various sources auto post to my community. I’m finding it difficult to find options for lemmy atm
I think @[email protected] wrote something to that effect (I’m still a mess with making proper links on here :/)
And I also found something else that was written in java (not javascript).
The downside from using the RSS feed is that it doesn’t contain the whole body, which my scraper does fetch.
Can’t subscribe from my account on lemmy.world. Any idea what it could be?
The communities don’t show up in search and if I type the community directly I get an error 404: couldn’t find community. Example: https://lemmy.world/c/[email protected]
I’m having similar issues on lemmy.ml, not sure what causes it.
Well… There is the thing that I had this service running on another host at some point, it got federated with quite a few instances (including lemmy.ml and some others), and I had to reinstall it. Maybe those instances have a hard time accepting the new installation?
I’m kinda hoping it will sort itself out over time, maybe those instances just need to restart 🤞 .
Use the search function in your instance, and you’ll likely have to click the ‘search’ button twice to bring it in. Put the original URL from the developer’s instance in the search, like this:
https://lemmit.online/c/IdiotsInCars
Then you should get a link for it.
You get the 404 because your instance doesn’t know about that community yet. It has to be made aware, and then connected to it via you doing the search.
Just tried and still doesn’t work. Do you get any results when doing it that way?
I just tried it again from my account on lemmy.world, and it worked fine the first time. No issues, the link is there.
Just to be clear. From your own instance, which is lemmy.world, click the search icon in the upper right. In the search bar type:
https://lemmit.online/c/IdiotsInCars
and press the search button. You should get a link for the community. Click that to go to it and subscribe.
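As an illustration of how the original URL maps to the address your own instance uses, here is a small sketch. The helper names are made up; the `/c/name@instance` path and the `!name@instance` mention form follow how Lemmy addresses remote communities, but treat the code as an example rather than anything Lemmy ships.

```python
from urllib.parse import urlparse


def community_ref(url: str) -> tuple[str, str]:
    """Split a community URL on its home instance into (name, instance host)."""
    parsed = urlparse(url)
    parts = [p for p in parsed.path.split("/") if p]
    if not parsed.hostname or len(parts) != 2 or parts[0] != "c":
        raise ValueError(f"not a community URL: {url!r}")
    return parts[1], parsed.hostname


def local_path(url: str) -> str:
    """Path under which another instance shows the community once it knows it."""
    name, host = community_ref(url)
    return f"/c/{name}@{host}"


def mention(url: str) -> str:
    """The !name@instance form that can be pasted into posts and comments."""
    name, host = community_ref(url)
    return f"!{name}@{host}"
```

For `https://lemmit.online/c/IdiotsInCars`, `local_path` gives `/c/IdiotsInCars@lemmit.online`, which only resolves on your instance after the search step has made it aware of the community.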
It worked!
So I was using the community search here: https://lemmy.world/communities. That one wasn’t working, but using the one on the top right, that one worked! Thanks!
Awesome, glad to hear it!
Yes, it does indeed work okay from here.
I’m interested in having Reddit content show up in my feed when I use Jerboa, but for the life of me, I can not figure out how. Can anyone help? You’d have to explain it to me like you’re talking to either a 3 year old, or a 93 year old.
Step 1: go to [email protected] and check if the subreddit you want has already been requested. If so, it contains a link you can use to subscribe to it. If not, read the sidebar and make a post in the request community.
Do note that Lemmit will only copy over the starting posts, not the comments. So there is no point in requesting subs like /r/askmen etc.
Let me know if you have any further questions, and please be specific.
One thing that would be nice: The links in the messages saying:
I’ll get right on that. Check out /c/…
Would be nice if the link for /c/… was for lemmit.online. If not on there already, they don’t work since they go to the current server’s community with the given name.
Edit: This is open source… Here’s a merge request that I think fixes it: https://gitlab.com/sab_from_earth/lemmit/-/merge_requests/6/diffs
Thanks, I was gonna do this tonight, but are you sure about that format? (I’m still a relative newbie to the Fediverse)
Just to test: /c/about.
edit: Looks good, I’ll merge it :)
edit2: Deployed :)
Thanks!
Is this going to break come the end of the month, when the new API changes go into effect?
Nope, it doesn’t use the API, but relies on the RSS feed and scraping old.reddit.com. Those will probably also die at some point in the future, but this will probably keep working a bit longer.
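A rough sketch of the feed half of that approach is below. It assumes the subreddit feed is served as an Atom document with `entry`, `id`, `title` and `link` elements; that layout is an assumption, not something confirmed here, and `parse_feed` is a made-up name. Fetching and the scraping of individual post pages are left out.

```python
import xml.etree.ElementTree as ET

# Atom namespace in ElementTree's {uri}tag notation.
ATOM = "{http://www.w3.org/2005/Atom}"


def parse_feed(xml_text: str) -> list[dict]:
    """Extract id, title and link of each entry from an Atom feed document."""
    root = ET.fromstring(xml_text)
    posts = []
    for entry in root.findall(f"{ATOM}entry"):
        link = entry.find(f"{ATOM}link")
        posts.append({
            "id": entry.findtext(f"{ATOM}id", default=""),
            "title": entry.findtext(f"{ATOM}title", default=""),
            "url": link.get("href", "") if link is not None else "",
        })
    return posts
```

The feed only gives an overview of recent posts, so the full post body would still have to come from the scraped page.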
Just to be clear, what if there is already a community on Lemmy that coincides with a subreddit? Will it make another community on your instance? Or will it use the existing community?
Yeah, the bot only operates on its own instance. So you can have [email protected] and [email protected].
Makes perfect sense. Thanks.
There is no such thing as already the same community on another instance on current Lemmy. As of now all Lemmy instances can have e.g. a /c/cooking community, and they are all gonna be individual communities.
There is no such thing as already the same community on another instance on current Lemmy.
Yup, I know. Not what I was asking, though. I was asking if the posts would go to an already established community somewhere on Lemmy, and the answer is ‘no’, instead they go to a new community on the developer’s instance.
As of now all Lemmy instances can have e.g. a /c/cooking community, and they are all gonna be individual communities.
Yup, I’m aware.
It sounds as if this will run on its own instance, so as long as this instance is purely dedicated to mirrored subreddits there will be no conflict with communities on other instances. Community names can be reused on multiple instances and they are treated as separate communities.
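One way to picture this: a community is identified by its name and its instance together, never by the name alone. The sketch below is illustrative only; `CommunityId` and the host `lemmy.example` are made up.

```python
from typing import NamedTuple


class CommunityId(NamedTuple):
    """A community is identified by the pair (name, home instance)."""
    name: str
    instance: str


mirrored = CommunityId("cooking", "lemmit.online")
homegrown = CommunityId("cooking", "lemmy.example")  # hypothetical other instance

# Same name, different instance: two separate communities with separate posts.
posts: dict[CommunityId, list[str]] = {mirrored: [], homegrown: []}
posts[mirrored].append("post copied from Reddit")
```

After this runs, `posts[homegrown]` is still empty, because nothing posted to the mirror ends up in the community of the same name elsewhere.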