So I’ve been troubleshooting the federation issues with some other admins:
(Thanks for the help)
So what we see is that when many federation workers run at the same time, they become too slow, which causes them to time out and fail.
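To illustrate the idea of capping how many deliveries run at once, here is a toy Python sketch. It is not Lemmy’s code (Lemmy is written in Rust), and `deliver_all`, `fake_send`, and the `instance-N.example` hosts are made-up names for this example only. It just shows a semaphore keeping the number of in-flight sends at or below a limit.

```python
import asyncio


async def deliver_all(targets, max_workers, send):
    """Run send(target) for every target with at most max_workers in flight.

    Returns the results (in the same order as targets) and the highest
    number of sends that were in flight at the same moment.
    """
    slots = asyncio.Semaphore(max_workers)
    in_flight = 0
    peak = 0

    async def worker(target):
        nonlocal in_flight, peak
        async with slots:  # wait here until a slot is free
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await send(target)
            finally:
                in_flight -= 1

    results = await asyncio.gather(*(worker(t) for t in targets))
    return results, peak


async def fake_send(target):
    # Hypothetical stand-in for an HTTP delivery: sleeps instead of doing network I/O.
    await asyncio.sleep(0.001)
    return target


def run_demo(n_targets=50, max_workers=8):
    # Placeholder host names, not real instances.
    targets = [f"instance-{i}.example" for i in range(n_targets)]
    return asyncio.run(deliver_all(targets, max_workers, fake_send))
```

Calling `run_demo()` pushes 50 fake deliveries through 8 slots; the returned peak never goes above the cap, no matter how many targets are queued.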
I had federation workers set to 200000. I’ve now lowered that to 8192, and set the activitypub_federation log level to debug to get queue stats:
RUST_LOG="warn,lemmy_server=warn,lemmy_api=warn,lemmy_api_common=warn,lemmy_api_crud=warn,lemmy_apub=warn,lemmy_db_schema=warn,lemmy_db_views=warn,lemmy_db_views_actor=warn,lemmy_db_views_moderator=warn,lemmy_routes=warn,lemmy_utils=warn,lemmy_websocket=warn,activitypub_federation=debug"
Also, I saw that many workers were retrying deliveries to servers that are unreachable, so I’ve blocked some of these servers:
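For anyone who wants to adjust these levels, here is a small Python sketch that builds the same RUST_LOG filter string from a default level plus per-crate overrides. `build_rust_log` and `LEMMY_CRATES` are names invented for this example; the crate names are the ones from the value above.

```python
def build_rust_log(default_level, overrides):
    """Join a default level and per-target overrides into a RUST_LOG filter string."""
    directives = [default_level]
    directives += [f"{target}={level}" for target, level in overrides.items()]
    return ",".join(directives)


# Crate names copied from the RUST_LOG value in the post.
LEMMY_CRATES = [
    "lemmy_server",
    "lemmy_api",
    "lemmy_api_common",
    "lemmy_api_crud",
    "lemmy_apub",
    "lemmy_db_schema",
    "lemmy_db_views",
    "lemmy_db_views_actor",
    "lemmy_db_views_moderator",
    "lemmy_routes",
    "lemmy_utils",
    "lemmy_websocket",
]

# Everything stays at warn; only the federation library is raised to debug.
overrides = {crate: "warn" for crate in LEMMY_CRATES}
overrides["activitypub_federation"] = "debug"

RUST_LOG = build_rust_log("warn", overrides)
```

As far as I understand the filter syntax, the leading bare `warn` already sets the default for every target, so the per-crate `warn` entries mostly repeat it, and the `activitypub_federation=debug` entry is what turns on the queue stats.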
commallama.social,mayheminc.win,lemmy.name,lm.runnerd.net,frostbyrne.io,be-lemmy.org,lemmonade.marbledfennec.net,lemmy.sarcasticdeveloper.com,lemmy.kosapps.com,pawb.social,kbin.wageoffsite.com,lemmy.iswhereits.at,lemmy.easfrq.live,lemmy.friheter.com,lmy.rndmm.us,kbin.korgen.xyz
This gave good results: far fewer active workers, and so fewer timeouts. (I see that timeouts start above 3000 active workers.)
(If you own one of these servers, let me know once it’s back up, so I can un-block it)
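For re-checking whether blocked instances are back up, here is a rough Python sketch. It only tests whether a TCP connection to port 443 opens, which is my own choice of check and a weak signal: a server can accept connections and still fail at federation. The function names are made up for this example.

```python
import socket


def parse_host_list(text):
    """Split a comma-separated host list, dropping blanks and surrounding spaces."""
    return [h.strip() for h in text.split(",") if h.strip()]


def is_reachable(host, port=443, timeout=5.0):
    """Return True if a TCP connection to host:port opens within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


def split_by_reachability(hosts, probe=is_reachable):
    """Return (up, down) lists; probe can be swapped out, e.g. for testing."""
    up, down = [], []
    for host in hosts:
        (up if probe(host) else down).append(host)
    return up, down
```

Usage would be something like `up, down = split_by_reachability(parse_host_list(blocked))`, where `blocked` is the comma-separated list above; anything that lands in `up` is a candidate for un-blocking after a closer look.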
Now it’s after midnight, so I’m going to bed. More troubleshooting will surely follow tomorrow and over the weekend.
Please let me know if you see improvements, or still have many issues.
So as of right now, https://lemmy.ca still seems bugged.
These two show 0 comments here on lemmy.world, while comments clearly exist over at lemmy.ca.
The opposite here: I made a test post in the microcontroller community (hosted on lemmy.ca), so lemmy.world thinks there is +1 comment. But the home instance at http://lemmy.ca/c/microcontroller sees 0 comments, so my comment fails to traverse the federation to lemmy.ca.
So both imports and exports to these communities on lemmy.ca seem bugged.
I’m having very similar issues between my Lemmy.ca and Lemmy.world accounts as well.
I was actually trying to post this comment using .world, but it was lagging. So I switched over to my .ca account lol.