I have been trying to create a post in the Canada community. Scuttlebutt is that the post limit was set to 10,000 characters, but has since been set to 50,000 characters. My post has 9961 UTF-8 characters (9969 characters overall, 8396 characters excluding spaces) and when I hit submit the submission never completes.
Wait… what??
If the limit is being counted as though every character were single-byte ASCII, then the issues make a lot of sense… but then this is also one of the more brain-dead bugs from a programmer’s standpoint. Everything is in UTF-8 these days, especially if you want i18n, as Lemmy seems to do with almost any post/comment submission or community creation. Counting bytes and crapping out at 5k characters or fewer, because each non-ASCII character takes two or more bytes in UTF-8, is just… really, really bad.
i did another test post (https://lemmy.ca/post/795726) with some emoji, and it appears that it’s counting each one as two “characters”. so yes, it looks like it’s counting encoded units (bytes or something similar) rather than actual characters.
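
for anyone who wants to check their own text before posting, here is a rough Python sketch that measures a string three ways: Unicode code points, UTF-8 bytes, and UTF-16 code units. The function name `length_report` is made up for illustration, and this says nothing about what Lemmy's code actually does; it only shows how the three counts can disagree.

```python
def length_report(text: str) -> dict[str, int]:
    """Return the length of `text` under three common definitions of 'character'."""
    return {
        # Number of Unicode code points (what Python's len() counts).
        "code_points": len(text),
        # Number of bytes once the text is encoded as UTF-8.
        "utf8_bytes": len(text.encode("utf-8")),
        # Number of 16-bit code units in UTF-16 (2 bytes per unit, no BOM).
        "utf16_units": len(text.encode("utf-16-le")) // 2,
    }


if __name__ == "__main__":
    # Plain ASCII, one accented letter (U+00E9), and one emoji (U+1F600).
    for sample in ["hello", "caf\u00e9", "\U0001F600"]:
        print(repr(sample), length_report(sample))
```

"hello" comes out as 5 on all three counts, "café" is 4 code points but 5 UTF-8 bytes, and the emoji is 1 code point, 4 UTF-8 bytes, and 2 UTF-16 units. So an emoji being counted as two would fit a UTF-16 code-unit count (the kind of length a JavaScript string reports) better than a raw UTF-8 byte count, though that is only a guess from the outside.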