Transparency in moderation

auk@slrpnk.net · edited 4 months ago

Five@slrpnk.net · edited 4 months ago

I think you’re right to be concerned with the trade-off between transparency and privacy. ActivityPub infrastructure technically exposes a lot of things that, on other social media, would only be shared between company employees and their advertising partners, but due to the discretion of the people implementing front-end software, most of that information is not exposed to the general public. While the Fediverse has technical transparency, it also has functional privacy. The developers of Lemmy frontends deserve a lot of credit for their caution and care.

I think a spot-check on a member of the Fediverse would make the inner workings of Santabot easier to understand. It may be difficult to do that, though, without breaking some of the norms about member privacy that we have been carefully building as a culture.

One solution is to let members opt in to having their Santabot analysis shared publicly. I think I might be one of your borderline cases; I give my consent if you’d like to use me as an example.

auk@slrpnk.net · 4 months ago

I agree. I think spot-checking can do a lot to bring transparency into the picture, and if it’s done carefully, it’ll be possible to avoid exposing too much about people who haven’t agreed to it.

I thought about it for a while, and I think a weekly spot-check post for a handful of controversial users, showing a visualization of their rank and where it is coming from, might work. Here’s one quickly hacked-up example in the form of a barcode. Time goes from left to right, blue stripes are positive rank, and red stripes are negative rank. Here’s your breakdown for the last month:

There are three big red stripes. From left to right, they are these threads:

There is also plenty of blue, though, so you’re comfortably over the line as a nice person under the current parameter set. It’s worth mentioning that a lot of the blue stripes are opinions that are “unpopular” from the point of view of the average liberal but popular on Lemmy, or detailed takedowns of MBFC:

My opinion is that most of the time, someone who’s garnering a healthy mixture of blue and red is probably showing good faith, and when someone is managing to garner mostly red, it’s more likely to be an issue of quality of engagement, not even necessarily that they’re trying to say something unpopular that the bot is then censoring. But, of course, the proof is in how it works in practice on real users and real content.

I think doing some type of visualization, maybe automatically generated, and showing the progression over time of someone’s rank depending on particular comments, can help to inform the discussion. I’m sure it won’t stop people from accusing me of all kinds of malfeasance in the way the bot operates, but it can help to put more eyes on it from people who are open and interested in seeing how it’s working.
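For anyone curious how a stripe timeline like the one described above could be generated automatically, here is a minimal text-only sketch in Python. It is not the bot's actual code. The `(timestamp, rank_delta)` event format, the function name `stripe_timeline`, and the sample data are all assumptions made for illustration; `+` stands in for a blue stripe and `-` for a red one.

```python
from datetime import datetime, timedelta

def stripe_timeline(events, start, days=30, width=30):
    """Render rank changes as one row of characters, oldest on the left.

    events: iterable of (timestamp, rank_delta) pairs (hypothetical format).
    '+' marks a column with net positive rank, '-' net negative, '.' no change.
    """
    totals = [0.0] * width
    span = timedelta(days=days).total_seconds()
    for when, delta in events:
        offset = (when - start).total_seconds()
        if 0 <= offset < span:
            # Map the event's position in the time span to a column index.
            totals[int(offset * width // span)] += delta
    return "".join("+" if t > 0 else "-" if t < 0 else "." for t in totals)

# Made-up sample data: two upvoted comments, one heavily downvoted thread.
start = datetime(2024, 1, 1)
events = [
    (start + timedelta(days=2), 1.5),
    (start + timedelta(days=2, hours=3), 0.5),
    (start + timedelta(days=10), -4.0),
    (start + timedelta(days=25), 2.0),
]
timeline = stripe_timeline(events, start, days=30, width=30)
print(timeline)
```

A real version would draw colored stripes scaled by the size of each change, but the bucketing step would look much the same.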

Five@slrpnk.net · edited 4 months ago

I love this. Reddit used to do a yearly thing where they’d send you your top upvoted and downvoted posts and comments, which was always nostalgic and fascinating to me as a user. Like Canvas, I think it’s an idea worth copying with a more federated framework.

Maybe you could write an action that allows Fediverse members to get a similar breakdown and visualization automatically generated and then delivered to them via direct message. People who are curious about how the bot works can message the bot and see how it views them, and then they can share the details publicly if they so choose. I think this could be really popular.

auk@slrpnk.net · 4 months ago

How about this?

That’s 30 days of Santa’s ranking for your user, showing the comment threads that made big impacts up or down. The dotted horizontal line is 0, and the cutoff for banning a person is down below that line. Here are some anonymized examples of people who got banned:

They were doing well until, in the pink part, they posted 28 comments heatedly insisting that there’s no genocide in Gaza.

I think this is informative about how the system works without being useful for gathering analytics to rig the system. You can see what kind of participation impacts it in what ways, and how to put it into the context of the sum total of your participation for the month, but the emphasis is on the comments and behavior instead of on the math. What do you think?
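As a rough illustration of what a line graph like this plots, the sketch below accumulates per-thread rank changes into a running total and compares the final value with a cutoff. The cutoff value, the thread names, and the idea that the ban decision looks only at the latest total are assumptions for the example; the thread only says that the cutoff sits below the zero line.

```python
from itertools import accumulate

BAN_CUTOFF = -5.0  # hypothetical value; the real cutoff is not given in the thread

def running_rank(deltas):
    """Cumulative rank after each thread, in chronological order."""
    return list(accumulate(deltas))

def biggest_impacts(threads, n=3):
    """Threads with the largest rank change in either direction."""
    return sorted(threads, key=lambda t: abs(t[1]), reverse=True)[:n]

# Made-up data: (thread label, rank change caused by that thread).
threads = [("thread A", 2.0), ("thread B", 1.0), ("thread C", -9.0), ("thread D", 0.5)]

ranks = running_rank(delta for _, delta in threads)
banned = ranks[-1] < BAN_CUTOFF
top_two = biggest_impacts(threads, n=2)

print(ranks, banned, top_two)
```

Labeling the graph with the output of something like `biggest_impacts` keeps the emphasis on the comments and behavior rather than on the underlying math.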

Five@slrpnk.net · 4 months ago

Yes, this is very informative.

It’s an instructive visualization, but I like it less. The spectral timeline shows how big the changes are and places them chronologically, and you can see from a distance how contentious the month was. The line graph tells a story about being rewarded or punished for being agreeable or contradictory to the zeitgeist. It reads like the timeline for an American FICO progress graph or a Chinese social credit score, things I have a visceral reaction to. It’s a dopamine hit to have a comment collect upvotes, but I’m more proud of positions that I’m confident will age well and were presented well, but were downvoted anyway. They are evidence that I’m not in an echo chamber, and that I’m not being ignored. If I could pick which graph got delivered to my stocking, I’d pick the spectral timeline.

The line graph is clearly better suited for discussing how the system functions, though. For example, it appears a new member won’t get banned for a few negative interactions early in their career, as the cutoff is below zero. It also appears that the second banned user, if they wait 15 days, will have a positive Santabot assessment regardless of how far down the valley they went at the start of the month. You chose the right level of detail to maintain their anonymity.
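One mechanism that would produce the recovery described above is a rolling window, where only the last 30 days of activity count toward the rank. The thread does not say how older comments are aged out, so treat this as an assumption; `windowed_rank` and the sample numbers are hypothetical.

```python
def windowed_rank(daily_deltas, today, window=30):
    """Sum of rank changes over the last `window` days, up to and including `today`."""
    first = max(0, today - window + 1)
    return sum(daily_deltas[first:today + 1])

# Made-up history: one very bad day early on, then steady small gains.
daily_deltas = [0.0] * 60
daily_deltas[5] = -10.0
for day in range(20, 60):
    daily_deltas[day] = 0.5

rank_day_29 = windowed_rank(daily_deltas, today=29)  # bad day still inside the window
rank_day_35 = windowed_rank(daily_deltas, today=35)  # bad day has aged out
print(rank_day_29, rank_day_35)
```

Under this assumption, how deep the valley was stops mattering once the bad day leaves the window.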

auk@slrpnk.net · 4 months ago

What about this?

I see what you’re saying. The line graph feels kind of paternalistic. It’s saying that if you disagree with the herd, you’re going to lose your value. I think the spectral timeline with a legend may be better, at least for a frequent-posting and follow-up use case.

> The line graph is clearly better suited for discussing how the system functions though. For example, it appears a new member won’t get banned for a few negative interactions early in their career, as the cutoff is below zero.

Yes. We give some leeway so that someone doesn’t get penalized for a single random downvote early in their career, but we still need to be reactive enough that if someone makes a new account and posts a garbage comment, we jump on it. I have a process that’s meant to deal with that, but it’s tricky. I’m still working it out, and I rolled it out a little bit early so that it’s now jumping the gun and deleting some comments from people who really shouldn’t have their comments deleted.

It’s tough because it’s hard to test in the abstract, and by definition, the people who don’t comment a lot don’t leave too many comments to be able to use as test cases. What I’m planning to do is work on it a little bit more, testing in production, and once it’s worked out, I’ll make a post explaining it all.

Five@slrpnk.net · 4 months ago

I gotta say, you’re really good at making visualizations. I like this one best, but even the ones I liked less were extremely informative and readable.

auk@slrpnk.net · 4 months ago

I played around with possibilities for a while, and did some more fixing and tweaking of the algorithm and visualization tools. Here’s one way I think it could work. Once a week, the bot could post a breakdown of three random users who are permitted to post, and three random users who aren’t permitted to post. Right now, that breakdown would be:

Permitted to post:

Not permitted to post:

That means that anyone who wants to can check up on how it’s making its decisions. In addition, I can provide an explanation to anyone who wants one for their own user.

Those charts are anonymized. I’ll send the users in question to some of the admins to see what they think. I think it’s okay to post a few users, as long as it’s random and not repetitive. I don’t think it would come across as singling anyone out or making them uncomfortable, but I’m curious what other people think.
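A minimal sketch of how the weekly selection could work, assuming the bot has a list of users on each side of the cutoff. The function name, the `recently_shown` exclusion list (one way to keep the picks from being repetitive), and the placeholder labels are assumptions for illustration, not the bot's actual interface.

```python
import random

def weekly_spot_check(permitted, not_permitted, recently_shown=(), per_group=3, seed=None):
    """Pick random users from each group and give them anonymous labels.

    Returns label -> user mappings; only the labels would appear in the public post.
    """
    rng = random.Random(seed)
    skip = set(recently_shown)

    def pick(users, label):
        # Leave out anyone featured recently so the same users are not repeated.
        pool = [u for u in users if u not in skip]
        chosen = rng.sample(pool, min(per_group, len(pool)))
        return {f"{label} {i}": user for i, user in enumerate(chosen, start=1)}

    return {
        "permitted": pick(permitted, "Permitted user"),
        "not_permitted": pick(not_permitted, "Not-permitted user"),
    }

# Made-up user names for the example.
permitted = ["u1", "u2", "u3", "u4", "u5"]
not_permitted = ["u6", "u7", "u8", "u9"]
check = weekly_spot_check(permitted, not_permitted, recently_shown=["u1"], seed=0)
print(list(check["permitted"]), list(check["not_permitted"]))
```

Replacing names with labels only hides the name itself. It does not hide which threads a chart refers to, so the charts still need care.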

Five@slrpnk.net · edited 4 months ago

It might be fairly easy to de-anonymize users as not all users post in all threads, and identifying a user based on which threads they post in and generally what the response was to their posts isn’t impossible.

On the other hand, it doesn’t reveal information that we’ve decided should be treated as special, like who is voting on the comments and posts. When posting a controversial Twitter screenshot of a non-public figure, it’s internet etiquette and good form to blur the target’s name, even though the tweet can be found via text search. This raises the effort needed to attack the user a little, but it also communicates through actions that trolling is being discouraged, which I think is the most effective deterrent.

The measures you’re taking seem to be in line with that internet etiquette. Especially considering the relatively small exposure your project is getting (at this point it seems it’s just us talking in this thread, for example), the precautions you have in place should be enough. You may want to revisit this if you get complaints of harassment or when your project develops a much larger audience.

Five@slrpnk.net · 4 months ago

Is there a cap to how much stomp a user can have through their votes? By accumulating enough zeitgeist points, can a single user ban a new user from !pleasantpolitics with downvotes?

auk@slrpnk.net · 4 months ago

There’s not a cap. That type of activity is, in fact, a classical failure mode of this type of network. Just like people learned to build link farms to artificially give page rank to fake pages, people can learn to farm for zeitgeist points to then give or take away rank from some targeted user. That is one reason I’m being cagey about giving away introspection tools or detailed road maps of people’s points. I don’t want to facilitate someone getting feedback about how well an effort like that is working.

I’m a little more concerned about people accumulating points and then upvoting a troll account to make sure it doesn’t get banned than I am about people downvote-bombing someone they disagree with to ban them. They are both concerns, though. There are ways around both through tweaks to the algorithm, but I’ve constantly been surprised by how the tools work out in practice compared to my theory about them, so I’m waiting until it happens before I start messing with solutions. I do have some ideas in mind for how to deal with it. I am guessing that in the long run it won’t be too big an issue, but I want to see how it works out in practice before implementing the countermeasures I was thinking of.
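To make the question about a cap concrete, here is a minimal sketch of what capping a single voter's influence could look like. It is not the bot's algorithm and not one of the countermeasures mentioned above, which are not described in the thread. It assumes that a vote's weight equals the voter's own rank and that negative-ranked voters carry no weight; `weighted_score` and all the numbers are hypothetical.

```python
def weighted_score(votes, voter_rank, cap=None):
    """Sum votes weighted by each voter's rank, optionally capping any one voter.

    votes: list of (voter, direction) with direction +1 or -1.
    voter_rank: dict mapping voter -> rank (assumed to act as vote weight).
    """
    total = 0.0
    for voter, direction in votes:
        weight = max(voter_rank.get(voter, 0.0), 0.0)  # no weight for negative rank
        if cap is not None:
            weight = min(weight, cap)
        total += direction * weight
    return total

# Made-up scenario: one high-rank user downvotes, three ordinary users upvote.
voter_rank = {"heavy": 50.0, "a": 1.0, "b": 1.0, "c": 1.0}
votes = [("heavy", -1), ("a", 1), ("b", 1), ("c", 1)]

uncapped = weighted_score(votes, voter_rank)
capped = weighted_score(votes, voter_rank, cap=2.0)
print(uncapped, capped)
```

Without a cap the single heavy voter outweighs everyone else, and with one the three ordinary upvotes win. The trade-off is that a cap also limits how much well-established users can contribute to the ranking.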