• Avid Amoeba
    English
    arrow-up
    120
    arrow-down
    1
    ·
    4 months ago

    Perhaps it’s becoming clear that search needs to become common, cooperatively managed infrastructure, similar to Wikipedia, and that this is in the best interest of everyone but advertisers and spammers.

    • Bizarroland@kbin.social
      arrow-up
      66
      arrow-down
      3
      ·
      4 months ago

      Too bad the Mozilla Foundation didn’t pivot to that instead of whatever the hell they’re doing with AI.

      • Avid Amoeba
        English
        arrow-up
        27
        arrow-down
        2
        ·
        4 months ago

        Truly. I wonder if ActivityPub could be used to create a resilient search engine that shares the cost among federated instances. We already have something like that in Lemmy and Mastodon, where federated data can be searched from any instance. If the data were pages crawled by some automatic crawler, federated across instances that in turn let users search through it, it might resemble a search engine. Page ranking beyond text matching could even be done by people’s up/down votes instead of some arbitrary algorithm, similar to how voting works on StackExchange or Lemmy. 🤔 I’m sure someone is thinking about this.
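
        A toy sketch of the ranking idea, just to make it concrete: match on text first, then order the matches by net votes. The `Page` class, the `search` function, and the example URLs are all made up for illustration and don't come from any existing project.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    text: str
    upvotes: int = 0
    downvotes: int = 0


def search(pages, query):
    """Return pages containing every query term, highest net votes first."""
    terms = query.lower().split()
    # Plain substring matching stands in for real text retrieval here.
    matches = [p for p in pages if all(t in p.text.lower() for t in terms)]
    # Rank by net votes (upvotes minus downvotes), best first.
    return sorted(matches, key=lambda p: p.upvotes - p.downvotes, reverse=True)


# Hypothetical crawled pages with hypothetical vote counts.
pages = [
    Page("https://a.example/sourdough", "sourdough bread recipe", upvotes=3, downvotes=1),
    Page("https://b.example/bread", "easy bread recipe", upvotes=10, downvotes=2),
    Page("https://c.example/soup", "tomato soup recipe", upvotes=50),
]

for page in search(pages, "bread recipe"):
    print(page.url, page.upvotes - page.downvotes)
```

        The soup page has the most votes but never shows up for "bread recipe", because votes only reorder pages that already match the text.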

        • deur@feddit.nl
          English
          arrow-up
          36
          ·
          4 months ago

          The answer to your question is no, federation is not an appropriate model for internet scale search.

          • Sigh_Bafanada@lemmy.world
            English
            arrow-up
            5
            ·
            4 months ago

            Yeah, I think you need a centralized system with decentralized ownership, so that no single party can fuck it up by themselves.

          • Avid Amoeba
            English
            arrow-up
            1
            ·
            edit-2
            4 months ago

            Just to be clear, what I’m referring to here is that a search would occur on a single instance. E.g. searches on lemmy.world occur on the lemmy.world instance and load lemmy.world’s servers. The federated part is in building the database on lemmy.world. E.g. a crawler or a user on lemmy.ca adds a new website and that record is federated to lemmy.world to add to its database. Another user on feddit.de upvotes a search result and that upvote is federated to lemmy.world so that the search result ranks higher for users searching on lemmy.world. In this kind of model individual search instances could in fact be very large, depending on their usage. If there’s no limit to what’s federated, every instance has to hold the complete index, which puts a lower bound on instance size. If there’s a limit (something dumb like federating only search records for *.fr domains), smaller instances that don’t have the compute and storage for the complete index become possible.
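
            A rough sketch of that model, assuming a made-up `Instance` class and a made-up activity format (plain dicts with `type`, `url`, `text`, `value`). None of this is real ActivityPub or Lemmy code. Records and votes are pushed to peers, while `search` only ever touches the local database. The `small.example` instance and the page URLs are hypothetical.

```python
from dataclasses import dataclass, field
from urllib.parse import urlparse


@dataclass
class Instance:
    name: str
    domain_suffix: str = ""  # "" accepts everything; ".fr" keeps only *.fr records
    index: dict = field(default_factory=dict)  # url -> {"text": str, "score": int}
    peers: list = field(default_factory=list)

    def accepts(self, url):
        host = urlparse(url).hostname or ""
        return host.endswith(self.domain_suffix)

    def receive(self, activity):
        """Apply a federated activity to this instance's own database."""
        url = activity["url"]
        if not self.accepts(url):
            return  # outside this instance's federation limit
        if activity["type"] == "add":
            self.index.setdefault(url, {"text": activity["text"], "score": 0})
        elif activity["type"] == "vote" and url in self.index:
            # Votes for records this instance has never seen are simply dropped.
            self.index[url]["score"] += activity["value"]

    def publish(self, activity):
        """Apply the activity locally, then federate it to every peer."""
        self.receive(activity)
        for peer in self.peers:
            peer.receive(activity)

    def search(self, query):
        """Search runs only against the local index and loads only this instance."""
        q = query.lower()
        hits = [(url, rec) for url, rec in self.index.items() if q in rec["text"].lower()]
        hits.sort(key=lambda hit: hit[1]["score"], reverse=True)
        return [url for url, _ in hits]


world = Instance("lemmy.world")
fr_only = Instance("small.example", domain_suffix=".fr")
ca = Instance("lemmy.ca", peers=[world, fr_only])
de = Instance("feddit.de", peers=[world, fr_only])

ca.publish({"type": "add", "url": "https://example.com/cheese", "text": "cheese guide"})
ca.publish({"type": "add", "url": "https://fromage.example.fr/", "text": "cheese shop"})
de.publish({"type": "vote", "url": "https://fromage.example.fr/", "value": 1})

print(world.search("cheese"))    # full index, upvoted result first
print(fr_only.search("cheese"))  # only the *.fr record was stored
```

            The full-index instance stores both records, while the limited one stores only the *.fr record and skips the rest, which is where the storage savings would come from.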

        • umbrella@lemmy.ml
          English
          arrow-up
          2
          ·
          4 months ago

          the biggest question would be how to defend it from spammers and corporations with potentially much more money.

          • Avid Amoeba
            English
            arrow-up
            2
            ·
            edit-2
            4 months ago

            One answer that’s proven to work is involving a lot of people’s labor in the editorial/curation process, similar to how posting/commenting/voting/moderation work on Lemmy and how they’ve worked on Reddit and other human-driven platforms. Corporations have proven on multiple occasions that paying for this labor is not feasible, so a system that depends on it should be corpo-resistant or capital-resistant.

            • umbrella@lemmy.ml
              English
              arrow-up
              2
              ·
              edit-2
              4 months ago

              well, reddit did that and was full of shills, bots, vote manipulation, and more. this approach completely failed for them.

              and they do put a lot of money into it.