About 20 days ago, I made a blog post about an idea I had for a better federated search engine model.

It didn’t take long for it to turn into something I am fixated on.

I am putting the code out. It’s not ready or working yet, but it is something I am really happy to be making, and designing it has filled my time with joy.


My current plan is the following:

  1. Get the basic web-ring creation process down
  2. Get scraping jobs functional
  3. Provide a basic query system
  4. Implement basic user accounts
  5. Implement basic federation
  6. Implement basic moderation

Once I am done with the core features that I have in mind, I will start working on adding more features and quality of life improvements.


Some features I want to work on to make this software more enticing to administrators:

  1. The ability to customize what is publicly accessible.
  2. The ability to edit the pages’ HTML style on the fly, without having to recompile.
  3. Containers for easy deployment.

As for application design, I am borrowing from my experience developing Android applications, along with cherry-picking from projects @[email protected] made:

  1. MVC design, with static pages to provide the fastest loading experience for users
  2. Bootstrap to make the pages responsive for any device
  3. Diesel to abstract database interaction and migration.
  4. Handlebars for view templating
  5. Axum as the HTTP core

Hopefully these design decisions keep my application’s technical debt as low as possible.


If you have any advice or suggestions, please share them. I want to know how I can do better and avoid common pitfalls for newcomers!

If you have criticisms, please be constructive and keep in mind that I am doing this because it makes me happy.

  • theneverfox@pawb.social
    23 hours ago

    I love the idea

    I’m starting to look at Airflow for my own project. Not sure if you’ve heard of it or projects like it, but it seems like a great foundation for a scraper. I’m still evaluating options, but so far it’s my pick

    Hit me up if you get stuck or make a breakthrough. I’ve got a pretty good handle on ActivityPub and the Lemmy API, and your project would add a lot to mine

      • theneverfox@pawb.social
        2 hours ago

        Basically it’s a system for running all kinds of tasks arranged as directed acyclic graphs (DAGs), primarily based around data ingestion. It’s very flexible and powerful, which also means there’s a steep learning curve.

        To give an example, you could have a task that gathers a list of instances and updates the database. It could also spawn a new task for each one to check if the server is up and get the version number, and you could even have it email you to create an account for new instances.

        Then from the task that made sure the server is up, you could spawn a new task that gets communities, which then spawns new tasks to ingest posts from it

        And when this whole process is done, you could have it kick off a new set of tasks to do the indexing or whatever else on the up to date data set

        It has some nice visualization of the process, you can allocate workers across devices, you can kick off the process through an API… You can use it for anything from monitoring to scraping and running map-reduce over the results. You could even federate and wire into ActivityPub directly, use their APIs, or mix and match with scraping

        I’ve never worked with crawlers and I’m not sure what angle you’re going to attack this from, but if normal crawlers don’t play well with the fediverse this is an option
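
The task chain described in the comment above (gather instances, check that each server is up, fetch communities, ingest posts, then index) is a directed acyclic graph of tasks. Here is a minimal sketch of that idea using only the Python standard library. It is not Airflow's API, and it leaves out the per-instance fan-out, workers, and scheduling; the task names and the `PIPELINE` mapping are hypothetical and only mirror the steps in the comment.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical task names; each task maps to the set of tasks it depends on.
PIPELINE = {
    "gather_instances": set(),
    "check_instance_up": {"gather_instances"},
    "fetch_communities": {"check_instance_up"},
    "ingest_posts": {"fetch_communities"},
    "build_index": {"ingest_posts"},
}


def run_order(graph):
    """Return an order in which every task comes after its dependencies."""
    return list(TopologicalSorter(graph).static_order())


def is_acyclic(graph):
    """Return True if the dependency graph has no cycles."""
    try:
        run_order(graph)
    except CycleError:
        return False
    return True


def run(graph, handlers):
    """Call each task's handler in dependency order and collect the results.

    Each handler receives the results of the tasks that ran before it.
    """
    results = {}
    for name in run_order(graph):
        results[name] = handlers[name](results)
    return results


if __name__ == "__main__":
    # Stub handlers that only record that the task ran.
    stubs = {name: (lambda done, n=name: f"{n} done") for name in PIPELINE}
    for task, outcome in run(PIPELINE, stubs).items():
        print(task, "->", outcome)
```

A real scheduler adds retries, parallel workers, and one spawned task per instance or community, but the core contract is the same: a task only starts once everything it depends on has finished, and a cycle in the graph is rejected.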