• southerntofu@lemmy.ml
    3 years ago

    Fun fun fun. Where’s the source code comrade? :)

    Also, can you maybe tell us more about what data set you trained your model with?

    • wazowski@lemmy.mlOP
      3 years ago

      ty 🤗

the code for the bot isn’t ready yet, because i just started the project; i’d be comfortable releasing it once i’ve added some more functionality and customisability, around the time i make the bot public

and there’s one issue i have to overcome before making the bot itself public: the computer i’m running it on is kinda old (from around 2010), and its gpu doesn’t support recent versions of CUDA, the gpu compute library used by one of the libraries in this project. so all the inference runs on the CPU, which is slow and very computationally intensive
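the fallback itself is simple; a minimal sketch (hypothetical helper name — the bot’s code isn’t released, so this is just illustrative of the idea):

```python
# Hedged sketch of the CPU fallback described above; `pick_device` is a
# hypothetical helper, not taken from the bot's unreleased source.
def pick_device(cuda_available: bool) -> str:
    """Fall back to CPU when no usable CUDA GPU is present."""
    return "cuda" if cuda_available else "cpu"

# With a framework like PyTorch you would feed it torch.cuda.is_available():
#   device = pick_device(torch.cuda.is_available())
#   model.to(device)  # on an old gpu without recent CUDA, this ends up "cpu"
```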

i’m not sure what to host this on: i’ll probably wait until GPU prices come down and see if i can buy some modern-ish GPU to run this on, bc renting that much GPU compute can cost a looot

i’m using a pre-trained model for this called BART, because training a good NLP transformer model of such magnitude (~0.5 billion parameters) is an astronomical task, not only in terms of computation but also in terms of the required expertise, which i certainly don’t possess at the moment (i just graduated high school lol)
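for anyone curious what using a pre-trained BART looks like: a hedged sketch assuming the Hugging Face `transformers` library and the published `facebook/bart-large-cnn` checkpoint (a BART model fine-tuned for summarisation — the bot’s actual code is unreleased, so details may differ). BART checkpoints only accept a limited input window (on the order of 1024 tokens), so long threads have to be split first:

```python
# Hypothetical helpers, illustrative only -- not from the bot's source.
def chunk_words(text: str, max_words: int = 700) -> list[str]:
    """Split text into word-bounded chunks that fit the model's input window."""
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]

def load_summarizer():
    # Imported lazily: loading the ~0.5B-parameter model is the slow part,
    # and on a machine without a usable GPU it all runs on the CPU.
    from transformers import pipeline
    return pipeline("summarization", model="facebook/bart-large-cnn")

# usage (downloads the model weights on first run):
#   bart = load_summarizer()
#   for chunk in chunk_words(long_text):
#       print(bart(chunk, min_length=30, max_length=130)[0]["summary_text"])
```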

      let me know if you’d like to use something like this bot :)

    • wazowski@lemmy.mlOP
      3 years ago

      ty :)

      i’m just starting to learn the telegram bot api, so idk what’s possible and what’s not 🤷‍♀️

    • wazowski@lemmy.mlOP
      3 years ago

hi, could you tell me what kind of features/functionality you’d like to see in this kind of bot?

• choosing the min/max character count of the resulting text?
      • choosing which model for summarisation you’d want to use?
      • how long would you be comfortable waiting for a response before you just give up?
      • any other suggestions?
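the options above could map onto a small per-chat config; a sketch with hypothetical names and default values (nothing here is from the bot’s unreleased code):

```python
from dataclasses import dataclass

@dataclass
class SummaryConfig:
    min_chars: int = 200    # shortest acceptable summary
    max_chars: int = 1000   # longest acceptable summary
    model_name: str = "facebook/bart-large-cnn"  # which summarisation model
    timeout_s: int = 60     # give up and report failure after this long

    def __post_init__(self):
        # min/max bounds only make sense when min is strictly smaller
        if self.min_chars >= self.max_chars:
            raise ValueError("min_chars must be smaller than max_chars")
```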