• alessandroOP
    7 months ago

    …after getting fed up with ElevenLabs (a popular TTS service), I chose to take a different route.

    The TTS is generated with ttspeaker… and then I added a bit of “flavor” by passing the output through an RVC model to give the newscaster a more poignant tone. Let me know what you think about it.

    (Also, I am planning to change the newscaster model.)

    • olicvb
      7 months ago

      Have you looked into doing TTS on your own hardware? I recently managed to get this running (for free!!).

      It’s pretty good imo. I found that the trick is to generate the voice with Bark and then run “RVC Beta Demo” on top of the result. Coming from image generation, RVC feels like a LoRA.
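A rough sketch of that two-stage idea (TTS first, voice conversion second), in case it helps to see the shape of it. The `bark_tts` function uses Bark's documented `preload_models` and `generate_audio` calls and needs the `bark` package plus its model download. RVC has no single standard Python API that I can vouch for, so the conversion stage is just a callable you plug in. `run_pipeline`, `fake_tts`, and `fake_rvc` are hypothetical names made up for this illustration.

```python
from typing import Callable, List

Samples = List[float]  # mono audio samples; a stand-in type for this sketch


def bark_tts(text: str) -> Samples:
    """Stage 1: generate speech with Bark (requires the `bark` package and its models)."""
    # Imported lazily because it is a heavy dependency.
    from bark import generate_audio, preload_models

    preload_models()
    return generate_audio(text).tolist()


def run_pipeline(
    text: str,
    tts: Callable[[str], Samples],
    convert_voice: Callable[[Samples], Samples],
) -> Samples:
    """Run TTS first, then pass the audio through the voice-conversion stage."""
    return convert_voice(tts(text))


# Hypothetical stand-ins so the sketch runs without any model downloads.
def fake_tts(text: str) -> Samples:
    # One dummy sample per character of input text.
    return [0.1] * len(text)


def fake_rvc(samples: Samples) -> Samples:
    # Pretend "voice conversion": simply halve the amplitude.
    return [s * 0.5 for s in samples]


if __name__ == "__main__":
    audio = run_pipeline("Good evening", fake_tts, fake_rvc)
    print(len(audio))
```

To try it with the real model, pass `bark_tts` as the `tts` argument and replace `fake_rvc` with whatever wrapper you have around your RVC setup.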

      I’ve barely played with it so far, but I’m sure you can reach a crazy level of detail and customisation.

      Have fun ^^

      • alessandroOP
        7 months ago

        The current problem with running these AI tools on local hardware is that, for supposedly tiny tools, they require huge downloads and often a specific version of Python (3.8, while most modern Linux distros ship 3.10+). Most of the time you have to pull in a massive download (6+ GiB of libraries, pip packages and various dependencies/SDKs)… just to give it a single try. If you mess something up, you have to start all over again.
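One way to soften the "mess something up and start over" part is to check the interpreter version up front and keep each tool in its own throwaway virtual environment. This is a minimal sketch using only the standard library. The required version `(3, 8)` is taken from the example above, and `matches_required`, `make_sandbox`, and the `tts-env` directory name are hypothetical names for illustration. Note that `venv` can only create environments for the interpreter it runs under, so it does not install Python 3.8 for you.

```python
import sys
import tempfile
import venv
from pathlib import Path

REQUIRED = (3, 8)  # assumption: the version mentioned in the post above


def matches_required(version_info, required=REQUIRED) -> bool:
    """Return True when the major.minor pair equals the required pair."""
    return tuple(version_info[:2]) == tuple(required)


def make_sandbox(env_dir: Path) -> Path:
    """Create an isolated virtual environment (without pip, to keep it quick)."""
    venv.create(env_dir, with_pip=False)
    return env_dir


if __name__ == "__main__":
    if not matches_required(sys.version_info):
        running = f"{sys.version_info.major}.{sys.version_info.minor}"
        wanted = f"{REQUIRED[0]}.{REQUIRED[1]}"
        print(f"Running Python {running}, but the tool wants {wanted}")

    # A temporary directory means a broken environment can simply be discarded.
    with tempfile.TemporaryDirectory() as tmp:
        env = make_sandbox(Path(tmp) / "tts-env")
        print("Environment created:", (env / "pyvenv.cfg").exists())
```

This does not shrink the multi-GiB downloads, but a failed experiment then costs one deleted directory instead of a polluted system Python.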

        What I’ve found more useful is using Hugging Face. I had forgotten about Bark! Thanks for reminding me… luckily it is already available on Hugging Face here