Google Will Pay Reddit $60M a Year to Use Its Content for AI: Report
The move boosts revenue for Reddit ahead of its planned stock launch.

  • bbkpr@lemmy.world
    English · 13 up, 2 down · 9 months ago

    Since half or more of reddit is now bots and shills, I don’t imagine the training data is going to be great. That’s fine, Gemini already sucks, so it’ll be hard to make it worse.

    • Dexx1s@lemmy.world
      English · 3 up, 2 down · 9 months ago

      The data being generated now, sure, but there are still years of actually useful data there.

      Then add on the remaining half of comments that are from sensible users and it’s a decent, and still fairly unique, dataset.

      • bbkpr@lemmy.world
        English · 5 up, 1 down · edited 9 months ago

        There are many, many, many things posted as fact over the years on reddit that are not only untrue, but dangerous or even deadly in the case of some of the most idiotic advice given. Telling them all apart falls to the poor 3rd-world contractors the big commercial AI companies exploit, er, use to “train” their stochastic parrots, and I wish them good luck.

        • GluWu@lemm.ee
          English · 2 up · edited 9 months ago

          That was one of my favorite shitposting formats. I would type a whole paragraph with technical details and real knowledge. Only the people who actually knew what I was talking about would realize it’s a shitpost.

          • bbkpr@lemmy.world
            English · 3 up · edited 9 months ago

            Yep, and a lot of reddit is thinly veiled shitposts, bots, and uncredited karma-whoring reposts of stolen content (the commercial AI companies should feel right at home here). Some of them are to anger the self-righteous redditors who come to PC police anyone who dares speak against the far left zeitgeist. But most importantly, so, so many of them are just for the lols.

            The scariest part is how many of those drawn-out, apparently accurate but actually nonsense posts/comments end up near the top, with massive numbers of votes from those who think “well that sounds reasonable” but know nothing of the subject itself.

            Semi-related: I really loved the shitposts where the guy would tell an elaborate story, and end it with his dad beating the shit out of him with jumper cables. Now that’s quality reddit content.