• tojikomori@kbin.social · 7 points · 2 years ago

      This reply’s interesting:

      How can data licensed under the CC-BY-SA license (that SO content is licensed under) be “misused”? The license explicitly allows others to do essentially anything they want with the data as long as attribution is given, in particular profit off of it.

      When SO content is absorbed as parametric knowledge, I’d expect the outcome to fail both the “BY” and the “SA” clauses, since models can’t provide attribution for it and their output won’t share the license. That’s true even if the output is considered public domain: CC-BY-SA content can’t be relicensed under a public-domain-equivalent license. It seems practically indistinguishable from using any other in-copyright content as training material.

      None of that’s to say SO is right to stop data dumps. It feels like they’re trying to find a technical solution to a legal problem, perhaps even one that rises to criminality on the part of OpenAI and others?

  • lightrush · 12 points · 2 years ago

    Please let this not be a sign that the enshittification of StackExchange has begun.

    • Garrathian@beehaw.org · 3 points · 2 years ago

      Well, I know mods at Stack Overflow were wanting to mutiny because the owners wanted to start incorporating AI responses to questions posted there, or something like that.

      • AbelianGrape@beehaw.org · 6 points · 2 years ago

        They are not allowing moderators to remove replies based solely on the fact that they were written by AI, regardless of how much evidence there is to that fact.

        I only ever interact with Stack Overflow to read like 10-year-old responses to random problems I run into, and even I want the moderators to mutiny over that. It’s arguably more serious than what Reddit is doing, because in many ways SE is the unsung backbone of technology at the moment. AI responses to technical questions are almost always wrong in some important way (even if the main idea is correct), and no moderator or group of moderators can be expected to have sufficiently broad knowledge to always know that an answer is wrong.

        • Garrathian@beehaw.org · 3 points · 2 years ago

          Yeah, that’s what I read; thanks for expounding. It’s definitely not good, as somebody who also uses Stack Overflow pretty regularly.

        • lightrush · 3 points · 2 years ago

          This is madness. Writing responses to queries on SO using AI trained on that same data, or often data of much lower quality, is like… a snake eating its tail while shitting diarrhoea. It would just decrease the signal-to-noise ratio on SO. Wait, did I just describe enshittification… fml