AI and legal experts told the FT this “memorization” ability could have serious ramifications on AI groups’ battle against dozens of copyright lawsuits around the world, as it undermines their core defense that LLMs “learn” from copyrighted works but do not store copies.

Sam Altman would like to remind you that each Old Lady at a Library consumes 284 cubic feet of oxygen a day from the air.

Also, hey at least they made sure to probably destroy the physical copy they ripped into their hopelessly fragmented CorpoNapster fever dream, the law is the law.

  • Archangel1313
    arrow-up
    30
    ·
    1 day ago

    Doesn’t this just mean they copied the original text, and still managed to get some of it wrong?

    • VitoRobles@lemmy.today
      English
      arrow-up
      27
      ·
      1 day ago

      They don’t copy the book and store the words in a database or anything. LLMs don’t have a brain or storage.

      They copy it, convert pieces into numbers for its vector database, and mathematically reconstruct it when you ask it a question.

      Since it’s reconstructing it (with math), it hallucinates and gets it wrong…

      • lectricleopard@lemmy.world
        arrow-up
        16
        ·
        1 day ago

        I like this way of thinking about it, but I would scare-quote that “hallucinates.” It’s more like it’s been encrypted, and then decrypted with an imperfect algorithm. Or like a lossy compression and decompression.
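
The lossy compression and decompression analogy can be sketched in a few lines. This is only a toy illustration of a lossy round trip, not how an LLM stores anything; the function names `compress` and `decompress` and the `step` parameter are made up for the example.

```python
def compress(values, step=10):
    """Lossy step: keep only which multiple of `step` each value is closest to."""
    return [round(v / step) for v in values]


def decompress(codes, step=10):
    """Rebuild approximate values; the detail discarded by compress() is gone."""
    return [c * step for c in codes]


original = [3, 14, 17, 92, 66, 38]
restored = decompress(compress(original))

# Same length and roughly the same shape, but not the same numbers.
print(original)
print(restored)
```

The restored list is close to the original but not identical, which is the "gets some of it wrong" behaviour being described.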

        We have a mathematical understanding of these things. It’s not a mysterious thing like the human brain still is for science. Personification of them is an unfortunate side effect of the fact that it’s designed to emulate human intelligence and uses natural language in a sort of “conversation.” It does more to obfuscate the real nature of them than it does to explain them.

        • AliasAKA@lemmy.world
          English
          arrow-up
          11
          ·
          1 day ago

          This, and lossy compression is exactly right.

          Alternatively, it’s a decomposition of a big matrix (think a very large Excel sheet) wherein each cell is the probability that you observe a given word (really it’s tokens, of course, but for the sake of argument) given that you’ve observed other words. Like, you could literally make a transformer in Excel. It wouldn’t run, but that’s Excel’s fault, not the math.
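
The "big spreadsheet of word probabilities" picture can be sketched with a bigram table. This is a deliberately tiny stand-in, far simpler than a transformer, and the names `build_table` and `reconstruct` and the sample sentence are hypothetical, chosen only for illustration.

```python
from collections import Counter, defaultdict


def build_table(text):
    """Each row is a word; each cell is P(next word | current word)."""
    words = text.split()
    counts = defaultdict(Counter)
    for cur, nxt in zip(words, words[1:]):
        counts[cur][nxt] += 1
    table = {}
    for cur, nxt_counts in counts.items():
        total = sum(nxt_counts.values())
        table[cur] = {w: n / total for w, n in nxt_counts.items()}
    return table


def reconstruct(table, start, length):
    """Greedily follow the most probable next word from the table."""
    out = [start]
    while len(out) < length and out[-1] in table:
        row = table[out[-1]]
        out.append(max(row, key=row.get))
    return " ".join(out)


corpus = "the cat sat on the mat and the cat ran"
table = build_table(corpus)
rebuilt = reconstruct(table, "the", 10)

print(table["the"])  # probabilities of each word that followed "the"
print(rebuilt)       # regenerated from the table alone, not from the stored text
```

The table never holds the sentence itself, yet the regenerated text reuses long stretches of it and drifts where the probabilities point elsewhere.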

          Aside: I’m pretty sure distributing a lossy compression and decompression algorithm counts as distribution, and charging for it does too. Realistically, if this is allowed, anyone should be able to pirate anything for any reason legally as long as it’s passed through a lossy compression and decompression first.

          • lectricleopard@lemmy.world
            arrow-up
            7
            ·
            1 day ago

            Yeah, there isn’t much of a difference, as far as how the data is transformed, between your pirating case and the case of an AI providing copyrighted material. It really is only because they treat it like an artificial person that they are able to convince people it should be allowed.

            The kick in the teeth is, if I charged people for me to recite a copyrighted novel that I memorized but don’t have explicit permission to use, I’d be sued. There really is no way to argue this should be allowed that doesn’t immediately fall apart if you pull at it even a little.

    • supersquirrel@sopuli.xyzOP
      arrow-up
      14
      arrow-down
      1
      ·
      1 day ago

      I didn’t cheat on you, I just didn’t realize I was making love to an entirely different woman! They are different OK!!!

      • WoodScientist@lemmy.world
        arrow-up
        5
        ·
        1 day ago

        That’s an interesting question. Think of the Star Trek holodeck. If someone creates a perfect holodeck recreation of their own partner, and sleeps with that simulation, is that cheating on their partner? Let’s assume it’s not one of those fancy sentient holograms like the Doctor, just a regular mindless one.