AI and legal experts told the FT this “memorization” ability could have serious ramifications on AI groups’ battle against dozens of copyright lawsuits around the world, as it undermines their core defense that LLMs “learn” from copyrighted works but do not store copies.
Sam Altman would like to remind you that each Old Lady at a Library consumes 284 cubic feet of Oxygen a day from the air.
Also, hey at least they made sure to probably destroy the physical copy they ripped into their hopelessly fragmented CorpoNapster fever dream, the law is the law.



Doesn’t this just mean they copied the original text, and still managed to get some of it wrong?
They don’t copy the book and store the words in a database or anything. LLMs don’t have a brain or storage.
They copy it, convert the pieces into numbers (vectors), and mathematically reconstruct it when you ask it a question.
Since it’s reconstructing it (with math), it hallucinates and gets it wrong…
I like this way of thinking about it, but I would scare-quote that “hallucinates.” It’s more like it’s been encrypted, and then decrypted with an imperfect algorithm. Or like a lossy compression and decompression.
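To make the lossy round-trip analogy concrete, here’s a toy sketch in Python. Everything in it (the text, the vocabulary, the “compression scheme”) is made up for illustration; it has nothing to do with real LLM internals, it just shows how a lossy compress-then-decompress step can hand back a confidently wrong reconstruction:

```python
# "Compress" a sentence by keeping only the first two letters of each word,
# then "decompress" by guessing the full word from a small vocabulary.
text = "the quick brown fox"
compressed = [w[:2] for w in text.split()]  # ['th', 'qu', 'br', 'fo']

# Invented vocabulary; the order determines which guess wins a tie.
vocab = ["the", "quiet", "broom", "fox", "quick", "brown"]

def decompress(prefix):
    # First vocabulary word matching the prefix wins; ambiguous prefixes
    # produce plausible-but-wrong words, i.e. "hallucinations".
    return next((w for w in vocab if w.startswith(prefix)), prefix)

restored = " ".join(decompress(p) for p in compressed)
print(restored)  # -> "the quiet broom fox"
```

The reconstruction comes back fluent and mostly right, with the errors baked in by what was thrown away during compression.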
We have a mathematical understanding of these things. It’s not a mysterious thing like the human brain still is for science. Personification of them is an unfortunate side effect of the fact that they’re designed to emulate human intelligence and use natural language in a sort of “conversation.” It does more to obfuscate their real nature than it does to explain them.
This, and lossy compression is exactly right.
Alternatively, it’s a decomposition of a big matrix (think very large Excel sheet) wherein each cell is the probability of observing some word next (really it’s tokens, of course, but for the sake of argument) given the words you’ve already observed. Like, you could literally make a transformer in Excel. It wouldn’t run, but that’s Excel’s fault, not the math’s.
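To make the big-matrix idea concrete, here’s a toy sketch in Python. The words and probabilities are invented for illustration; a real model’s matrix is learned from data and astronomically larger, but the “given this word, what’s likely next” lookup is the same shape of thing:

```python
# Toy "big matrix": each row is a context word, each column entry is the
# (invented) probability of seeing another word next.
probs = {
    "the": {"cat": 0.6, "mat": 0.4},
    "cat": {"sat": 1.0},
    "sat": {"the": 1.0},  # "sat on the" collapsed for brevity
}

def most_likely_next(word):
    # Pick the highest-probability next word, like greedy decoding.
    row = probs.get(word, {})
    return max(row, key=row.get) if row else None

print(most_likely_next("the"))  # -> cat
```

Chain those lookups and you get text generation; nothing in the table is a stored copy of any sentence, yet the sentences it was built from tend to fall back out.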
Aside: but I’m pretty sure distributing a lossy compression and decompression of a work is still distribution, and charging for it is infringement on top of that. Realistically, if this is allowed, anyone should be able to legally pirate anything for any reason as long as it’s passed through a lossy compression and decompression first.
Yeah, there isn’t much of a difference, as far as how the data is transformed, between your pirating case and the case of an AI providing copyrighted material. It’s really only because they treat it like an artificial person that they’re able to convince people it should be allowed.
The kick in the teeth is, if I charged people to hear me recite a copyrighted novel that I’d memorized but don’t have explicit permission to use, I’d be sued. There’s really no way to argue this should be allowed that doesn’t immediately fall apart if you pull at it even a little.
I didn’t cheat on you, I just didn’t realize I was making love to an entirely different woman! They are different OK!!!
That’s an interesting question. Think of the Star Trek holodeck. If someone creates a perfect holodeck recreation of their own partner, and sleeps with that simulation, is that cheating on their partner? Let’s assume it’s not one of those fancy sentient holograms like the Doctor, just a regular mindless one.
What if they are the Doctor and have sex with a ghost?
That’s just a good old Blazin’ time
I prefer Blazin’ with Bev’
Bev is implied when one is already Blazin’
Eh I would say it’s masturbating to a “picture” of their partner. It’s just a sexy light show. As long as it’s not sentient it can never have feelings back so it’s just a sex toy. Ever hear of a clone-a-willy?
As with a picture, the important part is consent. Was the picture/3D model created with informed consent from the partner that it might reasonably be used for masturbation? If so, then not cheating. Otherwise it is.