The thing is, AI is not a “who”, it’s a “what”. A license that permits human beings to read a thing may not be sufficient to authorize AIs to read a thing.
That’s what the legal system needs to thrash out: is feeding information to an AI a separate right that needs to be assigned specifically by contract or license, or is it a subset of the human right to read a published document that the human has legitimate access to? If it is separate, then any work outside the public domain that doesn’t specifically have an authorization for AIs attached is going to be off-limits as training data, because rights not specifically assigned are reserved unto the copyright holder.
Given the speed with which the law typically moves, it’s going to be years before we have an answer.
“Reading” is the act of viewing the unaltered text. It’s possible for an LLM to have text in its training set that no human has ever viewed in its original form. So no, an LLM is not equivalent to a pair of glasses in this case.
In effect, an LLM is a fancy database that spits back modified subsets of its stored information in response to queries. Does this mean that the original data was “stored in a retrieval system” without permission (specifically forbidden by British copyright law, I believe, and possibly others)? Is the LLM creating “unauthorized derivative works”, which are illegal in many countries? Just where is the line between a derivative work and a work “inspired by” another, but still legal, anyway? Is it the same in every jurisdiction?
I’m not a lawyer, and I have no idea how many worms are going to creep out of this can, but one thing I’m absolutely certain of is that it isn’t going to be simple to sort out, no matter how many people would like it to be or think that “common sense” is on their side.
. . . Wow. I’m going to be polite and assume that you never took physics in high school, instead of failing the unit on optics. Might want to bone up on that before you make an argument that deals with the physics of lenses, just sayin’.
You don’t appear to have much understanding of how the law operates either. It’s always complicated and difficult, and judges take a dim view of people who try to twist words around to mean something other than the contract defines them to mean.
Please explain to the lawyers in 1980 who wrote the contract for a published book what precisely generative AI, Twitter and the internet are, so they can be sure to account for their fair use in their contract… until five years ago none of us knew what this stuff would do. And I’d mention that Google Books has been pummeled by lawsuits for pretty much the same reason and ended up needing to pull almost all books from their free reading section.
There is a massive difference between AI tech in the 70s and today. The scale we’re able to achieve is orders of magnitude beyond what was dreamed of. These modern issues were expected to take much longer to arrive, which would have given the legal system more time to catch up. Our legal system can force a common baseline of behavior on our new technology, and that will be necessary to have a healthy balance of power.