- cross-posted to:
- [email protected]
Millions of articles from The New York Times were used to train chatbots that now compete with it, the lawsuit said.
These models can still be trained on data they’re allowed to use, but I think what we’re seeing is that the better LLM services were probably trained on shocking amounts of private data, whereas the less performant ones probably don’t use stolen data.
Textbooks are a big one that I suspect we’ll see a set of suits over, particularly because they seem to be some of the most valuable training data.