It isn't.
Lawyers are like the demons in Frieren. They just say whatever they think will be advantageous to them, even if they have no clue what that actually means.
Just like LLMs
The resemblance is uncanny.
I’m glad the judge is really only considering the use of AI. Because it’s obviously not copyright infringement to train AI with whatever TF materials you want. It’s the output that matters.
Copyright infringement can’t happen at all until the copyrighted material gets distributed somehow. If you make a thousand copies of a song at home you have not violated copyright law. If—however—you share that song with someone else (or the entire Internet) you just violated the copyright of whoever owned that song (assuming it’s just a regular track and not something licensed in some special way).
Is it ethical to pirate a billion books to train AI? I don’t know. The companies never really intended for humans to read them. So—to me—it’s not that much different from the “making a thousand copies of a song at home” point. They haven’t deprived the authors of anything at that point.
When the AI is used it may violate an author’s copyright but only if it’s close enough to the original work that a judge would say, “yeah, that’s definitely derivative.”
So you think the people at fault are NOT the billion dollar corporations that stole from all of humanity to create and sell a for-profit product?
But instead the random people who use it?
Are you stupid or just a paid shill?
Ok let’s get this out of the way: Copying is not the same as stealing. Not in law or ethics.
So let me reword what you wrote to better represent what you’re saying:
So you think the people at fault are NOT the billion dollar corporations that copied much of humanity’s creative works into their servers to create and sell a for-profit product?
Let me ask you this: What is the actual consequence of copying something onto a computer? Loading it into RAM. Performing analysis on it. Doing whatever with that data, internally—without ever sharing it or creating a product or anything like that.
Imagine that AI doesn’t exist yet and some billion dollar company deems it worth their time to archive the entire Internet’s worth of copyrighted content. They don’t distribute. They don’t share it. They don’t even tell anyone.
What is the actual human consequence of that? There is none. No one was deprived of anything. They have misrepresented no one. They have not created anything at all. No one is reading it. No one is consuming it. It’s just sitting there—on a billion dollar corporation’s servers.
Now let’s change the scenario slightly: Suddenly Mega Corp decides to use it—internally. To analyse how all this content is related. They look at all the links and references within it in order to figure out how “cheese” related any given bit of content is. They announce the cheese search engine.
Is that a problem? They’re literally storing and indexing all the world’s content on their servers! They didn’t license it! They didn’t ask for permission!
What I’m saying—my argument in its purest form—is that it’s the use of the content that matters. How is it used? Is the use depriving someone of something? Do people lose access to cheese because of the existence of the cheese search engine?
Now let’s take it further: Mega Corp decides to transform the cheese data and allow people to request semi-random cheese recipes. Some of these recipes are nearly identical to patented and trademarked cheese products!
Do cheese makers now have a legal right to sue? Do they have an ethical argument to make?
Maybe.
What I’m saying is that merely collecting the data and screwing around with it is irrelevant. It’s not until that data is distributed somehow that it matters. Because until that point it’s just bits on a machine somewhere—not impacting anything.
But instead the random people who use it?
Yes! If I make oil paintings for a living and someone asks me to copy someone’s copyrighted work, it’s on me to make sure I don’t do that. Now think about it as a copier: Someone walks up to a Xerox machine and copies a book. Do we sue Xerox for providing that capability?
That’s what’s at stake here: Do we treat the AI like the artist or do we treat it like the Xerox machine?
A very worrying take by that judge, but it befits Trump’s America.
Market harm is supposed to mean the market for a particular work. For example, when everyone torrents a movie, that movie will plausibly sell fewer copies. That means there’s less economic incentive to produce movies. That directly undercuts the purpose of copyright.
Me, I think we might be better off without expensive movies, if the price is a censorship infrastructure.
This judge seems to understand market harm to mean that incumbents lose market share. Well, that can happen when new technologies arise. Copyright is constitutionally limited to encouraging new developments. No law in any sector provides for a right to a market share. To the contrary, attempting to secure such a right may be a felony under antitrust laws.
It never was, for any of them. They claim fair use under “research”, but the very next step after determining the category is to consider the commerciality. Their research is not an academic pursuit in the public interest and they don’t publish their research data (because that would be incriminating); the entire activity is commercial product development. Such a venture is very clearly and obviously not fair use.
Uh huh. They’re going after the first major proponent of open source AI.
Unless they fuck sam altman until he bleeds this is a bullshit court
Well, technically speaking, Llama is not really open source. Rumour has it Meta is only trying to get everyone to call it open source so they can have an advantage in the European market, which is a lot more restrictive for closed-source models than open-source ones: https://opensource.org/blog/metas-llama-license-is-still-not-open-source https://simonwillison.net/2025/Apr/19/llama-eu-ai-act/
But I do agree that all other big AI companies should be held to those same standards, and made to pay for every bit of prior work they use for training
Nah. That’s just propaganda. The copyright people make that stuff up to help their argument against fair use and to drown out the likes of the Internet Archive or the EFF.
In truth, these exceptions don’t matter for Meta. They also offer these models as chatbots; as a service. That brings back all those useless bureaucratic hoops. The exceptions would matter for small players, but the copyright industry has pretty much neutralized them anyway.
This is their most recent license, which is, yeah, a dick move, but they used to be more open.
Since the first Llama release, Meta has exclusively licensed its models as weights-available with commercial restrictions.
It is in no way open-source in the classic sense, nor has it ever been.