Google Says It'll Scrape Everything You Post Online for AI

misk@lemm.ee · 1 year ago

Pete Hahnloser@beehaw.org · 1 year ago

People who are alive can have a company steal their entire corpus without recompense, while the descendants of people who died decades ago can still get paid for content created by their ancestors.

Right.

Peanut@sopuli.xyz · 1 year ago

But how else could Disney afford to own everyone else’s rights and properties? Why not think about the little guy! (Mickey Mouse is little, right?)

That being said, I find it weird that people are going after training data for LLMs after completely ignoring the models built specifically to compete with and take advantage of people’s unconscious habits and lifestyles.

AI in general will be very important to comfortably survive the near future as a species. Data is an important part of that.

We absolutely need to do something about the megacorps funneling every new gain as a society into increasing the already absurd wealth divide. The technology is good. The general web scraping isn’t bad if the tool is not specifically evil in function. We just need, as a global community, to demand that the technology be used to benefit everyone equally as it continues to be developed.

alcasa@lemmy.sdf.org · 1 year ago

Glad that I can contribute to making the next Google Bard even dumber

Zapp@beehaw.org · edit-2 1 year ago

Yeah. Now the stupidity I post online has a purpose.

Someday a T-800 will be closing in on a freedom fighter, but will have an intrusive thought interrupt it at a key vulnerable moment. And that intrusive thought will be some random pun we posted to DadJokes. You’re welcome, future freedom fighters.

Rentlar@beehaw.org · 1 year ago

I, as the proprietor of my comments, condone Google AI scraping my publicly shared content for their own use, on the condition that they condone scraping of their publicly accessible content including YouTube videos. :P

Thomas Gray@lemmy.dbzer0.com · 1 year ago

Google is going to continue boiling the frog until everyone using Gmail, YT, Drive, etc… is paying subscriptions for access to these services. It’s going to be interesting to see how much people are willing to pay to hold on to a Gmail account they’ve been using for 20 years. I should buy Alphabet stock now.

CreativeTensors@beehaw.org · 1 year ago

I just kind of assumed that they, as well as anyone else in the space, were doing that already.

Whether that means that we all collectively have ownership over the outputs of these models if they’re trained on content that we produced over the years is another thing. As someone who uses AI tools a fair bit I would be totally fine with generated content being public domain unless a threshold for human intervention is met.

That threshold is where the messy legal work lies.

YuzuDrink@beehaw.org · 1 year ago

Would maybe be funny if a law were passed saying that you could only charge people for access to your AI content if you can prove that their own content wasn’t used to help train the AI…

MJBrune@beehaw.org · 1 year ago

This is absolutely not the case and absolutely illegal. How their lawyers allowed this is insane and some government body needs to smack down Google with a real penalty. Even scraping AGPL’ed code would technically require them to AGPL their entire AI as it should be seen as a derived work in the courts. How could an AI scrape and utilize something, creating works based on the code taken, and not be seen as derived? It’s insane.

abhibeckert@beehaw.org · edit-2 1 year ago

How their lawyers allowed this is insane

I’m pretty sure Google’s legal team knows a thing or two about copyright law. If they think this is fair use, then I’m inclined to believe it might be.

ilmagico@beehaw.org · 1 year ago

They just think that having tons of money to throw at a potential lawsuit means nobody will dare sue them.

Ashley · 1 year ago

Are they wrong?

AndrewZabar@beehaw.org · 1 year ago

It’s that they know that it’s more profitable to seek forgiveness than to ask permission.

Get sued? Hah; okay see you in court in ten years. Meanwhile, profit. They can do it over and over and it will always be beneficial.

There need to be SEVERE sanctions for these violations. Like in the millions. That’s the only way they’d stop. They just don’t care at all.