• @[email protected]
    23
    8 months ago

    I don’t care what works a neural network gets trained on. How else are we supposed to make one?

    Should I care more about modern eternal copyright bullshit? I’d feel more nuance if everything a few decades old was public-domain, like it’s fucking supposed to be. Then there’d be plenty of slightly-outdated content to shovel into these statistical analysis engines. But there’s not. So fuck it: show the model absolutely everything, and the impact of each work becomes vanishingly small.

    Models don’t get bigger as you add more stuff. Training only twiddles the numbers in each layer. There are two-gigabyte networks that have been trained on hundreds of millions of images. If you tried to store those images verbatim in a file that size, each one would get barely a dozen bytes. And the network gets better as that number goes down.
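
    To put rough numbers on that, here is a minimal sketch of the division. The image counts are assumptions picked to span "hundreds of millions", not figures for any particular model, and `bytes_per_image` is just an illustrative helper name.

```python
def bytes_per_image(model_bytes: int, image_count: int) -> float:
    """Average share of the model file per training image, in bytes."""
    if image_count <= 0:
        raise ValueError("image_count must be positive")
    return model_bytes / image_count


# "Two-gigabyte network", taken here as decimal gigabytes (an assumption).
MODEL_BYTES = 2 * 10**9

# Hypothetical dataset sizes covering "hundreds of millions of images".
for image_count in (100_000_000, 170_000_000, 500_000_000):
    share = bytes_per_image(MODEL_BYTES, image_count)
    print(f"{image_count:>11,} images -> {share:.1f} bytes each")
```

    Anywhere in that range, the per-image share is a handful of bytes, and it only shrinks as the training set grows.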

    The entire point is to force the distillation of high-level concepts from raw data. We’ve tried doing it the smart way and we suck at it. “AI winter” and “good old-fashioned AI” were half a century of fumbling toward the acceptance that we don’t understand how intelligence works. This brute-force approach isn’t chosen for cost or ease or simplicity. This is the only approach that works.

    • @[email protected]
      English
      3
      8 months ago

      Models don’t get bigger as you add more stuff.

      They will get less coherent and/or “forget” the earlier data if you don’t scale the parameter count along with the training set.

      There are two-gigabyte networks that have been trained on hundreds of millions of images

      You can take a huge TIFF image, put it through JPEG with the quality cranked all the way down, and get a tiny file out the other side, which is still a recognizable derivative of the original. LLMs are an extremely lossy compression of their training set.

      • @[email protected]
        4
        8 months ago

        which is still a recognizable derivative of the original

        Not in twelve bytes.

        Deep models are a statistical distillation of a metric shitload of data. Smaller models with more training on more data don’t get worse, they get more abstract - and in adversarial uses they often kick big networks’ asses.

        • @[email protected]
          1
          8 months ago

          No, this will benefit capitalism and the wealthiest people the most. The rest of us will suffer because of this. People can only think of the positives of AI and never the negatives. This is weed all over again.

          • @[email protected]
            6
            8 months ago

            Motivation to discuss anything with you goes flying out the window if you think ending marijuana prohibition is anything but positive for the common people. And you’re going to drop that turd in a completely unrelated punchbowl.

            • @[email protected]
              -3
              edit-2
              8 months ago

              Marijuana is always characterized by its positives, and people always forget the negatives in every conversation. This is the exact same shit. Weed shouldn’t even be illegal, but those dumb racist white men in the 60s-80s with their paranoia decided to outlaw it. Fuck, the very doctors and psychologists who “analyzed” it said everything was bullshit, so they had professionals too, you dumbass. I’m not getting into racist history with you, but take my first sentence as the argument.