• dual_sport_dork 🐧🗡️@lemmy.world
      5 months ago

      Index the content of the ads, identify it, and drop that data from the served video file? There may be a more clever solution, but that’d definitely work. It should be possible to checksum or just straight up store the data for the first couple of kilobytes of video data that would uniquely identify each ad.
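
A minimal sketch of the checksum idea, assuming each ad arrives as its own chunk and that its first few kilobytes are byte-identical every time it is served (a point disputed further down the thread). `PREFIX_BYTES`, `build_ad_index` and `drop_ads` are hypothetical names made up for illustration, not part of any real blocker.

```python
import hashlib

PREFIX_BYTES = 4096  # assumption: "the first couple of kilobytes" of each chunk


def prefix_digest(data: bytes, n: int = PREFIX_BYTES) -> str:
    """Return the SHA-256 hex digest of the first n bytes of a chunk."""
    return hashlib.sha256(data[:n]).hexdigest()


def build_ad_index(known_ads: list[bytes]) -> set[str]:
    """Index known ads by the digest of their leading bytes."""
    return {prefix_digest(ad) for ad in known_ads}


def drop_ads(segments: list[bytes], ad_index: set[str]) -> list[bytes]:
    """Keep only the segments whose leading bytes are not in the ad index."""
    return [seg for seg in segments if prefix_digest(seg) not in ad_index]
```

Storing a digest instead of the raw prefix keeps the shared database small, at the cost of only ever matching exact byte-for-byte copies.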

      YouTube obviously must have a finite rota of ads which they can display, so eventually they’d all get identified, although you’d be playing whack-a-mole forever as they release new ones. Isn’t SponsorBlock partially crowdsourced anyway?

      This would be challenging and fairly expensive, but worth it if you were motivated by sufficient spite.

      • atocci@lemmy.world
        5 months ago

        They say the ad is being integrated straight into the video stream on the server side though. It won’t be its own identifiable piece of data on the client side anymore.

        • dual_sport_dork 🐧🗡️@lemmy.world
          5 months ago

          Yes it will? The video stream is handed from the server to your browser or device. Once it arrives, your machine can do whatever it likes with it, up to and including deliberately ignoring part of the data. And since YouTube videos are buffered, your client can skip to whatever part of the video is past the ad, provided it’s been buffered that far.

          • atocci@lemmy.world
            5 months ago

            But how? Unless I’m misunderstanding how video encoding is done, you shouldn’t be able to reliably identify what’s an ad vs what’s actual video once it starts getting mixed together. The ad will be encoded differently for every video it’s inserted into.

            I could be completely wrong about this, but the same ad clip’s data should end up looking completely different depending on any number of things.

            • dual_sport_dork 🐧🗡️@lemmy.world
              5 months ago

              Most encoders are deterministic, including the ones for the VP8/VP9 codecs that YouTube uses. I imagine they could deliberately insert some manner of randomization in there if they really wanted to, and if they intend to carry through with this plan they may have to. But the same input with the same encoder (and settings) should produce the same output every time, at least if you begin counting from a keyframe.

              Even if it can’t be identified on a binary level with clever tactics, which I think it will be unless they do some kind of picture-in-picture thing, it should be trivial with current hardware to identify it even with a fairly crude optical recognition system and a database. I.e., sample N points on the output and gauge the average RGB data for each over a couple of frames, and if that matches the entry for the ad in our crowdsourced database, skip ahead X seconds based on the database. Even better if you did it on the keyframes.
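
A crude sketch of that sampling idea, assuming decoded frames are available as rows of `(R, G, B)` tuples. The grid size, the tolerance and the shape of the database entries (`"signature"`, `"duration"`) are all assumptions for illustration.

```python
Frame = list[list[tuple[int, int, int]]]  # rows of (R, G, B) pixels


def sample_points(width: int, height: int, grid: int = 4) -> list[tuple[int, int]]:
    """Return grid*grid evenly spaced (x, y) sample positions."""
    return [
        ((2 * i + 1) * width // (2 * grid), (2 * j + 1) * height // (2 * grid))
        for j in range(grid)
        for i in range(grid)
    ]


def signature(frames: list[Frame], grid: int = 4) -> list[float]:
    """Average each colour channel at every sample point over the frames."""
    height, width = len(frames[0]), len(frames[0][0])
    sig = []
    for x, y in sample_points(width, height, grid):
        for channel in range(3):
            total = sum(frame[y][x][channel] for frame in frames)
            sig.append(total / len(frames))
    return sig


def matches(sig_a: list[float], sig_b: list[float], tolerance: float = 8.0) -> bool:
    """True if every averaged channel differs by at most `tolerance`."""
    return len(sig_a) == len(sig_b) and all(
        abs(a - b) <= tolerance for a, b in zip(sig_a, sig_b)
    )


def seconds_to_skip(frames: list[Frame], ad_db: list[dict], grid: int = 4) -> float:
    """Return the stored ad length if the frames match a known ad, else 0."""
    sig = signature(frames, grid)
    for entry in ad_db:
        if matches(sig, entry["signature"]):
            return entry["duration"]
    return 0.0
```

The tolerance is there because re-encoding shifts pixel values slightly. Set it too loose and ordinary content starts matching ads.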

              Doing it based off of the audio of the ad should be even easier, since acoustic fingerprinting is a pretty cheap technology to implement these days.
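
A toy stand-in for the audio approach. Real fingerprinting systems work on spectral features. This sketch only records whether loudness rose or fell from one window to the next, and it assumes the ad starts on a window boundary. `energy_bits` and `find_ad` are hypothetical names.

```python
def energy_bits(samples: list[float], window: int = 256) -> list[int]:
    """One bit per window: 1 if energy rose compared with the previous window."""
    energies = [
        sum(s * s for s in samples[i:i + window])
        for i in range(0, len(samples) - window + 1, window)
    ]
    return [1 if b > a else 0 for a, b in zip(energies, energies[1:])]


def find_ad(stream_bits: list[int], ad_bits: list[int], max_errors: int = 2):
    """Return the window offset where ad_bits matches best, or None."""
    best_offset, best_errors = None, max_errors + 1
    for offset in range(len(stream_bits) - len(ad_bits) + 1):
        # Hamming distance between the ad and this slice of the stream.
        errors = sum(a != b for a, b in zip(stream_bits[offset:], ad_bits))
        if errors < best_errors:
            best_offset, best_errors = offset, errors
    return best_offset
```

Multiplying the returned offset by the window size gives the sample position at which the ad begins, so the client knows where to start skipping.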

              The other question will be if YouTube is dumb enough to always insert the same type of ads in the same place in each video, which they may be at least to start with, so a very simple table of “skip X amount of time at Y timecode on Z video” would be feasible. Or even better, if they hard insert the ads into the video to save on processing time, such that they never change. Are they going to try to insert ads and encode video to serve to individual users in real time? Doubt it. That’d be bonkers. YouTube already chews on uploaded videos for sometimes upwards of an hour before having them ready to serve… I don’t think they’re ready to commit to and pay for the compute power to try to pull a stunt like this in real time.
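
The “skip X at Y on Z” table could be as simple as the sketch below. The video ID and the timings are invented, and it only works if ads stay at fixed positions.

```python
# Hypothetical crowdsourced table: video ID -> list of (start, length) in seconds.
SKIP_TABLE = {
    "video-z": [(0.0, 15.0), (300.0, 30.0)],
}


def next_position(video_id: str, position: float, table: dict = SKIP_TABLE) -> float:
    """If `position` falls inside a known ad, return the time just past it."""
    for start, length in table.get(video_id, []):
        if start <= position < start + length:
            return start + length
    return position
```

A player would call `next_position` on every time update and seek whenever the result differs from the current position.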

              All of this is going to require some manner of crowdsourcing, unless we get really good at using AI against them or something (which’d be immensely satisfying, come to think of it).

            • El Barto@lemmy.world
              5 months ago

              If a song can be fingerprinted (e.g. Shazam), so can ads. Even when they’re part of a larger video.

            • Voyajer@lemmy.world
              5 months ago

              Twitch does the same thing but you can still circumvent it. Worst case users may need a VPN to a country that doesn’t have many ads.

          • deweydecibel@lemmy.world
            5 months ago

            What part of the data?

            The whole point of this is they want to meld the ad data with the content in such a way that there are no identifiers anymore.

            If what you’re suggesting were possible, they wouldn’t be bothering with this.

            • dual_sport_dork 🐧🗡️@lemmy.world
              5 months ago

              Define “meld.”

              If they’re encoding the ads and the content into the same video stream, which appears to be the proposal, your client still has access to the entire video stream and in fact must do so in order to play it.

              Even if you’re not going to be able to identify an ad on the raw binary level, and my proposal to do that was just spitballing anyway, the world is just absolutely chock-a-block full of audio and video content identification technologies that could be co-opted to identify specific ads, at which point your client could simply not play the section of the video stream containing them.

            • El Barto@lemmy.world
              5 months ago

              “If what you’re suggesting were possible, they wouldn’t be bothering with this.”

              You’re giving Google waaaay too much credit.

              They tried other methods prior to this, and failed. So they thought those methods were effective, and they totally bothered implementing them.

      • subtext@lemmy.world
        5 months ago

        Except with AI, what’s to stop the advertisers from dynamically generating ads on the fly that are just ever so slightly different from the original, so as to throw off this kind of blocking?

      • Auli
        5 months ago

        This is not feasible.

      • Aux@lemmy.world
        5 months ago

        Do you really want your ad blocker to do resource-intensive image detection over a video stream in real time? Your PC will start fucking fuming.

        • Kushan@lemmy.world
          5 months ago

          This is exactly what will happen.

          Diff the same video a few times and you’ll be able to figure out which is injected content and which isn’t.

          Separate out the injected content and you can fingerprint that content like how Plex or Emby fingerprints intros to TV shows (i.e. it’s a solved and known problem).

          Then you can reliably identify the injected content, you know how long it is and can just tell the client to skip it.

          This won’t be easy, it’ll require more than folks indexing ad content but it’s feasible.
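
The diffing step above can be sketched as follows, assuming each fetch of the same video can be reduced to an ordered list of per-segment digests and that the real content hashes the same every time. `split_injected` is a hypothetical helper.

```python
def split_injected(fetches: list[list[str]]) -> tuple[list[str], list[str]]:
    """Separate segments common to every fetch from those that vary."""
    # Segments present in every fetch are treated as the real content.
    common = set(fetches[0]).intersection(*fetches[1:])
    content = [seg for seg in fetches[0] if seg in common]
    # Anything seen in only some fetches is treated as injected.
    injected = sorted({seg for fetch in fetches for seg in fetch} - common)
    return content, injected
```

One caveat: an ad that happens to be injected into every single fetch would be misclassified as content, so more fetches (ideally from different users) make the split more reliable.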

        • ByteWelder@lemmy.ml
          5 months ago

          Users can mark videos and submit that content. Users can vote on other users’ marking of content. It won’t work if YT streams the ads in and randomly changes the timestamp at which the ad(s) start.
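
A rough sketch of the mark-and-vote scheme, not how any existing project actually stores its data. The `Marking` fields and the `min_votes` threshold are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Marking:
    video_id: str
    start: float    # seconds
    end: float      # seconds
    votes: int = 0  # upvotes minus downvotes


def vote(marking: Marking, up: bool) -> None:
    """Record one user's vote on another user's marking."""
    marking.votes += 1 if up else -1


def accepted(markings: list[Marking], video_id: str, min_votes: int = 2) -> list[Marking]:
    """Return the markings for a video that have enough net upvotes, by start time."""
    return sorted(
        (m for m in markings if m.video_id == video_id and m.votes >= min_votes),
        key=lambda m: m.start,
    )
```

Because each marking is stored as fixed start and end times, it stops being useful the moment the ad position changes between viewers.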

          • El Barto@lemmy.world
            5 months ago

            Yup! Oh, I know how sponsorblock does it, but the question was more about highlighting that it’s theoretically possible. Unless they do what you describe.