I’m looking at my client, and have a few torrents that are between 99 and 99.8% finished and have been stuck like that for weeks. Anyone know of a good tool, AI or otherwise, that can “fill in” the missing bits? Wouldn’t that be cool?

  • Max-P@lemmy.max-p.me
    24 days ago

    No. It could repair some files to make them playable, maybe, by extrapolating from the sections before and after, like a couple of seconds missing here and there in a movie, but all bets are off as to whether it’ll guess right. I’m not aware of such a tool existing.

    But if it’s a zip file, there’s no chance it can fix it. It’s much different than AI upscaling, because you don’t just need to find an answer that’s close enough, you need the exact bits because even one value off could mean the gravity of the whole game is off, as an example. If some files are encrypted then all bets are off, as that would imply breaking encryption.

    Also I’d look at what’s the missing data. Sometimes you can be stuck at 99% because the only seeder left didn’t download a readme file or something but the whole content is there.

    • electric_nan@lemmy.mlOP
      24 days ago

      Yeah, these are video files, and they’re missing random bits within the file. I was just thinking that with such small gaps it wouldn’t be too hard to extrapolate something close to the missing data, and just “make them playable” as you said.

  • Randomgal
    24 days ago

    At first I was going to say “wtf this is stupid.” But is it though? Would it be possible for a sufficiently trained AI to just rawdog a bunch of binary? Idk to be honest.

    • Nollij@sopuli.xyz
      24 days ago

      You’d be comparing your guess to a hash, while also following some extra rules on what the data could be. Once the data is longer than the hash, duplicates are guaranteed to exist (and collisions become likely much sooner). In torrent v1, the hash is SHA-1, which is 160 bits (20 bytes). That means every additional random bit beyond those 160 doubles the number of possible matches.
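      For anyone who wants to see the check itself, here is a minimal sketch of per-piece verification, assuming a v1-style 20-byte SHA-1 digest. The helper name `piece_matches` and the sample data are made up for illustration; a real client reads the expected digests from the torrent metadata.

```python
import hashlib


def piece_matches(piece: bytes, expected_digest: bytes) -> bool:
    """Return True if SHA-1(piece) equals the expected 20-byte digest."""
    return hashlib.sha1(piece).digest() == expected_digest


# Stand-in for a real piece (assumption: 256 KiB of sample data).
original = bytes(range(256)) * 1024
expected = hashlib.sha1(original).digest()  # 20 bytes = 160 bits

# A "close enough" guess: identical except for a single flipped bit.
guess = bytearray(original)
guess[1000] ^= 0x01

print(len(expected))
print(piece_matches(original, expected))
print(piece_matches(bytes(guess), expected))
```

      The point of the sketch is that the check is all or nothing: a guess that differs by one bit is rejected just like a guess that is completely wrong.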

      If your torrent has an uncommonly small chunk size of 256 KiB, that’s 262,144 bytes. Subtract the 20 from above, and you likely have around 256^262124 chunks that match your hash. That’s a number so large that Google calls it infinity. It would take you forever just to generate these chunks by brute force, since each would need to be created, then hashed, then the results stored somewhere. Many years ago, I remember someone doing this with CRC32 (32 bits/4 bytes) and 6-byte files. It took all night, and produced dozens of hash-matching files. You’re talking many orders of magnitude bigger.
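      The arithmetic can be checked with a few lines. This is only a counting sketch, assuming a 256 KiB piece (262,144 bytes) and a 160-bit hash that spreads pieces evenly over its outputs; the variable names are mine.

```python
import math

PIECE_BYTES = 256 * 1024  # 256 KiB piece
HASH_BYTES = 20           # SHA-1 digest size

piece_bits = PIECE_BYTES * 8
hash_bits = HASH_BYTES * 8

# Average number of distinct pieces per digest value:
# 2^(piece_bits - hash_bits) == 256^(PIECE_BYTES - HASH_BYTES)
exponent_base2 = piece_bits - hash_bits
exponent_base256 = PIECE_BYTES - HASH_BYTES

# Number of decimal digits in 2^exponent_base2, without building the number.
decimal_digits = math.floor(exponent_base2 * math.log10(2)) + 1

print(f"pieces per digest ~ 256^{exponent_base256} = 2^{exponent_base2}")
print(f"that number has {decimal_digits} decimal digits")
```

      The count of matching chunks has over 600,000 decimal digits, so enumerating them is not an option regardless of hardware.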

      But then what? You’d still need to apply the other rules on what the data could be. Rules that are probably more CPU-intensive than the hash algorithm.

      The one trick that AI might be able to use to save the day is that it may contain in its corpus the original file. In effect, that would make the AI an unlikely seeder.

      • Randomgal
        24 days ago

        Yeah. That’s the same thing I’m thinking. Compiled binary code is unreadable to us, but it’s not random. It’s deterministic, so an AI should be able to complete it? Maybe?

  • egsaqmojz@lemmy.ml
    24 days ago

    basically you’re asking the AI to generate chunks, then check those chunks. it might work

    edit: sorry, I didn’t provide a solution
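    A toy version of "generate chunks, then check those chunks" could look like the sketch below. It is scaled down on purpose: the missing chunk is 2 bytes and the hash is only the first byte of SHA-1 (`toy_digest` is an invented stand-in, not anything a torrent client uses), so the whole search space is 65,536 candidates.

```python
import hashlib
from itertools import product


def toy_digest(data: bytes) -> bytes:
    """First byte of SHA-1: a deliberately weak 8-bit stand-in for a piece hash."""
    return hashlib.sha1(data).digest()[:1]


original = b"hi"  # the 2-byte "missing chunk" we pretend not to know
target = toy_digest(original)

# Generate every possible 2-byte chunk and keep the ones whose digest matches.
matches = [
    bytes(candidate)
    for candidate in product(range(256), repeat=2)
    if toy_digest(bytes(candidate)) == target
]

print(len(matches))         # number of chunks that pass the check
print(original in matches)  # the real chunk is among them, but not identifiable
```

    Even at this size the check passes for many candidates besides the real one, and nothing in the hash says which is correct. With real piece sizes the search cannot even be enumerated.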