I have some good PDF ebooks I’m willing to share, but I suspect the seller embeds tracking data in them to link them to my account: every time I download them from the official website they have a different hash while being visually identical. The same is true when I compare them against the copies a friend bought from the same seller. Since I don’t want to get banned, can you recommend a way to remove that stuff?

  • Shizu@lemmy.world · 1 year ago

    Why would the checksum differ between downloads if the watermark only contained user-identifiable data?

    • bionicjoey · 1 year ago

      Just checked one of my Paizo PDFs, and in addition to my account name and email address, the watermark also contains the date and time that I downloaded the PDF. Presumably that’s because they append the file creation time when the PDF is being signed.
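
To illustrate why a per-download timestamp alone is enough to change the checksum, here is a minimal sketch. It does not reflect how Paizo or any other seller actually stamps files; the `stamped_copy` helper, the stamp layout, and the placeholder bytes are all assumptions made up for the example. It only shows that two otherwise identical byte strings hash differently once a different download time is embedded.

```python
import hashlib


def stamped_copy(content: bytes, account: str, downloaded_at: str) -> bytes:
    """Return content with a hypothetical per-download stamp appended.

    The stamp format is invented for illustration; real watermarks
    are stored differently and are not modeled here.
    """
    stamp = f"\n% account={account} downloaded={downloaded_at}\n".encode("utf-8")
    return content + stamp


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


# Placeholder bytes standing in for the (identical) book content.
book = b"%PDF-1.7 placeholder bytes standing in for the book content"

# Same account, same content, downloads five minutes apart.
first = stamped_copy(book, "user@example.com", "2024-01-01T10:00:00Z")
second = stamped_copy(book, "user@example.com", "2024-01-01T10:05:00Z")

same_hash = sha256_hex(first) == sha256_hex(second)
print(same_hash)  # False: the differing timestamp changes the digest
```

So differing hashes between two downloads from the same account are consistent with a timestamp in the stamp, and do not by themselves say anything about what else is embedded.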

      • Shizu@lemmy.world · 1 year ago

        Fair, then reprinting won’t help. I’d write a Python script that exports all pages as PNG, edits that specific portion of every image, and recompiles them into a PDF. I’m not sure if there is a tool that can already do that out of the box.

        • bionicjoey · 1 year ago

          Unfortunately, you then lose things like text and links. I think the only real solution for my specific example (which, to be clear, might not be OP’s dilemma) is to crack and directly edit the binary data of the PDF file.