See linked posting. I’ve commented there with a link to a CLI tool in Python that allows downloading of IA collections. I’ve submitted a patch to enable specifying start and end points so that it’s easier to resume downloading a huge collection, or to allow multiple people to split up the work.

https://archive.org/details/georgeblood

https://archive.org/details/78rpm_bowling_green

F*ck the RIAA and absurdly long copyright.


EDIT: There is more than one collection of 78s on IA, so I updated the title.


The issue with these collections is that they’re absolutely HUGE. And yes, IA offers torrents for them, but as a separate torrent for every. single. album. And the torrents have all data in them – FLAC, fixed-rate MP3, VBR MP3, PDF liner notes, etc. etc… There may be some extremely hardcore data-hoarders out there who want everything, but IMHO, as these are scratchy old 78 records, FLAC is overkill if you just want to save the audio in a listenable format. The George Blood collection, just the VBR MP3s, is looking to be about 6TB. With ALL data it might be over 40TB! I can’t afford that many hard drives :)


So, my approach at the moment is to save just the VBR MP3s (they seem to be done at up to 320kbps VBR) and the JPEG album cover. If I have a chance and any storage left afterwards, I can make a separate pass to get the album liner PDFs…


Tool used: https://github.com/jjjake/internetarchive


Patch to allow setting start and end item indices for downloads: https://github.com/jjjake/internetarchive/pull/605


Example usage to grab just the VBR MP3 and record label JPG for each (note the --start-idx and --end-idx arguments):

#ia download --start-idx=4001 --end-idx=8000 -a -i --format="VBR MP3" --format="JPEG" --search collection:georgeblood

I’m going to concentrate on the George Blood collection for now… I’m starting at item 1. It would be great if others started at index 50,000, 100,000, 150,000, … and others started at the end and worked backwards in similarly-sized chunks, so that it’s assured someone gets each of them.
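If you want to coordinate, here is a rough Python sketch for carving the collection into fixed-size chunks and printing one command per chunk. It assumes the indices are 1-based and inclusive (which is how the 4001 to 8000 example above reads) and that your ia build includes the --start-idx/--end-idx patch from the pull request linked above. The total item count and chunk size below are made-up placeholders, and chunk_ranges and ia_command are just helper names I picked for illustration, not part of the tool.

```python
import shlex


def chunk_ranges(total_items, chunk_size):
    """Yield (start, end) index pairs, 1-based and inclusive, covering total_items."""
    if total_items < 0 or chunk_size <= 0:
        raise ValueError("total_items must be >= 0 and chunk_size must be > 0")
    for start in range(1, total_items + 1, chunk_size):
        yield start, min(start + chunk_size - 1, total_items)


def ia_command(start, end, collection="georgeblood"):
    """Build the download command for one chunk as a shell-quoted string.

    The --start-idx/--end-idx flags are assumed to come from the patched ia.
    """
    args = [
        "ia", "download",
        f"--start-idx={start}",
        f"--end-idx={end}",
        "-a", "-i",
        "--format=VBR MP3",
        "--format=JPEG",
        "--search", f"collection:{collection}",
    ]
    return shlex.join(args)  # needs Python 3.8+


if __name__ == "__main__":
    # Placeholder numbers: substitute the real item count for the collection.
    TOTAL_ITEMS = 10_000
    CHUNK_SIZE = 4_000
    for start, end in chunk_ranges(TOTAL_ITEMS, CHUNK_SIZE):
        print(ia_command(start, end))
```

Each person can then claim one or more of the printed lines. The last chunk is simply shorter if the total isn't a multiple of the chunk size.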

  • Grimpen
    1 year ago

    Or a renewal step. If it’s not worth renewing, let it into the public domain.

    This is why It’s A Wonderful Life became a Christmas classic. Because it was in the public domain, it was used as late night filler.

    The MPAA and RIAA miss the point. If It’s A Wonderful Life had still been under copyright, it wouldn’t have become a classic.

    It’s like the concept of Abandonware. If video games had a large copyright clearing house like the MPAA or RIAA, Abandonware wouldn’t work, but abandoned media will disappear. Heck, non-abandoned media also disappears because profits don’t reward preservation.

    • GnuLinuxDude@lemmy.ml
      1 year ago

      Ok, but then how will my kids’ record company benefit in perpetuity?

      In all seriousness, I think copyright law is the best example of how captured our government is by large corporate interests.