This is an automated archive made by the Lemmit Bot.
The original was posted on /r/datahoarder by /u/Refinery73 on 2025-01-30 15:16:23+00:00.
Hi everyone,
I’m sitting on a pile of a few hundred thousand PDFs from local government, such as city hall meetings, from half the county.
I’m wondering what to do with them and would like to hear your opinions.
I was able to scrape them easily from the gov website, and the files are public. I see archive value in them for city history and political studies. However, they were created by a bunch of different cities and departments and lack any clear license. The robots.txt didn’t prohibit scraping, but I don’t exactly own the files. On the other hand, it’s public government information. I’m not US-based, so I don’t want to discuss the licensing of public documents, but rather how you would approach this dataset.
I thought about ‘preservation first’ and ‘public interest’, so my plan is to create a torrent archive for each city and start seeding it. I’m not sure, however, whether someone has a better idea.
There is no public archive for this, and cities have been losing these files left and right when changing platforms because they don’t bother to migrate them. For them the relevant file is some signed printout in some drawer. They just don’t care.
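For the per-city archive idea, one possible preparation step before building the torrents is to write a checksum manifest into each city folder, so anyone who downloads the archive can verify the files later. This is only a rough sketch using the Python standard library. It assumes the scraped PDFs are already sorted into one subfolder per city under a common root; that layout and the names `write_city_manifests` and `SHA256SUMS` are my own assumptions, not something that exists yet.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large PDFs are never loaded into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_city_manifests(root: Path) -> dict[str, int]:
    """Write a SHA256SUMS file into every city folder below `root`.

    Assumed (hypothetical) layout: root/<city>/.../*.pdf
    Returns the number of PDFs found per city.
    """
    counts: dict[str, int] = {}
    for city_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        pdfs = sorted(
            p for p in city_dir.rglob("*")
            if p.is_file() and p.suffix.lower() == ".pdf"
        )
        # "<hash>  <relative path>" is the line format that `sha256sum -c` reads.
        lines = [
            f"{sha256_of(p)}  {p.relative_to(city_dir).as_posix()}" for p in pdfs
        ]
        manifest = "\n".join(lines) + ("\n" if lines else "")
        (city_dir / "SHA256SUMS").write_text(manifest, encoding="utf-8")
        counts[city_dir.name] = len(pdfs)
    return counts
```

Calling `write_city_manifests(Path("scrape"))` (with `scrape` as a placeholder folder name) would leave one manifest per city, and each city folder could then be handed to whatever torrent client or tool you prefer.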