I have quite an extensive collection of media that my server makes available through different means (mostly Jellyfin and NFS). One of my hard drives has some concerning SMART values, so I want to replace it. What are good hard drives to buy today? Are there any important tech specs to look out for? In the past I didn’t give this too much attention and it hasn’t bitten me yet. But if I’m gonna buy a new drive now, I might as well…

I’m looking for something from 4TB upwards. I think I remember that drives with very high capacity are more likely to fail sooner - is that correct? How about different brands - do any have a particularly good or bad reputation?

Thanks for any hints!

  • Avid Amoeba
    2 months ago

    Not that I want to push ZFS or anything, mdraid/LVM/XFS is a fine setup, but for informational purposes - ZFS can absolutely expand onto larger disks. I wasn’t aware of this until recently. If all the disks of an existing pool get replaced with larger disks, the pool can expand onto the newly available space. E.g. a RAIDz1 with 4x 4T disks will have usable space of 12T. Replace all disks with 8T disks (one after another, so that it can be done on the fly) and your pool will have 24T of space. Replace those with 16T and you get 48T, and so on. In addition, you can expand a pool by adding another redundant topology, just like you can with LVM and mdraid. E.g. 4x 4T RAIDz1 (12T) + 3x 8T RAIDz2 (8T) + 2x 16T mirror (16T) for a total of 36T. Finally, expanding an existing RAIDz with additional disks has recently landed too.
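The capacity figures above follow from simple parity arithmetic, which can be sketched in a few lines of Python. This is only a rough estimate under stated assumptions: all disks in a vdev have the same size, and metadata, padding and TB/TiB differences are ignored. The function names `vdev_usable` and `pool_usable` are made up for this illustration and are not part of any ZFS tooling.

```python
# Rough usable-capacity estimate for a ZFS pool.
# Assumption: equal-sized disks per vdev; filesystem overhead is ignored.

PARITY_DISKS = {"raidz1": 1, "raidz2": 2, "raidz3": 3}


def vdev_usable(kind, disks, size_tb):
    """Approximate usable space of one vdev, in the same unit as size_tb."""
    if kind == "mirror":
        # Every disk in a mirror holds a full copy, so only one disk counts.
        return size_tb
    if kind in PARITY_DISKS:
        parity = PARITY_DISKS[kind]
        if disks <= parity:
            raise ValueError(f"{kind} needs more than {parity} disks")
        return (disks - parity) * size_tb
    raise ValueError(f"unknown vdev type: {kind}")


def pool_usable(vdevs):
    """Sum the usable space of all vdevs, given as (kind, disks, size_tb)."""
    return sum(vdev_usable(kind, disks, size) for kind, disks, size in vdevs)


if __name__ == "__main__":
    # Replacing every disk of a 4-disk RAIDz1 with larger ones.
    for size in (4, 8, 16):
        usable = pool_usable([("raidz1", 4, size)])
        print(f"4x {size}T raidz1 -> {usable}T usable")
```

Run as a script, this prints 12T, 24T and 48T for the three disk sizes. The same function also sums mixed pools, counting one parity disk for RAIDz1, two for RAIDz2, and a single disk's worth of space for a mirror.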

    And now for pushing ZFS - I was doing file-based replication on a large dataset for many years. Just going over all the hundreds of thousands of dirs and files took over an hour on my setup. That’s then followed by a diff transfer. Think rsync or Syncthing. That’s how I did it on my old mdraid/LVM/Ext4 setup, and that’s how I continued doing it on my newer ZFS setup. Recently I tried using ZFS send/receive, which operates within the filesystem. It completely eliminated the dataset file walk and stat phase, since the filesystem already knows all of the metadata. The replication was reduced to just the diff transfer time. What used to take over an hour got reduced to seconds or minutes, depending on the size of the changed data. I can now do multiple replications per hour without significant load on the system. Previously it was only feasible overnight because the system would be robbed of IOPS for over an hour.
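For anyone who wants to try the send/receive approach, here is a minimal Python sketch of one incremental replication step. It is not the exact setup described above, just an illustration under assumptions: both pools are on the same machine, an initial full `zfs send` has already been received on the target, and the dataset and snapshot names (`tank/media`, `backup/media`, `rep-1`, `rep-2`) are hypothetical. It uses only `zfs snapshot`, `zfs send -i` and `zfs receive`.

```python
import subprocess


def snapshot_cmd(dataset, snap):
    """Command that creates the snapshot dataset@snap."""
    return ["zfs", "snapshot", f"{dataset}@{snap}"]


def send_cmd(dataset, prev_snap, new_snap):
    """Command that streams only the changes between two snapshots."""
    return ["zfs", "send", "-i", f"{dataset}@{prev_snap}", f"{dataset}@{new_snap}"]


def receive_cmd(target):
    """Command that applies an incoming stream to the target dataset."""
    return ["zfs", "receive", target]


def replicate(dataset, target, prev_snap, new_snap):
    """Snapshot the source, then pipe the incremental stream into the target."""
    subprocess.run(snapshot_cmd(dataset, new_snap), check=True)
    send = subprocess.Popen(send_cmd(dataset, prev_snap, new_snap),
                            stdout=subprocess.PIPE)
    # zfs receive reads the stream directly from the zfs send process.
    subprocess.run(receive_cmd(target), stdin=send.stdout, check=True)
    send.stdout.close()
    if send.wait() != 0:
        raise RuntimeError("zfs send failed")


if __name__ == "__main__":
    # Hypothetical names; requires ZFS and sufficient privileges to run.
    print(" ".join(send_cmd("tank/media", "rep-1", "rep-2")))
```

Calling `replicate("tank/media", "backup/media", "rep-1", "rep-2")` would run the whole step. For a remote target, the receive command would typically be wrapped in `ssh`, which is left out here to keep the sketch short.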

    • Pacmanlives@lemmy.world
      2 months ago

      I wonder if that’s a new feature. IIRC the issue with pool expansion in ZFS was with vdevs. I am a FreeBSD user and do have some jails running. I do like ZFS a lot; it’s way more mature than Btrfs on Linux.

      • Avid Amoeba
        2 months ago

        As far as I can tell it dates back to at least 2010 - https://docs.oracle.com/cd/E19253-01/819-5461/githb/index.html. See the Solaris version. You can try it with small test files in place of disks and see if it works. I haven’t done an expansion yet, but that’s my plan for growing beyond the 48T of my current pool. I use ZFS on Linux btw. Works perfectly fine.
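The test with files in place of disks could look roughly like the sketch below. It only builds and prints the command sequence rather than running it, so nothing touches a real pool. The pool name `testpool`, the directory `/tmp/zfstest` and the file sizes are made up for the example, and `zpool wait -t resilver` assumes a reasonably recent OpenZFS. If the pool size does not grow after the last replacement, `zpool online -e` on each device is the manual way to request expansion.

```python
import shlex


def expansion_plan(pool, directory, count=4, old_size="1G", new_size="2G"):
    """Return the commands for a replace-with-larger-disks test on file vdevs."""
    # File vdevs must be given as absolute paths.
    old = [f"{directory}/old{i}.img" for i in range(count)]
    new = [f"{directory}/new{i}.img" for i in range(count)]

    plan = [
        ["truncate", "-s", old_size, *old],        # sparse files as small "disks"
        ["zpool", "create", pool, "raidz1", *old],
        ["zpool", "set", "autoexpand=on", pool],   # grow once all disks are larger
        ["zpool", "list", pool],                   # size before
        ["truncate", "-s", new_size, *new],        # the larger "disks"
    ]
    for old_file, new_file in zip(old, new):
        # Replace one device at a time and let the resilver finish first.
        plan.append(["zpool", "replace", pool, old_file, new_file])
        plan.append(["zpool", "wait", "-t", "resilver", pool])
    plan.append(["zpool", "list", pool])           # size after
    return plan


if __name__ == "__main__":
    for command in expansion_plan("testpool", "/tmp/zfstest"):
        print(shlex.join(command))
```

The printed commands can be reviewed and then run by hand as root. `zpool destroy testpool` removes the test pool afterwards.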