I’m syncoiding from my normal RAIDz2 to a backup mirror made of 2 disks. I looked at zpool iostat and I noticed that one of the disks consistently shows less than half the write IOPS of the other:

                                        capacity     operations     bandwidth 
pool                                  alloc   free   read  write   read  write
------------------------------------  -----  -----  -----  -----  -----  -----
storage-volume-backup                 5.03T  11.3T      0    867      0   330M
  mirror-0                            5.03T  11.3T      0    867      0   330M
    wwn-0x5000c500e8736faf                -      -      0    212      0   164M
    wwn-0x5000c500e8737337                -      -      0    654      0   165M
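
A quick sanity check on those numbers, as a rough sketch: dividing each disk's write bandwidth by its write operations gives the average size of a single write. The figures are copied from the output above; treating `M` as MiB is an assumption, and the dictionary and function names are only illustrative.

```python
# Per-disk write figures copied from the zpool iostat output above.
# Treating "M" as MiB is an assumption about the units.
disks = {
    "wwn-0x5000c500e8736faf": {"write_ops": 212, "write_mib": 164},
    "wwn-0x5000c500e8737337": {"write_ops": 654, "write_mib": 165},
}


def avg_write_kib(write_ops, write_mib):
    """Average size of one write operation, in KiB."""
    return write_mib * 1024 / write_ops


sizes = {name: avg_write_kib(**stats) for name, stats in disks.items()}

for name, kib in sizes.items():
    print(f"{name}: about {kib:.0f} KiB per write")
```

Both disks take the same bandwidth, but the first does it in writes of roughly 792 KiB while the second averages roughly 258 KiB. That would be consistent with writes being merged less on the busier disk, though it is only one reading of the numbers and not a diagnosis.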

This is also evident in iostat:

     f/s f_await  aqu-sz  %util Device
    0.00    0.00    3.48  46.2% sda
    0.00    0.00    8.10  99.7% sdb

The difference is also evident in the temperatures of the disks. The busier disk is 4 degrees warmer than the other. The disks are identical on paper and bought at the same time.

Is this behaviour expected?

  • themoonisacheese@sh.itjust.works
    3 months ago

    I’m not fully familiar with all the overheads on a chipset, but it’s not unreasonable to think that this workload, plus whatever the chipset has to do (hardware management tasks mostly), plus the CPU’s other tasks on similar interfaces that might saturate the IO die/controller, would influence this.

    B350 isn’t a very fast chipset to begin with, and I’m willing to bet the CPU in such a motherboard isn’t exactly current-gen either. Are you sure you’re even running at PCIe 3.0 speeds too? There are 2.0 only CPUs available for AM4.
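
One way to check the negotiated PCIe speed on Linux, as a minimal sketch: the kernel exposes `current_link_speed` and `max_link_speed` files for PCIe devices under `/sys/bus/pci/devices`. The function name and the assumption that both files are readable are mine; 5 GT/s corresponds to PCIe 2.0 and 8 GT/s to PCIe 3.0.

```python
from pathlib import Path


def pcie_link_speeds(sysfs_root="/sys/bus/pci/devices"):
    """Map PCI address -> (current, max) link speed strings, where exposed."""
    speeds = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        return speeds
    for dev in sorted(root.iterdir()):
        current = dev / "current_link_speed"
        maximum = dev / "max_link_speed"
        # Devices that are not PCIe do not expose these attributes.
        if current.is_file() and maximum.is_file():
            try:
                speeds[dev.name] = (
                    current.read_text().strip(),
                    maximum.read_text().strip(),
                )
            except OSError:
                # Some devices refuse reads; skip them rather than fail.
                continue
    return speeds


if __name__ == "__main__":
    for address, (current, maximum) in pcie_link_speeds().items():
        print(f"{address}: running at {current}, device maximum {maximum}")
```

A SATA controller reporting a current speed below its maximum would be worth a closer look; matching it to the right PCI address (for example via `lspci`) is left out here.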

    • lightrushOP
      3 months ago

      B350 isn’t a very fast chipset to begin with

      For sure.

      I’m willing to bet the CPU in such a motherboard isn’t exactly current-gen either.

      Reasonable bet, but it’s a Ryzen 9 5950X with 64GB of RAM. I’m pretty proud of how far I’ve managed to stretch this board. 😆 At this point I’m waiting for blown caps, but the case temp is pretty low so it may end up trucking along for a surprisingly long time.

      Are you sure you’re even running at PCIe 3.0 speeds too?

      So given the CPU, it should be PCIe 3.0, but that doesn’t remove any of the queues/scheduling suspicions for the chipset.

      I’m now replicating data out of this pool and the read load looks perfectly balanced. Bandwidth’s fine too. I think I have no choice but to benchmark the disks individually outside of ZFS once I’m done with this operation in order to figure out whether any show problems. If not, they’ll go in the spares bin.
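
A rough sketch of what a per-disk write test could look like, assuming each disk has its own scratch filesystem mounted and the test writes a file there. The function name, sizes, and paths are hypothetical, and a dedicated tool such as fio would give far more trustworthy numbers; this only times a sequential write followed by an fsync.

```python
import os
import time


def sequential_write_mib_s(path, total_mib=64, block_kib=1024):
    """Write total_mib of random data to path and return throughput in MiB/s."""
    block = os.urandom(block_kib * 1024)
    blocks = (total_mib * 1024) // block_kib
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(blocks):
            f.write(block)
        # Force the data out of the page cache so the disk is actually timed.
        f.flush()
        os.fsync(f.fileno())
    elapsed = time.perf_counter() - start
    written_mib = blocks * block_kib / 1024
    return written_mib / elapsed
```

Running it once per disk, for example `sequential_write_mib_s("/mnt/disk-a/bench.bin")` and the same for the other mount point (both paths made up here), and comparing the two results would show whether one drive is slower on its own. The test file should be large enough to outlast the drive cache, and swapping ports between runs would help separate the disk from the controller.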

      • themoonisacheese@sh.itjust.works
        3 months ago

        Oh wow, congrats. I’m currently struggling to stretch an AB350M to accept a 4600G, and failing.

        You’re right, you should hit PCIe 3 speeds and it’s weird, but the fact that the drives swap speeds depending on how they’re plugged in points to either drivers or the chipset.

        • lightrushOP
          3 months ago

          On paper it should support it. I’m assuming it’s the ASRock AB350M. With a certain BIOS version of course. What’s wrong with it?

          • themoonisacheese@sh.itjust.works
            3 months ago

            It’s a Gigabyte AB350M Gaming-3 rev 1.0. It boots GRUB fine but then crashes right after displaying “loading Linux 6.x”: the CPU LED flashes, then the DRAM LED stays on, and I have to turn it off with the PSU switch.

            Either it’s a rev 1.0 bug, which is a thing on those motherboards, or the CPU (or iGPU) is defective.

            https://superuser.com/questions/1854228/proxmox-doesnt-boot-after-cpu-change

            I’m currently waiting on support from both the seller and Gigabyte, but I don’t expect anything to come of it. I have yet to test it in a different motherboard, though.

            • lightrushOP
              3 months ago

              Iiinteresting. I’m on the larger AB350-Gaming 3 and it’s got REV: 1.0 printed on it. No problems with the 5950X so far. 🤐 Either sheer luck or there could have been updated units before they officially changed the rev marking.

              • themoonisacheese@sh.itjust.works
                2 months ago

                Sorry, I never saw this. That sucks.

                Turns out mine was broken too. I put the CPU in my gaming rig and it worked fine, so I bought a new motherboard and the problem is gone.