• BlackCentipede@lemmy.ml
    4 years ago

    It’s 50/50 for me, based on my anecdotal experience as a small business owner.

    I used to run a production server on RHEL 7 until the server, an AMD EPYC 7452, had issues with the RHEL 7-provided kernel and packages that disrupted its functionality. (We’re talking HARD FREEZES that killed the server and forced you to power it off and reboot it. RHEL support refused to fix it or to disable packagekitd, which was another reason the server crashed, and they wouldn’t fix the Cockpit server either.)

    With the latest hardware like EPYC servers, we had to choose a recent version of Linux. We tried RHEL 8 and that didn’t work either. Frustrated that I had tried to do the right thing by choosing a stable Linux distro, I opted to go “my way” and do things differently while being aware of the risks involved.

    So we opted for Arch Linux (shocking, I know), and it worked out of the box. We minimize the risks with a few key factors:

    1. Reduce the number of dependencies
    2. Keep upgrades on a monthly cycle
    3. Take regular snapshots/backups
    4. Version the compiler/runtime

    There are a few advantages to this. I can reduce the amount of space the software takes to deploy to the server from about 2.8 GB to only 240 MB, because I don’t have to statically link as often: I know the server and my workstation are at relatively similar versions, especially for LLVM dependencies and compilers. With minimal dependencies, updates boil down to systemd, bash, coreutils, the Linux kernel, other base-system packages, and the dependencies my project requires, so the risk of the server failing becomes very small compared with an out-of-the-box RHEL install.

    As for upgrading the server: before we actually upgrade the production servers, we run the upgrade in a KVM guest that emulates the EPYC server processors and run various tests to make sure everything is fully functional afterwards, including PostgreSQL upgrades. On the production server, we simply do a full incremental backup, then update and reboot about once a month. That works out to about 99.98% availability for JUST that server, since it takes about 10 minutes to reboot after running a RAM check and the other boot sequences. The overall service is actually closer to 99.999999% to 100% availability, because we’re not running ONLY that server but also 3 other servers with load balancing that can pick up the slack if one server fails.
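
    The availability figures above can be checked with some quick arithmetic. This is only a rough sketch: it assumes a 30-day month, 4 servers in total, and that outages on different servers are independent, which real outages (shared power, network, a bad update) often are not. The function names are made up for illustration.

```python
MINUTES_PER_MONTH = 30 * 24 * 60  # assumption: a 30-day month


def single_server_availability(downtime_minutes, period_minutes=MINUTES_PER_MONTH):
    """Fraction of the period during which one server is up."""
    return 1.0 - downtime_minutes / period_minutes


def cluster_availability(per_server, servers):
    """Chance that at least one server is up, assuming independent outages."""
    return 1.0 - (1.0 - per_server) ** servers


# One roughly 10-minute reboot per month on a single server.
single = single_server_availability(10)

# That server plus 3 others behind the load balancer.
cluster = cluster_availability(single, 4)

print(f"single server: {single:.4%}")
print(f"four servers:  {cluster:.12%}")
```

    The single-server result comes out at roughly 99.977%, which rounds to the 99.98% quoted. With staggered reboots the planned downtime never overlaps at all, so the combined figure is dominated by unplanned and correlated failures rather than by the monthly reboot.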

    Some dependencies definitely require careful consideration, like PostgreSQL, which needs an upgrade plan and prior testing in a virtualized environment. I only upgrade PostgreSQL once a year or so, and it tends to work fine for quite a while.

    • gorugorugo@lemmy.ml
      4 years ago

      This sounds awesome, thanks for the great write-up. People like to pooh-pooh Arch as a server, but it’s really such a great option so long as you have backups and are careful. The AUR is crazy handy too.