1. 8

  2. 6

    This seems like a bad compromise.

    No computer should need to be powered off to replace a disk in a RAID set. SATA and SAS are electrically hot-swappable, and most controller chipset drivers handle hot-swapping automatically; in my experience, the ones that don’t can be rescanned manually. If your chassis forces you to power off to physically remove a drive, it’s worth fixing that rather than introducing a second system that presumably has the same issue. As for the RAID toolset, “having to re-learn” the RAID management tools is just another way of saying “I didn’t document my tools properly”.
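
    For what a manual rescan can look like on Linux, here is a minimal sketch. It assumes the usual sysfs layout, where writing `- - -` to `/sys/class/scsi_host/hostN/scan` asks that host adapter to rescan all channels, targets, and LUNs. The function name and the `sysfs_root` parameter are my own, added only so the sketch can be tried against a scratch directory; on a real machine it needs root.

    ```python
    from pathlib import Path
    from typing import List


    def rescan_scsi_hosts(sysfs_root: str = "/sys") -> List[str]:
        """Ask every SCSI/SATA/SAS host adapter to rescan its bus.

        Hypothetical helper: the name and sysfs_root parameter are
        illustrative, not part of any existing tool.
        """
        host_dir = Path(sysfs_root) / "class" / "scsi_host"
        rescanned = []
        for host in sorted(host_dir.glob("host*")):
            # "- - -" is the wildcard for channel, target, and LUN.
            (host / "scan").write_text("- - -\n")
            rescanned.append(host.name)
        return rescanned
    ```

    Calling `rescan_scsi_hosts()` after swapping the drive should make the new disk show up, provided the controller driver supports this interface.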

    Running an entire second system as a redundant copy is expensive in terms of space and power. It also introduces its own risks. The obvious one is that there’s no way of detecting silent bit errors, so any such errors will be silently copied to the redundant system. Modern RAID systems (even mdadm) do proactive disk scanning, pick up silent bit errors, and correct them from the RAID set if needed.
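
    As a rough illustration of the proactive scanning with Linux md, the sketch below starts a consistency check and reads back the mismatch counter. It assumes the standard md sysfs files `sync_action` and `mismatch_cnt` under `/sys/block/<array>/md/`. The function names and the `sysfs_root` parameter are hypothetical, and many distributions already schedule an equivalent check from cron or a systemd timer.

    ```python
    from pathlib import Path


    def start_md_check(array: str, sysfs_root: str = "/sys") -> None:
        """Ask the md driver to start a read-only consistency check.

        Hypothetical helper; `array` is a name such as "md0".
        """
        action = Path(sysfs_root) / "block" / array / "md" / "sync_action"
        action.write_text("check\n")


    def md_mismatch_count(array: str, sysfs_root: str = "/sys") -> int:
        """Return the mismatch counter md reports for the latest check."""
        counter = Path(sysfs_root) / "block" / array / "md" / "mismatch_cnt"
        return int(counter.read_text().strip())
    ```

    A nonzero count after the check finishes means md found stripes where the redundant copies or parity disagreed, which is exactly the kind of problem a plain copy to a second system would never notice.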

    This approach still doesn’t cover you against a double-disk failure: what if a disk in your primary fails, and at the same time a disk holding the same data (or at least some of it) in the secondary system fails? The risk is somewhat lower than that of a second disk failing in a single RAID5 set, but it’s still there. Just as RAID is not equivalent to a backup, neither is this approach.