More Linux techie bullcrap

My Linux box has an M.2 NVMe drive for the standard stuff, but bigger files go on two hard drives in a RAID1 (mirrored). A few months ago, one of the drives fell off the RAID, which usually means it failed. Luckily, when one drive fails in a RAID1, the content continues to be available.

At that time, I just bought a new drive, swapped it in for the old one, and added it to the RAID, which then resynchronized automatically. No muss, no fuss, no bother.

But a few days ago, I noticed that the SMART monitor on the other hard drive was showing some errors. I guess it makes sense: if one drive fails after 6 years, it’s not too surprising that the other one does as well. It hadn’t failed off the RAID yet, but I figured I’d be proactive. I ordered a new hard drive, figuring it would be as simple as last time.

And of course, this time it wasn’t that simple. After swapping the drive, the box wouldn’t boot, because the BIOS didn’t recognize the M.2 NVMe drive as a bootable drive. It showed up in the NVMe configuration menu of the BIOS, but I couldn’t add it to the boot menu. I tried the “update the BIOS over the network” option, which uses a PXE boot, but it just hung. Worse, that option had put the PXE boot entries in the boot priority menu, and the BIOS wouldn’t let me add anything else until I disabled all boot options. So I fiddled with a few BIOS settings (I think I turned off support for some AMI special NVMe mode and turned on some boot thing that I had no idea about), rebooted, and this time it booted. But now the command I used last time, mdadm /dev/md0 --add /dev/sdb, failed because the RAID was inactive. So after a bit of googling and a bit of experimenting, I found the combination that worked:

mdadm -A --run /dev/md0 /dev/sda
mdadm /dev/md0 --add /dev/sdb

The first command re-assembles the RAID with just the first drive in it, and the --run option makes it active even though it doesn’t have both drives. The second command then adds the second drive. I couldn’t put both drives in the first command because when I tried, mdadm complained that the new drive wasn’t formatted or whatever for RAID (presumably because a blank drive has no RAID superblock on it yet).
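Since the plain --add only failed because the array was inactive, it can help to look at the array's state before deciding which commands to run. Here is a minimal sketch, assuming the usual /proc/mdstat layout where each array gets a line like "md0 : active raid1 ..." or "md0 : inactive ...". The function name md_state and its optional file argument are made up for illustration and are not part of mdadm.

```shell
#!/bin/sh
# md_state ARRAY [FILE]
# Print the state word ("active" or "inactive") that the kernel reports
# for ARRAY. FILE defaults to /proc/mdstat; the optional argument is a
# hypothetical convenience so the function can be tried on a saved copy.
md_state() {
    awk -v dev="$1" '$1 == dev && $2 == ":" { print $3 }' "${2:-/proc/mdstat}"
}

# Example use (not run here):
#   md_state md0
# If that prints "inactive", the array needs the assemble-with---run step
# before a new drive can be added to it.
```

If the function prints nothing at all, the array name does not appear in the file, which probably means the kernel has not seen the array yet.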

So now the RAID is happily resynchronizing and I’ve got no SMART errors showing up in my munin console. All is right with the world. Now if only I could convert the RAID box I have on my Mac Studio from a RAID0 (interleaved for maximum speed) to a RAID5 (safer for long term storage) without paying $150 for a new license for SoftRAID.
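For anyone who wants to keep an eye on the resynchronization, the kernel reports progress in /proc/mdstat. The following is a rough sketch, assuming the progress line looks like "recovery =  3.2% (...)" or "resync = 12.5% (...)" and sits under the array's own line. The function name md_progress and the optional file argument are hypothetical, just for illustration.

```shell
#!/bin/sh
# md_progress ARRAY [FILE]
# Print the rebuild percentage for ARRAY, or nothing if no rebuild is
# running. FILE defaults to /proc/mdstat; the optional argument is a
# hypothetical convenience for trying the function on a saved copy.
md_progress() {
    awk -v dev="$1" '
        # A line like "md0 : active raid1 ..." starts a new array section.
        $2 == ":" { in_dev = ($1 == dev) }
        # Inside the section we want, pull out e.g. "recovery =  3.2%".
        in_dev && match($0, /(resync|recovery) *= *[0-9.]+%/) {
            s = substr($0, RSTART, RLENGTH)
            sub(/.*= */, "", s)   # keep only the percentage
            print s
        }
    ' "${2:-/proc/mdstat}"
}

# Example use (not run here):
#   md_progress md0
```

Simply running cat /proc/mdstat shows the same information along with the estimated finish time; the function is only handy when a script needs the bare number.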