After rebuilding the RAID on my colo box, the drive started reporting “50 offline uncorrectable sectors detected”. I figured I’d keep an eye on it and see if things get worse. A few weeks ago, I also started seeing “1 unreadable sectors detected”. Then a few days ago, the “50 offline uncorrectable sectors detected” went away, leaving just the “1 unreadable sectors detected”. Yesterday, I got a couple of “media error” reports and now it’s saying “2 unreadable sectors detected”.
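For reference, the raw counters behind those smartd messages can be pulled straight off the drive; something along these lines (assuming the suspect disk is /dev/sda – substitute the real device) shows them:

    # Dump the SMART attribute table; Current_Pending_Sector (197) and
    # Offline_Uncorrectable (198) are the "unreadable" and "offline
    # uncorrectable" counts that smartd is reporting on.
    smartctl -A /dev/sda

    # Force a full surface scan with the drive's extended self-test,
    # then read back the result once it finishes.
    smartctl -t long /dev/sda
    smartctl -l selftest /dev/sda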
That disk is a 1TB drive with 3 partitions – one for swap, one for /var/lib/xen (which is really only used on reboots), and one for the software RAID. I have a 2TB drive ready to go. What I’ve done is make 3 partitions exactly the same as the existing 3 and leave the rest of the disk open for expansion. I think what I should do is make the 4th partition of type “fd” (Linux raid autodetect), so that if the first disk ever gets replaced with a 2TB disk as well, I can add a second RAID array alongside the first one.
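Roughly what that looks like as a sketch – the device names, partition numbers, array names, and RAID level below are just examples (old disk as /dev/sda, new disk as /dev/sdb, existing array as /dev/md0 with the RAID on partition 3, two-disk mirror), not necessarily what this box actually uses:

    # Clone the existing 3-partition layout onto the new disk, leaving
    # the rest of the 2TB disk unallocated for now.
    sfdisk -d /dev/sda | sfdisk /dev/sdb

    # Add the new disk's RAID partition into the existing array and
    # let it resync.
    mdadm --manage /dev/md0 --add /dev/sdb3

    # Later: create the 4th partition in the leftover space (type "fd",
    # Linux raid autodetect, via fdisk), and once the other disk is also
    # 2TB, pair the two spare partitions up as a second array.
    mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sda4 /dev/sdb4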
I’m going to have to find some time to swap the drives. I wish I’d made note of which types of screwdrivers I need to open things up – unlike the old box, on this box the drives aren’t on convenient sleds.
I hope you’ve done some burn-in tests on the 2TB disk; I’ve read far too many “dies after 1 week” reviews on NewEgg for virtually every brand, so I’m cautious about migrating my RAID5 from 1TB to 2TB drives at the moment.
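Something like a destructive badblocks pass plus the drive’s own extended self-test is the kind of burn-in I mean (destructive, so only on a disk that holds no data yet; /dev/sdb is just a placeholder):

    # Write-mode test: writes and verifies four patterns across the
    # whole disk; expect it to take the better part of a day on 2TB.
    badblocks -wsv /dev/sdb

    # Then let the drive check itself and review the log.
    smartctl -t long /dev/sdb
    smartctl -l selftest /dev/sdb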
Last time I checked, there are two possible problems when using software RAID: 1) a drive hits an error and takes too long trying to recover from it, so the software RAID decides the drive is bad and drops it from the array; 2) excessive load/unload cycles (the heads parking and unparking constantly).
There is no silver bullet for either problem, short of buying “RAID edition” 24/7 drives (I think Western Digital gives them an “RE” suffix) or buying hardware RAID (I suppose Adaptec has a reason for maintaining its list of compatible drives).
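That said, you can at least check where a given drive stands on both counts before springing for RE drives – /dev/sdX below is just a placeholder, and not every consumer drive supports these knobs:

    # (1) Error-recovery timeouts: if the drive supports SCT ERC (what
    # the RAID-edition drives ship pre-configured), cap recovery at
    # 7 seconds so the drive reports the bad sector quickly and md can
    # rewrite it, instead of the kernel timing out and md kicking the
    # whole drive.
    smartctl -l scterc /dev/sdX
    smartctl -l scterc,70,70 /dev/sdX

    # Or raise the kernel's command timeout for drives that can't
    # limit their own recovery time.
    echo 180 > /sys/block/sdX/device/timeout

    # (2) Head parking: watch Load_Cycle_Count (attribute 193) and, on
    # drives that honour it, back off the aggressive power management.
    smartctl -A /dev/sdX | grep -i load_cycle
    hdparm -B 254 /dev/sdX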