Follow-up on the hard drive problem from two days ago

I booted from the Western Digital Diagnostics floppy that I keep handy. The quick scan said it had found a problem and asked me if I wanted it fixed. I did. Then I ran the full scan, and it didn’t find any problems. Next I booted Linux into single user mode, but I couldn’t figure out how to fsck the drives from there, so I booted the rescue disk instead. That wouldn’t let me unmount the drives, so I went back into single user mode and ran “shutdown -rF now”, which forces a fsck on the next boot. It didn’t find any problems either. So now I’m still running my backup process every two hours and keeping an eye on things. 36 hours later, there have been no more disk errors.

I think I just adjectivized something that’s never been adjectivized before

After today’s “State of the Project” meeting, I was looking for a sentence to describe it, and I settled on “It looks like things are progressing in a clusterfuckian direction”.

Not a good thing to wake up to

Email in my inbox this morning:


The following warning/error was logged by the smartd daemon:

Device: /dev/hda, 1 Currently unreadable (pending) sectors

For details see host’s SYSLOG (default: /var/log/messages).

You can also use the smartctl utility for further investigation.
No additional email messages about this problem will be sent.

Looked in /var/log/messages, and I find:


Jan 11 04:10:19 allhats kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Jan 11 04:10:19 allhats kernel: hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=105859162, sector=105859101
Jan 11 04:10:19 allhats kernel: ide: failed opcode was: unknown
Jan 11 04:10:19 allhats kernel: end_request: I/O error, dev hda, sector 105859101
Jan 11 04:10:22 allhats kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Jan 11 04:10:22 allhats kernel: hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=105859162, sector=105859109
Jan 11 04:10:22 allhats kernel: ide: failed opcode was: unknown
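
For anyone who wants to pull the affected sectors out of a log like this rather than eyeball them, here is a minimal Python sketch. It assumes the `dma_intr: error=` lines keep exactly the format shown above. The function name `failing_sectors` and the regular expression are made up for illustration and are not part of smartd or any other tool.

```python
import re

# Matches the "dma_intr: error=..." lines from the old IDE driver, as seen in
# the log excerpt above. The format is an assumption based on that excerpt.
ERROR_RE = re.compile(
    r"kernel: (?P<dev>\w+): dma_intr: error=(?P<code>0x[0-9a-fA-F]+) "
    r"\{ (?P<flags>[^}]*)\}, LBAsect=(?P<lba>\d+), sector=(?P<sector>\d+)"
)


def failing_sectors(lines):
    """Return a sorted list of unique (device, sector) pairs from error lines."""
    seen = set()
    for line in lines:
        match = ERROR_RE.search(line)
        if match:
            seen.add((match.group("dev"), int(match.group("sector"))))
    return sorted(seen)


# Sample input copied from the log excerpt above.
SAMPLE = [
    "Jan 11 04:10:19 allhats kernel: hda: dma_intr: error=0x40 "
    "{ UncorrectableError }, LBAsect=105859162, sector=105859101",
    "Jan 11 04:10:19 allhats kernel: end_request: I/O error, dev hda, "
    "sector 105859101",
    "Jan 11 04:10:22 allhats kernel: hda: dma_intr: error=0x40 "
    "{ UncorrectableError }, LBAsect=105859162, sector=105859109",
]

if __name__ == "__main__":
    for dev, sector in failing_sectors(SAMPLE):
        print(f"/dev/{dev}: sector {sector}")
```

To run it against the real log, pass an open file instead of `SAMPLE`, for example `failing_sectors(open("/var/log/messages"))`. Reading that file usually requires root.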

Ok, 4:10 am was about when yum was installing new software, or maybe it was when the nightly backup was running. Either way, I guess something was hitting a part of the disk that hadn’t been touched in a while.

I’ve started running the nightly backup every 2 hours. When I get home tonight I’m going to have to try the WDC diagnostics and fsck to see if they will fix it. If they don’t, I’ve got a 180GB drive on the shelf. That’s the one that didn’t get along well with Fedora Core 3. It passed the WDC diagnostics, so the trouble was probably just an artifact of the way it was partitioned, and hopefully repartitioning and reformatting it will fix whatever ailed it.

This could be a very long night.