Yesterday I went out to the colo to put another hard drive in my 1U box. I’ve shut down my box about three times now, and one of those times the third domU ended up with a corrupt disk and had to be wiped and reinstalled. That’s why I tried so hard to make sure that all the disks got mounted as ext3 (with journalling) instead of ext2 (no journalling). This time, to be safe, I ran “xm shutdown [domU name]” on all three domUs before I shut down the box, so that they would shut down cleanly.
It took a bit of struggling to get the second drive working – I had to jumper the drives as master and slave instead of cable select, and the 80-conductor cable I brought along didn’t quite stretch from one drive to the other, so I had to stick with the existing 40-conductor cable. But other than that, it seemed like everything went fine.
Until I got an email from the owner of the third domU. He couldn’t log in. So I tried “xm console” and saw:
xm console xen3
attempt to access beyond end of device
hda1: rw=0, want=1357711368, limit=104857600
attempt to access beyond end of device
hda1: rw=0, want=18058643056, limit=104857600
attempt to access beyond end of device
hda1: rw=0, want=2123850752, limit=104857600
attempt to access beyond end of device
and then it would prompt for a userid but never prompt for a password.
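To get a feel for how far off those requests are, here is a rough Python sketch that parses the messages above and compares want against limit. It was not part of the original troubleshooting, and the names CONSOLE_LOG and parse_overruns are made up for illustration. Both numbers are in the same unit, so the ratio holds whatever that unit turns out to be.

```python
import re

# Console output copied from the messages above.
CONSOLE_LOG = """\
attempt to access beyond end of device
hda1: rw=0, want=1357711368, limit=104857600
attempt to access beyond end of device
hda1: rw=0, want=18058643056, limit=104857600
attempt to access beyond end of device
hda1: rw=0, want=2123850752, limit=104857600
"""

# Matches lines such as "hda1: rw=0, want=1357711368, limit=104857600".
LINE_RE = re.compile(
    r"^(?P<dev>\w+): rw=(?P<rw>\d+), want=(?P<want>\d+), limit=(?P<limit>\d+)$"
)


def parse_overruns(log):
    """Return (device, want, limit, want/limit) for each request past the device end."""
    overruns = []
    for line in log.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip the "attempt to access..." lines
        want = int(match.group("want"))
        limit = int(match.group("limit"))
        if want > limit:
            overruns.append((match.group("dev"), want, limit, want / limit))
    return overruns


if __name__ == "__main__":
    for dev, want, limit, ratio in parse_overruns(CONSOLE_LOG):
        print(f"{dev}: want={want} limit={limit} ({ratio:.1f}x the device size)")
```

Every one of the three requests lands more than ten times past the end of hda1. To me that looks more like garbage block numbers from damaged filesystem metadata than a partition that is just slightly too small, but that is only a guess from three log lines.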
I shut down his domU and ran fsck on his logical volume, and it reported dozens if not hundreds of errors. It boots now, but I’m scared that it’s going to do this again.
If you’d picked up the long jump module before leaving Lambda Core, you’d be finding Xen much less trouble.