Why I hate Sprint, reason #4523

I was on a conference call and had to switch from my cell phone to Skype because the call was breaking up too badly. Yes, it’s a pretty sad state of affairs when Skype provides a clearer, less broken-up signal than your cell phone!

Only 1 year and 6 months until this contract is over and we can switch back to AT&T or over to Verizon.

Today’s rather inconvenient discovery

If you use rsync to back up your system, and the system you’re backing up to has different uids for some userids, rsync converts the file ownership by name as it stores the files! I just found this out because, after restoring my xen1 backup, I’ve discovered that all my postgres files belong to uid 114, which is the uid of postgres on my home server, not on xen1.

This is going to make restoring all the xen backups a royal pain in the ass.
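
For future reference, rsync has a switch for exactly this. Something like the following (the paths and hostname here are just placeholders, and this assumes a reasonably recent rsync) would preserve the raw numeric ids instead of remapping them by user/group name:

    # --numeric-ids tells rsync to transfer raw uid/gid numbers instead of
    # matching them up by user/group name on the receiving side
    rsync -a --numeric-ids /etc /var /srv backupserver:/backups/xen1/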

More server setup crapola

I tried disabling the RAID controller, and when I boot, it tells me that I don’t have any drives. So I re-enabled it, and it told me I didn’t have any logical drives. Also, sometimes when I boot, the RAID controller BIOS tells me there are no drives, and sometimes it shows me the drives. I tried yanking the RAID daughter card entirely, but it’s mounted on a couple of plastic standoffs that it doesn’t want to come off of, and I’m reluctant to try anything I can’t undo at this point. So I’m using the RAID controller to create 4 “Arrays” of 1 disk each. We’ll see how that goes.

Oh, that isn’t good.

I was trying to tar a bunch of stuff off a USB backup disk onto the new machine, and it suddenly started throwing all sorts of errors and couldn’t read any drives, not even the root drive to find the shutdown command.

The first thing I’m going to check is moving the drives around, because I accidentally put the two new drives in the third and fourth slots instead of the first and second, so I’m going to fix that. If that doesn’t help, then I’m going to turn off the Adaptec RAID controller and try software RAID. If that doesn’t work, I don’t know what I’m going to do. Probably return the hardware and start again.
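
If I do end up going the software RAID route, it would look roughly like this (the device names are just placeholders for however the drives end up being presented):

    # build a two-disk RAID1 mirror out of the first two drives
    mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
    # keep an eye on the initial sync
    cat /proc/mdstat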

IT’S ALIVE!

The GoogleBox lives!

Yes, after 4 days of downtime, my illustrious yellow 1U server has been revived from the dead. After it died, I asked the colo provider to power cycle it, and they said it didn’t come back up. I asked them to yank the box, and I picked it up and brought it home. Suspecting a power problem, I first tried yanking all the hard drives and memory, but even then I didn’t get any beeps or other activity. So then I tried yanking one of the CPUs. I must have gotten lucky, because removing the #2 CPU got me a couple of POST beeps, and when I put back the memory and the hard drives, it booted just fine.

I’ve had this box since January 2007, and the CPUs are dated 2001, so I guess it’s time to replace it. I ordered a slightly newer box off eBay that has twice as much memory and 4 SATA drive bays. With two 1TB drives, it will have much more disk space, but more importantly the empty drive bays mean that if I need to expand, I can add newer, bigger drives when they become available. I’m considering using software RAID to mirror the two drives, because even 1TB is bigger than what I have now and I’m not hurting for space. And with LVM, I can plonk in two new 2TB drives when the time comes, migrate the volume groups to the new drives, and yank the old ones. All that remains now is to decide whether to build a new OS and get everything working on it, or just restore from a backup and continue the upgrade path.
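
The eventual drive swap would go roughly like this with LVM (the volume group and device names are made up for illustration):

    pvcreate /dev/md1          # the new, bigger mirror
    vgextend vg0 /dev/md1      # add it to the volume group
    pvmove /dev/md0            # migrate all extents off the old mirror
    vgreduce vg0 /dev/md0      # drop the old mirror from the group
    pvremove /dev/md0          # old drives can now be yanked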

While I had the box home for a few days, I took the opportunity to do a long-delayed upgrade from Debian “etch” to Debian “lenny”. I didn’t want to tackle that remotely, because there was a significant chance (and it happened several times) that I’d end up needing to intervene at the Grub stage, and I couldn’t do that remotely (the cheap colo facilities don’t give you remote boot consoles like the expensive ones do). The biggest hassle of this upgrade was that I had to do some messing around to get a console to appear, changing the boot options on the box itself and also the getty lines in the inittab of each guest “domU”. The second biggest hassle was that I had to install “udev” on all the guests so that ssh would work. Also, while the box was home, I took the opportunity to make a backup of the whole thing, including the guests that don’t belong to me; normally I just back up my own. That should make setting up a new box a lot easier.
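
For the record, the getty part of that fix amounts to something like this in each domU’s /etc/inittab (the exact Xen console device name depends on the kernel version; it may be tty1, xvc0, or hvc0), plus making sure the guest kernel gets a matching console= argument:

    # spawn a getty on the Xen console so the domU gets a login prompt
    co:2345:respawn:/sbin/getty 38400 hvc0
    # and in the guest config on the dom0, pass a matching console option
    # to the domU kernel, e.g.: extra = "console=hvc0"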

I got all the fixing and upgrading and backing up done early this morning. I brought my box over to the colo company’s office at 10:15. And waited. And waited. And waited. I had a “ping -a” running so I’d know as soon as it came back. And I waited some more. The business office is on the other side of town from the colo facility, so I figured there would be some delay. The company that used to own those racks would let me go out to the facility with them, but these guys bought the business from that company and they’re anal about security and won’t let me go. Well, it turns out that their scheduled visit to the colo facility was at 10pm – nobody told me that, of course, until after I’d started panicking that they’d all gone home for the weekend without racking my box. But here it is – they racked the box, phoned me to say it was powering up, and now I’m connected again. Hallelujah!