Never trust the label

I’ve just wasted 4+ hours because I trusted the label on the CD our build-meister gave me, which said it had the latest build on it. I guess I trusted the build-meister too. I should have noticed that many of the RPMs said “3.6-006” instead of the “3.6-007” I was expecting.

So now I have to rebuild two systems (a CMS and a CP, as defined in the post the other day) back to RedHat 7.3 and version 3.3 of our software, configure them, burn a new DVD with CentOS 3.4 and version 3.6-007 of our software, and upgrade the two systems. See you in another 4 hours.

Oh, and did I mention that the air conditioning at work has had one of its three chillers off-line for the last three days, and so it’s hot and sweaty here?

What were they smoking?

Sometimes I’m forced to question the sanity of my cow orkers. If you run our setup program and choose the option to set the time and date, you are presented with a string like “062716452005.40”. As near as I can figure, that’s MMDDhhmmYYYY.SS, or translated into English, month, day, hour, minute, year, period, seconds. Besides the utterly moronic order of the elements in the string, the input routine has absolutely no flexibility in what you can enter and no error checking. Get one character wrong or miss a column, and you’re going to get a date and time that are utterly unlike what you expected, and you won’t find out until you exit the setup program and type “date”.
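For what it’s worth, the error checking that’s missing doesn’t take much. Here is a minimal sketch in Python of what a strict check could look like, assuming the string really is month, day, hour, minute, four-digit year, a period, and seconds. The function name `parse_setup_timestamp` is made up for illustration; it is not anything from our setup program.

```python
import re
from datetime import datetime

# Exactly 12 digits, a period, then 2 digits: MMDDhhmmYYYY.SS
# (the field order is an assumption based on the sample string).
SHAPE = re.compile(r"[0-9]{12}\.[0-9]{2}")


def parse_setup_timestamp(text: str) -> datetime:
    """Return the datetime encoded in text, or raise ValueError."""
    if not SHAPE.fullmatch(text):
        raise ValueError(f"expected MMDDhhmmYYYY.SS, got {text!r}")
    # strptime rejects out-of-range fields such as month 27 or hour 25.
    return datetime.strptime(text, "%m%d%H%M%Y.%S")


if __name__ == "__main__":
    for candidate in ("062716452005.40", "270616452005.40", "06271645205.40"):
        try:
            print(candidate, "->", parse_setup_timestamp(candidate).isoformat())
        except ValueError as err:
            print(candidate, "-> rejected:", err)
```

The point is just that a swapped day and month or a dropped column gets rejected on the spot, instead of showing up later when you type “date”.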

Hmmm. How do I do this?

Ok, picture a network with one “controlling computer”, which I’ll call “the CMS”, and a bunch of satellite computers which I’ll call “the CPs”. These satellite computers live in projection booths in a theatre and have digital projectors hooked up to them, but that’s not important. The problem I’m dealing with is upgrading the machines from version 3.3 of our software to version 3.5. The software upgrade also necessitates an upgrade from RedHat 7.3 to CentOS 3.4.

I’ve got the upgrading of the CMS sorted (I have a non-bootable DVD with an apt repository with CentOS 3.4 and our software, and a kickstart file that does the upgrade without touching the partition with our data on it).

The CPs have hostnames of cp1 to cpN, and IPs of 192.168.30.101 and up. cp0 (192.168.30.100) is reserved.

What I’m working on now is upgrading the CPs. What I’ve been doing is making the CMS a PXE boot server, then taking the CPs one at a time: wiping the boot partition, re-installing the machine as cp0, and then, when it comes back up, ssh-ing in and restoring the backed up configuration, including the hostname and IP.

The problem with that is that it takes 20 minutes per CP, and the powers that be are complaining that it takes too long. They’d like something more parallel.

So I’ve been thinking of retrieving the MAC address of each CP before I upgrade. Then I do them all in parallel, and use the MAC addresses afterwards to figure out which one is which. I understand that I can use “arp -a” to retrieve the MAC addresses. What I’m wondering is whether there is something I can do to DHCP to give out the correct 192.168.30.1xx address to the right machine, or whether I should have DHCP hand out addresses in some other range, and then use “arp -a” again to find which machine has which address and fix them one at a time.
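To make the first option concrete, here is a rough sketch in Python. It assumes the DHCP server on the CMS is ISC dhcpd, which can pin an address to a MAC with a `host` block containing `hardware ethernet` and `fixed-address`. It also assumes “arp -a” prints lines shaped like the sample below, and that hostnames follow the cpN = 192.168.30.(100+N) scheme described above. The MAC addresses in the sample are invented, and the function names are just for illustration.

```python
import re

# Matches the "(ip) at mac" part of a complete `arp -a` line.
# The output shape is an assumption; "<incomplete>" entries do not match.
ARP_LINE = re.compile(
    r"\((\d{1,3}(?:\.\d{1,3}){3})\) at ([0-9A-Fa-f]{2}(?::[0-9A-Fa-f]{2}){5})"
)


def parse_arp(output):
    """Return {ip: mac} for every complete entry in `arp -a` output."""
    table = {}
    for line in output.splitlines():
        match = ARP_LINE.search(line)
        if match:
            table[match.group(1)] = match.group(2).lower()
    return table


def dhcp_host_entries(table, prefix="192.168.30.", base=100):
    """Build dhcpd.conf host blocks for cp1..cpN (cp0 and others are skipped)."""
    cps = []
    for ip, mac in table.items():
        if not ip.startswith(prefix):
            continue
        last_octet = int(ip[len(prefix):])
        if last_octet > base:
            cps.append((last_octet - base, ip, mac))
    blocks = []
    for number, ip, mac in sorted(cps):
        blocks.append(
            f"host cp{number} {{\n"
            f"    hardware ethernet {mac};\n"
            f"    fixed-address {ip};\n"
            f"}}\n"
        )
    return "\n".join(blocks)


# Invented sample data standing in for real `arp -a` output on the CMS.
SAMPLE = """\
cp1 (192.168.30.101) at 00:11:22:33:44:01 [ether] on eth0
cp2 (192.168.30.102) at 00:11:22:33:44:02 [ether] on eth0
? (192.168.30.103) at <incomplete> on eth0
"""

if __name__ == "__main__":
    print(dhcp_host_entries(parse_arp(SAMPLE)))
```

The idea would be to capture the table while the CPs are still running the old software, paste the generated blocks into dhcpd.conf on the CMS, and then let every CP PXE boot at once. Any CP that isn’t in the ARP cache at capture time won’t get an entry, so pinging each one first is probably wise.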

Damn you, Linode

For the second weekend in a row, my linode node has died. This time, the linode.com web site is down as well. From what I can glean from the linode IRC channel (which isn’t on linode), about half of their servers are dead to the world.

For me, that means no outgoing email, no mailing lists, and of course my hosted web sites including navaid.com are all down. This sucks.

Last week’s outage happened because some clueless tech at ThePlanet, which is the colo where their servers live, moved some power connections around (after being explicitly told not to touch anything) and overloaded a power supply. That took several hours to resolve.

Update
It’s up again, after only 6 hours. Geez, this sucks.

Getting spammed in earnest now

After I moved my blog from Movable Type to WordPress, it seemed that the comment spammers couldn’t find my new blog. For a while there, it even seemed that referrer spam had dropped down to nearly nothing. But they’re back, with a vengeance. They’re still trying to spam my old blog, which doesn’t exist anymore, and Maddy’s blog, which doesn’t accept comments anymore, but now they’re spamming my new blog too. Or attempting to, anyway. SpamKarma is catching them all, but right now it’s catching 20-30 comment spams a day.

It’s a frustrating waste of my time and resources. I don’t pay for disk space and network bandwidth so that these vandals can use it up.