Never trust the label

I’ve just wasted 4+ hours because I trusted the label that said that the CD our build-meister gave me had the latest build on it. I guess I trusted the build-meister too. I should have noticed that many of the RPMs said “3.6-006” instead of “3.6-007” like I was expecting.

Instead, I have to rebuild two systems (a CMS and a CP, as defined in the post the other day) back to RedHat 7.3 and version 3.3 of our software, configure it, burn a new DVD with CentOS 3.4 and version 3.6-007 of our software, and upgrade the two systems. See you in another 4 hours.

Oh, and did I mention that the air conditioning at work has had one of its three chillers off-line for the last three days, and so it’s hot and sweaty here?

What were they smoking?

Sometimes I’m forced to question the sanity of my cow orkers. If you run our setup program and choose the option to set the time and date, you are presented with a string like “062716452005.40”. As near as I can figure, that’s MMDDHHmmYYYY.SS, or translated into English, month, day, hour, minute, year, period, seconds. Besides the utterly moronic order of the elements in the string, the input routine has absolutely no flexibility in what you can enter and no error checking. Get one character wrong or miss a column, and you’re going to get a date and time that are utterly unlike what you expected, and you won’t find out until you exit the setup program and type “date”.
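For what it's worth, a little validation would not be hard. Here is a minimal sketch in Python. It is not what our setup program actually does, and the function name parse_setup_timestamp is made up for illustration. It assumes the sample string reads as month, day, hour, minute, four-digit year, a period, then seconds (the 27 in the second pair can only be a day).

```python
import re
from datetime import datetime

# Exactly twelve digits, a period, then two digits: MMDDHHmmYYYY.SS
_SHAPE = re.compile(r"\d{12}\.\d{2}")


def parse_setup_timestamp(text):
    """Parse an MMDDHHmmYYYY.SS string, raising ValueError on bad input.

    Hypothetical helper: the name and the error message are illustrative only.
    """
    text = text.strip()
    # Check the column layout first, so a missing or extra character is
    # reported instead of silently shifting every field.
    if not _SHAPE.fullmatch(text):
        raise ValueError("expected MMDDHHmmYYYY.SS, got %r" % text)
    # strptime rejects out-of-range fields such as month 13 or day 32.
    return datetime.strptime(text, "%m%d%H%M%Y.%S")
```

Calling parse_setup_timestamp("062716452005.40") gives back a datetime for June 27, 2005, 16:45:40, while a string that is one column short raises ValueError on the spot, rather than leaving you to discover it with “date” afterwards.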

Hmmm. How do I do this?

Ok, picture a network with one “controlling computer”, which I’ll call “the CMS”, and a bunch of satellite computers which I’ll call “the CPs”. These satellite computers live in projection booths in a theatre and have digital projectors hooked up to them, but that’s not important. The problem I’m dealing with is upgrading the machines from version 3.3 of our software to version 3.5. The software upgrade also necessitates an upgrade from RedHat 7.3 to CentOS 3.4.

I’ve got the upgrading of the CMS sorted (I have a non-bootable DVD with an apt repository with CentOS 3.4 and our software, and a kickstart file that does the upgrade without touching the partition with our data on it).

The CPs have hostnames of cp1 to cpN, and IPs of 192.168.30.101 and up. cp0 (192.168.30.100) is reserved.

What I’m working on now is upgrading the CPs. What I’ve been doing is making the CMS a PXE boot server, then wiping the boot partition on the CPs one at a time, re-installing each one as cp0, and then, when it comes back up, ssh-ing in and restoring the backed-up configuration, including the hostname and IP.

The problem with that is that it takes 20 minutes per CP, and the powers that be are complaining that it takes too long. They’d like something more parallel.

So I’ve been thinking of retrieving the MAC addresses of each CP before I upgrade. Then I do them all in parallel, and use the MAC address afterwards to figure out which one is which. I understand that I can use “arp -a” to retrieve the MAC addresses. I’m wondering if there is something I can do to DHCP to give out the correct 192.168.30.1xx address to the right machine, or whether I should have DHCP hand out addresses in some other range, and then use “arp -a” again to find which machine has which address and fix them one at a time?
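A rough sketch of the first option, assuming the DHCP server on the CMS is ISC dhcpd (nothing above says which server it is). ISC dhcpd lets a host declaration tie a fixed-address to a hardware ethernet address. The function names, the sample MAC addresses, and the exact layout of the “arp -a” output are all made up for illustration; real output varies between systems.

```python
import re

# Assumed "arp -a" line layout:
#   cp1 (192.168.30.101) at 00:0c:29:aa:bb:01 [ether] on eth0
ARP_ENTRY = re.compile(
    r"\((?P<ip>\d+\.\d+\.\d+\.\d+)\) at "
    r"(?P<mac>(?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2})"
)


def parse_arp(output):
    """Return {ip: mac} for each complete entry; '<incomplete>' lines are skipped."""
    table = {}
    for line in output.splitlines():
        match = ARP_ENTRY.search(line)
        if match:
            table[match.group("ip")] = match.group("mac").lower()
    return table


def dhcpd_hosts(arp_table, prefix="192.168.30.", base=100):
    """Build one dhcpd host stanza per CP so each MAC gets its old address back."""
    stanzas = []
    for ip, mac in arp_table.items():
        if not ip.startswith(prefix):
            continue
        number = int(ip[len(prefix):]) - base  # .101 -> cp1, .102 -> cp2, ...
        if number < 1:
            continue  # cp0 (.100) is reserved; lower addresses are not CPs
        text = "host cp%d {\n  hardware ethernet %s;\n  fixed-address %s;\n}" % (
            number, mac, ip)
        stanzas.append((number, text))
    # Sort by CP number so the generated config is easy to read and diff.
    return "\n".join(text for _, text in sorted(stanzas))


if __name__ == "__main__":
    # Made-up sample data, not captured from a real network.
    sample = (
        "cp2 (192.168.30.102) at 00:0C:29:AA:BB:02 [ether] on eth0\n"
        "cp1 (192.168.30.101) at 00:0c:29:aa:bb:01 [ether] on eth0\n"
    )
    print(dhcpd_hosts(parse_arp(sample)))
```

The generated stanzas would be pasted into dhcpd.conf next to the subnet declaration before the CPs are wiped. One caveat: the ARP cache only holds machines the CMS has talked to recently, so it is probably worth pinging every CP just before running “arp -a”.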

Damn you, Linode

For the second weekend in a row, my linode node has died. This time, the linode.com web site is down as well. From what I can glean from the linode IRC channel (which isn’t on linode), about half of their servers are dead to the world.

For me, that means no outgoing email, no mailing lists, and of course my hosted web sites including navaid.com are all down. This sucks.

Last week’s outage happened because some clueless tech at ThePlanet, the colo where their servers live, moved some power connections around (after being explicitly told not to touch anything) and overloaded a power supply. That took several hours to resolve.

Update
It’s up again, after only 6 hours. Geez, this sucks.

Six Approaches in Six Months

It’s that time again, time to reset the clock on my instrument currency. Just about every time I go on a flying trip, I manage to get some actual IFR en-route, usually only for a few minutes here and there, but in the last 6 months I’ve only done one real approach, and of course no holds. In order to stay instrument current you have to have done a hold and 6 approaches in the last 6 months, plus “intercepting and tracking course through the use of navigation systems” (which is pretty hard to avoid if you’ve done the other bits). Since I plan to fly up to Ottawa for Canada Day weekend, I need to be current in case I do get some weather.

I wasn’t interested in doing any non-precision approaches. Ottawa and Rochester both have ILSes and frankly the whole “currency” thing is more of an exercise in being legal than in safety. (Before our trip out to Mt. Holyoke this fall I’ll probably practice a few non-precision approaches because I don’t remember what approaches they have at Barnes Muni.)

So I filed ROC-GEE-ROC, and flew out to the Geneseo VOR and did a hold there. No problem, they assigned a hold on the airway I was already on, so the entry was dead simple. One turn around, and I was ready to come back in. I asked for the ILS 28 approach.

The controller descended me to 2,500 feet and vectored me for the approach course. My first approach wasn’t bad, but wasn’t great. Both horizontally and vertically I kept within 2-3 dots, and when I “broke out” at decision height I was a bit south of the runway.

The next two approaches went much better. Vertically I kept within the donut, and horizontally I went out a dot, or a dot and a half maximum, although I still ended up a little bit south of the runway each time.

The next approach, I got a new controller. He vectored me further out, made me descend to 2,100 on the final turn, and then asked me to keep my speed up. I did, and I actually did a pretty good approach. Kept it in the donut both horizontally and vertically almost the whole approach.

I’m not sure what went wrong in the last two approaches. Maybe it was the new controller (who kept giving me the descent at the last turn, but started turning me in nearer and giving me an abrupt turn-on), maybe it was the fact that I adjusted the DG for precession, maybe I was overconfident and trying to fly them fast again, or maybe I was just bored and tired. But on both approaches I was hitting 3 or 4 dots of deflection horizontally, both to the left and the right. And on the final one, I couldn’t get it slowed down for the landing. After I touched down, I couldn’t even seem to put much pressure on the brakes, and very nearly decided to go around. I ended up rolling into the overrun area on runway 28, which is 5500 feet long.