The bane of my existence

One of the worst tasks I’ve had at this job is working on the automatic upgrader. I hate doing it, because it’s not so much “programming” as it’s “cobbling together a bunch of system administration stuff”. I got it working as well as I can, but there are enough flaky problems in the way RedHat/CentOS works, as well as some dodgy Dell hardware, that I can’t make it work 100% of the time. I’ve written about it before. I get called in whenever something fails, to try to work out forensically what went wrong. Today’s fuckup was very similar to the one in that linked article – somebody started the upgrade before they went home at night, and somebody else came in in the morning and started it again. That left some things half installed and half upgraded, and some of the “cp” machines decided that they were being “plex built” (built from scratch in the manufacturing area) rather than upgraded, so they all made themselves into FRUs (field replaceable units) and shut down. Of course it took me nearly an hour to figure out what the idiots had done and how to fix it. And the upshot is that because these machines are now “bare” and physically powered down, somebody has to go out to the site and set them up. Oh, did I mention that the fuckup also caused all copies of the saved configuration for the entire site to be lost?
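For what it’s worth, the usual guard against “somebody started it again” is a lock taken at the very start of the run. What follows is only a minimal sketch and not how the real upgrader works: the lock path, the function name and the exception are all made up for illustration. It also only protects a single host, and only while the first run is still alive, so a run that died half way would need a separate “upgrade in progress” marker on disk.

```python
import fcntl
import os

# Hypothetical lock location; the real upgrader's paths are not known here.
LOCK_PATH = "/tmp/upgrader.lock"


class UpgradeAlreadyRunning(Exception):
    """Raised when another upgrade run already holds the lock."""


def acquire_upgrade_lock(path=LOCK_PATH):
    """Take an exclusive, non-blocking lock on `path`.

    Returns the open file object; the lock is held until that file is
    closed or the process exits. Raises UpgradeAlreadyRunning if some
    other run already holds it.
    """
    handle = open(path, "a+")
    try:
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        raise UpgradeAlreadyRunning(path)
    # Record who holds the lock, purely as a debugging aid.
    handle.seek(0)
    handle.truncate()
    handle.write(str(os.getpid()))
    handle.flush()
    return handle
```

The upgrade script would call `acquire_upgrade_lock()` before touching anything and bail out with a clear message if it raises, instead of ploughing ahead over a half-finished run.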

Current again

This morning Jim and I met at the airport to do some flying. Because I’d done a bit already yesterday, I let him go first. It’s always interesting flying with another pilot, because everybody does things differently. First difference – because this was a practice flight, he decided not to “cheat” with his GPS – and he actually hand-flew the whole time. Second difference – he decided not to pre-heat the engine, even though it was below freezing. Third difference, and this was a doozy – he overcranked the engine like hell. I’ve always been taught not to crank more than 4 or 5 blades at a time, but he cranked a good 20 or 25 blades. That just about killed the battery, and when he couldn’t get it started after a couple more short cranks (because that’s all it would do) he decided to pre-heat. We dragged out the pre-heat cart and heated it up, but then he put the cart away before trying again. The battery was still shot, so I dragged out the pre-heat cart again and used it to jump start the plane. It started in 2 blades that time, and so I got my first taste of getting in the plane while the prop was turning. I also clonked the back of my head really badly when I stood up while coiling the extension cord for the pre-heater cart and hit the hangar door. I have a big scab there now.

When he did the take-off, he used two notches of flaps like it was a short field take-off, and was airborne right off the hump that’s about 1/3rd of the way down Runway 7.

He went out to the Geneseo VOR and did the published hold for the Canandaigua VOR-A approach. Or at least he tried – I don’t think he intercepted the inbound radial more than half a mile from the VOR once in three tries. The reason I “cheat” with a GPS was abundantly clear – each time round, even though he was south of the inbound radial, on the outbound radial he was still correcting to the south. Then when it came time to do the actual approach, he dialed the heading in wrong by 5 degrees on the VOR (even though it had been set right while he was in the hold). And yet, in spite of that, he managed to end up closer to the airport than I usually do when I do that approach. So maybe he knows something I don’t.

Then he came in to Rochester to do the ILS 4 a couple of times. Another difference between him and me: he slowed down to 90 knots for the approach – I like to do them at 110 to 120 knots, since an ILS is generally to a nice long runway and you never know when some kerosene burner will be breathing down your neck.

He did two, and both times he did a fine job of holding the localizer, and a not quite as good job on the glide slope. But it was bumpy and it’s easier to criticize than to do.

I was a bit surprised when he requested a circle to land on runway 7 and a full stop for his second ILS. I thought he was going to do a full 6. But he’d had enough and it was my turn. I decided to skip the hold and the non-precision approach, and just do 4 ILSes to get current. And in spite of the bumps and everything, I think I did pretty good on them. They kept turning me onto the localizer about 2 miles from the outer marker, and sometimes I wasn’t even properly established by the time I got there. One time they didn’t switch me over to the tower, leaving me on the approach frequency right the way down to decision height. Another time, I heard the approach controller about to give a regional jet behind us a speed restriction and then change his mind, and then the tower controller cleared us for “the option”. If we’d taken the option and done a stop-and-go, I wonder what would have happened to that regional jet?

By the time I’d finished my 4 ILS 04s, I was well and truly finished. The bumps weren’t as bad as yesterday’s, but there is only so much bumping around at low altitude wearing foggles you can take. I’m glad that’s over, and hopefully I can get some real approaches and stay current that way.

Well, I’m not going to get current like that…

I need to get IFR current again. I let 6 months go by without doing 6 approaches (actually, only did one). I wanted to fly to KAGC to pick up Laura on Friday, but I couldn’t because of a very thin looking broken layer at about 1500 feet – if I’d been IFR current I could have punched through that and been in VFR on top the whole way. This weekend was pretty clear, so I wanted to go up with a safety pilot and get current again. Saturday, I had to work. So it was Sunday or nothing. I had a brunch to go to earlier, so the plan was to get to the airport at around 1pm, and do some approaches with a safety pilot. My original plan was to do it with Jim, who wanted me to be his safety pilot as well, but he had to cancel. So I called another guy, Lance, who wanted to see what it was like to be a safety pilot. He was available.

When I got to the airport, I found the next problem: the plane I had booked, the Lance (yes, really), had a nose gear strut that was almost completely flat. And even worse, the very slow leak of hydraulic fluid in the prop had turned into a veritable shower. There were red spatters all over the cowl, and a red streak covering most of the spinner. (Actually, I just this second got an email from a more experienced person who told me that the red oil is probably from the gear strut as well.) There was no way I wanted to be doing approaches in that. So Lance and I waited until one of the other pilots came back from their flight, which fortunately didn’t take too long.

Once we got into the air, I found the next problem: the breezy conditions made it quite turbulent, especially down low. I should have realized that this would be the case, but I’d put it out of my head. I went out to Geneseo VOR and did one turn around the hold – it was quite bumpy and hard to hold altitude and heading. There were two other planes doing holds there, one at 3,000 and one at 3,500, so I went to 4,000. It wasn’t any smoother up there. One turn was enough, and because of the bumps I decided to skip my usual non-precision approach to Le Roy or Canandaigua and go straight into the ILSes.

The first ILS went ok, except at about 300 feet above decision height there was a tremendous bump and suddenly the localizer needle went several dots off. I wasn’t at full deflection before DH, but it was bloody close. Getting vectored around for the second one, I was starting to feel airsick, so I told approach that I was going to make this one a full stop. Once again, I was right in the donuts until about 300 feet above DH, and it suddenly started going all wrong. At about 100 feet above DH I took off the foggles and landed uneventfully.

Now to figure out what to do with the Lance and its mystery oil leak.

If I ran Kodak…

(Disclaimer: I’m working at Kodak, but not with anything to do with Picture Kiosks. I’m not privy to any discussion of new technology or upcoming enhancements to the Kiosks.)

If I ran Kodak, I’d connect all those Picture Kiosks up to the internet with cheap DSL. Then, after you’d uploaded your pictures to OFoto (sorry – “Easy Share Gallery”, I think), you could say “Print this picture to the nearest Kiosk”, and it would tell you where the Kiosk was (and give the option to choose a different Kiosk if that one wasn’t good for you) and give you a PIN. You’d go to that Kiosk and enter your PIN, and out would come the pictures you’d sent to it. Much handier than having them mailed to you, or having to go to certain participating stores.

Is it time for a new server yet?

I’ve had my Linux server for several years now. I don’t remember exactly when I got it, but according to the time stamps on the picture gallery from when I did it, it’s been about 2.5 years since I improved the cooling with monster copper heat sinks (1 pound on each CPU). In my experience, a heavily used server like this isn’t good for more than about 3 years before it starts getting flakey. But so far it’s been solid as a rock. Since improving the cooling (and stopping running SETI@Home, unfortunately), I can’t think of a single time when it froze up or rebooted spontaneously.

So I’m thinking that although I don’t need to rush out and buy a replacement, I should at least start thinking of what to replace it with. And here’s the problem – computers have become way more powerful and fast since then. This computer was pretty fast for its day, and I could easily get CPUs that run 3 or more times the speed, but so what? There is nothing I do on this computer that stresses the CPUs in any meaningful way. Normally my load average is down around 0.01. So what would I want out of a replacement? “Bottom of the rung” CPUs, but really fast networking and disk? Something small and quiet like a Shuttle? Something maybe not top flight speed-wise, but really well built by a company that knows how to make reliable hardware like a Sun or IBM or an Apple G5?

One thing I really like about this computer, though, is the fact that it’s dual CPU. It seems to me that if one process runs away the other CPU keeps it pretty responsive until the process finishes or I figure out what’s wrong. For instance, yesterday I noticed the system getting pretty slow. “uptime” showed the load average up over 15, and “top” showed a process owned by the apache user called “oops” taking a bunch of time. One quick “/etc/init.d/httpd restart” later, and things were back to normal.
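If I wanted to catch that sort of runaway before I noticed the slowness myself, something along these lines would do. It’s a rough sketch only: the function name and the threshold are my own invention, and the default of 2 CPUs just matches this box. It compares the one-minute load average (the first number “uptime” prints) against the number of CPUs.

```python
import os


def load_is_high(threshold=1.0, cpu_count=2, loadavg=None):
    """Return True when the 1-minute load per CPU exceeds `threshold`.

    `loadavg` is a (1min, 5min, 15min) tuple; when omitted it is read
    from the running system with os.getloadavg(), which is only
    available on Unix-like systems.
    """
    if loadavg is None:
        loadavg = os.getloadavg()
    one_minute = loadavg[0]
    return one_minute / cpu_count > threshold


if __name__ == "__main__":
    if load_is_high():
        print("load is high, time to go look at top")
```

With a load of 15 on two CPUs this returns True, and at my normal 0.01 it returns False. Run from cron, it could send mail rather than print.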

If I were to replace or improve this computer, I can only think of a few things I’d like to do:

  1. More RAM. 1GB seemed like plenty when I got it, but since upgrading to Fedora Core 4 from Fedora Core 3, SpamAssassin seems to take WAY more memory, so when I start up X and start doing stuff on the console (not very often) it actually starts dipping into swap.
  2. SATA. Right now IDE drives are wonderful and cheap, but it looks like the future is SATA.
  3. RAID. If I’m going SATA, I’d like to get a real RAID. I don’t know why, but it seems that most of the talk in the SATA world about RAID is RAID 0+1 (striping + mirroring), but I was really impressed the first time I saw a RAID 5 setup and the owner of it just yanked a drive out of the array and slapped another one in, and the application didn’t even hiccup while the RAID controller went about its business rebuilding the new drive.
  4. LVM. I like the fact that LVM can do a “transaction snapshot” almost like a database transaction, so you can back up a consistent view of the system instead of trying to copy an image of a system that’s changing while you copy it. I haven’t read if this is possible, but it seems to me that you’d be able to stop all the services that are most likely to have problems with consistency (postgres, mysql and innd for instance), start your backup snapshot, and then start those processes again, so the services would only be down for a few seconds rather than however many hours your backup took.
  5. Dual processors. Like I said, I consider that one of the best features of this current machine. Any replacement would also have to have them.
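The LVM idea in item 4 can be sketched as an ordered list of commands. This is an untested outline that only builds the command lines and doesn’t run anything. The init script names, volume path, snapshot size, mount point and archive path are all assumptions made up for illustration. The `lvcreate`, `mount`, `tar`, `umount` and `lvremove` options are the standard ones.

```python
def snapshot_backup_plan(services, volume, archive,
                         snap_name="backup_snap", snap_size="1G",
                         mount_point="/mnt/backup_snap"):
    """Return the commands (as argv lists) for a snapshot-based backup.

    services: init script names under /etc/init.d (assumed names).
    volume:   logical volume device, e.g. /dev/vg0/root (hypothetical).
    archive:  path of the tar file to write (hypothetical).
    """
    # The snapshot device appears next to the origin volume in the VG.
    snap_dev = volume.rsplit("/", 1)[0] + "/" + snap_name
    plan = []
    # 1. Stop the services that care about on-disk consistency.
    for name in services:
        plan.append(["/etc/init.d/" + name, "stop"])
    # 2. Take the snapshot; this is the only step done while they are down.
    plan.append(["lvcreate", "--snapshot", "--size", snap_size,
                 "--name", snap_name, volume])
    # 3. Start the services again, in reverse order.
    for name in reversed(services):
        plan.append(["/etc/init.d/" + name, "start"])
    # 4. Back up from the frozen view at leisure, then clean up.
    plan.append(["mount", "-o", "ro", snap_dev, mount_point])
    plan.append(["tar", "-czf", archive, "-C", mount_point, "."])
    plan.append(["umount", mount_point])
    plan.append(["lvremove", "-f", snap_dev])
    return plan


if __name__ == "__main__":
    for cmd in snapshot_backup_plan(["postgresql", "mysqld", "innd"],
                                    "/dev/vg0/root", "/backup/root.tgz"):
        print(" ".join(cmd))
```

The point of the ordering is that the services are only down for steps 1 to 3, and the slow part (the tar) runs against the snapshot afterwards. The snapshot size has to be big enough to hold whatever changes on the original volume while the backup runs.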