Archive for March, 2007

And this morning’s lessons are…

Tuesday, March 13th, 2007
  • If you’re about to start something dangerous (like an aptitude upgrade on your dom0), and you set up an at job to reboot it if it causes the box to lose network connectivity, don’t forget to atrm it if everything is fine afterwards.
  • reboot in an at job doesn’t do the right thing - my box was still responding to pings but I couldn’t get to it at all. Next time try shutdown -r now
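
Both lessons fit in a few lines of shell. This is only a rough sketch: it assumes a Debian-style at that reports "job N at DATE" when it queues a job, and the function names are made up for illustration rather than anything that actually ran on the box.

```shell
# Pull the job number out of at's "job N at DATE" report (read from stdin).
parse_at_job_id() {
    awk '/^job / { print $2 }'
}

# Queue a reboot N minutes from now and print the job number.
# Uses "shutdown -r now" rather than plain "reboot", per the second lesson.
schedule_safety_reboot() {
    echo "shutdown -r now" | at now + "$1" minutes 2>&1 | parse_at_job_id
}

# Remove the queued reboot once the risky work has gone fine.
cancel_safety_reboot() {
    atrm "$1"
}

# Typical use, as root (left commented out so nothing gets scheduled by accident):
#   job=$(schedule_safety_reboot 30)
#   aptitude upgrade
#   cancel_safety_reboot "$job"
```

Keeping the job number in a variable makes the atrm step hard to forget, since the cancel call sits right next to the upgrade.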

OpenID is a go, it appears

Monday, March 12th, 2007

Last night I had an idea. And typically for me, I couldn’t sleep properly as I kept trying to remind myself about the idea. I should have gotten up and tried it, but if it hadn’t worked I would have gotten even less sleep.

Anyway, the problem I was having with the OpenID plugin is that I forgot to make the plugin’s temp directory group writable. Most people seem to be ok with just making all their blog files writable by the web server, but I worry about the number of security holes that seem to pervade PHP applications so I make all the Wordpress files belong to the user “blog” and only the ones that the web server has a legitimate reason to write belong to the group “www-data” and are group-write. When I installed this plugin, I made the tmp directory belong to the group “www-data”, but I forgot to “chmod g+rwx” on it. Duh. Even more “duh” worthy, I see that the plugin page has a FAQ that has that as item 1.
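
For the record, the fix looks roughly like this. It is a sketch run against a throwaway directory, not the real plugin path (which I'm not reproducing here), and the chown step is only shown as a comment because it needs root and the “blog” and “www-data” accounts to exist.

```shell
# Hypothetical layout: a throwaway directory stands in for the plugin's tmp directory.
demo_root=$(mktemp -d)
plugin_tmp="$demo_root/plugin/tmp"
mkdir -p "$plugin_tmp"

# Start from owner-only permissions, i.e. the state before the fix.
chmod 700 "$plugin_tmp"

# On the real box, ownership would be set first (needs root):
#   chown blog:www-data "$plugin_tmp"

# The step that was forgotten: give the group read, write and search access.
chmod g+rwx "$plugin_tmp"

# Show the resulting mode and ownership.
ls -ld "$plugin_tmp"
```

Changing the group alone does nothing unless the group bits are also set, which is exactly the step that got skipped.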

It seems to be working now. Let me know if you can’t comment.

No OpenID, I’m afraid

Monday, March 12th, 2007

I tried to install this OpenID Comments for Wordpress plugin. My luck with WordPress plugins is about 50% success, and most of the ones that fail seem to be ones suggested by Jen. Or at least my biggest other failures have been with various plugins that attempted to embed my Gallery, and Jen seemed to have no trouble with hers. (I got it to the point where it was throwing off “duplicate primary key” errors, but no closer.)

This time, it looks like it would go off to LiveJournal, get successfully authenticated, but when it came back instead of putting the comment on the post it was posted to, it was putting it on the same non-existent blog URL as in the “-1 comments” debacle. The “-1 comments” thing was caused because LJXP was sending off the entry to LiveJournal before the comment number plugin knew what postid it was, but I hacked the code of both plugins and made it work. This time I looked for something obvious, but with no luck.

Maybe somebody can suggest an OpenID Wordpress plugin that works?

Another attempt

Monday, March 12th, 2007

I think this should do it. Famous last words. Last test got the correct comment one *and* the -1 comments thing. This should get rid of the “-1 comments”.

If this works, I’ll look into OpenID commenting (thanks for the suggestion, Jen) and I’ll strongly suggest that everybody who currently has “ptomblin”, “ptomblin_rss” and “rantsnrevels” friended on LiveJournal unfriend them and friend “ptomblin_lj”.

Getting closer

Sunday, March 11th, 2007

Got the comment count thingy working, but it didn’t push to LJ correctly. Try again.

Here goes nothing

Sunday, March 11th, 2007

Thanks to a post in rich text, I’ve discovered a pretty cool little plug-in that should cross post all my posts here over to my fake blog on LiveJournal. Even better, another plugin puts a comment link on the bottom of my fake blog that directs back here so I don’t have to check there and here for comments.

In other blog news, I installed Akismet which supposedly works with SpamKarma2 to further reduce the spam problem. But for some strange reason it seems to be double counting the spam. I wrote down what my SpamKarma “spams killed” count was when I started, and since then the Akismet count has been increasing at almost, but not quite, double the rate the SpamKarma2 count has been. The SpamKarma2 count delta also agrees with the number of spams I see in the review and clean-out pages that both SpamKarma2 and Akismet present. Very weird.

Speaking of blog spam, another thing I’ve noticed recently is trackbacks where some site has copied my entire blog post and surrounded it with advertising. I don’t know what they’re playing at, but every time I see one I mark it as spam and delete the trackback. I notice that other bloggers, David Megginson for instance, don’t delete them. They probably should - I can’t see these things as anything other than an attempt to confuse search engines into directing people to their site instead of yours. It’s hard to draw the line between these trackback spammers and a legitimate “Planet” site that puts your RSS feed on a page with other blogs of a similar interest group - for instance I know my blog appears on “Planet LUGOR” and “Planet Linode”, the first because I’m a member of LUGOR and the second because I’m a former customer of Linode’s virtual private servers. It’s hard to draw the line, but I know the difference when I see it and I do draw the line.

Not a good flight

Sunday, March 11th, 2007

I went flying today. The “mission” was to deliver Paul P out to Batavia to pick up N9105X from its annual. The secondary missions were to see if the Lance, N43977, was still holding a charge from my flight two weeks ago and could start itself, and also to get a bit of practice for my upcoming BFR.

I jumped in, hit the master, the fuel pump and moved the mixture up to prime, and hit the crank as quickly as I could. The Lance turned nearly two blades before the battery died, without catching. We hooked up the pre-heater cart battery same as last time, and as I discovered last time I needed to turn on the battery master to get both the cart battery and the Lance battery involved before it would turn over. It started pretty easily, but it sounded like it wasn’t hitting on all cylinders at first. It idled as I finished my pre-flight and office set-up and Paul P put away the pre-heater cart, by which time it was as smooth as ever.

The ceiling was at 2600 ft. Not a great day for buzzing around. Flew out to Batavia and while we’re on the CTAF I hear somebody calling in giving their aircraft type as “Boeing”. Oh, those Stearman pilots, I think, they love to confuse people with that “Boeing” call in. But then I heard him say “B-17 taxiing across the runway”. B-17??? I couldn’t believe it. But sure enough, out in the distance I could see something huge and green moving around on the airport. After we landed, Paul identified it as “Memphis Belle”.

Because of the difficulty we’d had with starting 977, I did something I’ve never done before, and would never do with a non-pilot - I let Paul out of the plane without shutting it down. Even though Paul is a pilot, I still kept my hand on the mixture in case I saw him walking towards the front of the plane. Everybody makes mistakes, and that one has killed even experienced pilots in the past.

The idea was that I’d fly around doing my stuff, and he’d call on the CTAF or text my cell phone if he had trouble and needed me to come pick him up. But the low ceiling was putting a damper on my fun - I didn’t particularly want to do steep turns 500 feet below a solid cloud deck, nor did I want to do stalls that close to the ground. I flew up north, because the cloud deck looked a bit holey-er up there. I got to the lake shore, and after buzzing up and down the shore sightseeing a bit I found a hole and flew up over the clouds into severe clear at 3,500 feet. I did some steep turns, but for some reason I was starting to feel queasy. I wonder if that’s because I couldn’t see the ground? I hadn’t heard Paul on the CTAF, but in the course of my travels I’d gotten 20nm away from Batavia and had been down low at some points so maybe I hadn’t heard it.

I decided to head back to Batavia to see if I could see 05X on the ramp or in the pattern. I overflew the airport at 2100 feet, barely 500 feet below the solid ceiling, and didn’t see 05X anywhere around. My airsickness was getting pretty bad, so I turned on the autopilot and the altitude hold and opened the vents. The original plan was to wait until I’d heard from Paul P or until 13:00, but I decided to head home even though it was only 12:40.

Almost immediately after checking in with Rochester approach, I heard 05X being told to extend his downwind, so I knew that I’d missed him on the CTAF. Good thing he didn’t need my help. I was given another long vector way around the runway 22 approach corridor, and eventually told to enter a left base for runway 25 and contact the tower. In spite of the fact that I was 15nm out, I was cleared to land. I made a nice greaser of a landing, and managed to get home without throwing up, although I made a bit of a boo-boo by not taxiing over the hold short line before doing my after landing checks. I blame the airsickness.

Because of how crappy I felt, I decided to deal with the faulty battery problems next weekend. And maybe schedule my BFR for some time after I can go out and practice without getting sick.

I’m now home, and of course the clouds have all broken up and instead of a broken layer at 2500 AGL like when I was flying, I’m now looking at one tiny little cloud at 3000 AGL and other than that, “clear and a million”. Sigh.

Open letter to Earthlink

Saturday, March 10th, 2007

Mail from my list server (list.xcski.com, ip 74.202.84.134) to [your customer] is getting bounced with the message:


<[your customer]>: host domain-relay.mspring.net[198.185.2.85] said: 550 Dynamic IPs/open relays blocked. Contact <openrelay@abuse.earthlink.net>. (in reply to MAIL FROM command)

And when I try to email the address that it says to contact, I get a further bounce:

<openrelay@abuse.earthlink.net>: host madm-corleone.atl.sa.earthlink.net[207.69.200.218] said: 550 Unknown local part openrelay in <openrelay@abuse.earthlink.net> (in reply to RCPT TO command)

First of all, I’m not an open relay. Never have been, never will be. And unlike you, when I have an error message say “Contact:”, I make sure the email address actually exists. Can you bozos please fix your open relay check, and fix your bounce messages?

How do you teach HTML to a blogger?

Tuesday, March 6th, 2007

There is a really interesting blog called “Strange Maps” that I syndicate on my RSS aggregator page. Unfortunately, while the content of the page is intelligent and interesting, the author evidently knows fuck all about the web. Recently, all his entries have been formatted using <h1> tags, then with <span style=…> tags to set the fonts back to something more reasonable. Since my aggregator strips formatting tags like <span> so it can impose its own format, this leads to some ugly results on my page. Another time, he made his entire post in red. When I try to explain what he’s doing wrong, he doesn’t seem to understand. I suspect he must be composing his posts in Word or something worse (if there is anything worse than Word) and pasting them into Wordpress.
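
To show what goes wrong, here is a rough sketch. The sed one-liner is only a stand-in for my aggregator's clean-up pass (not its real code), and the sample markup is invented, with the style value left elided just as above.

```shell
# Sample markup in the style described; the style value stays elided.
post='<h1><span style="…">A paragraph of ordinary body text</span></h1>'

# Stand-in for the aggregator's clean-up pass: drop opening and closing span tags only.
strip_spans() {
    sed -E 's#</?span[^>]*>##g'
}

printf '%s\n' "$post" | strip_spans
# The h1 survives, so the body text now renders as a top-level heading.
```

Once the span that was shrinking the text is gone, all that is left is the h1, which is why every paragraph shows up heading-sized on my page.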

How can I get through to him to stop doing this?

It wasn’t me after all

Tuesday, March 6th, 2007

As I suspected, it turns out my two 30-45 minute periods of being unable to reach my colo box weren’t my fault. It’s even possible some of the stutters I was having this weekend weren’t my fault. I just got an email from the company I rent my rack space from saying that Time Warner, the people who own the datacenter the rack is in, needs to make an “emergency software update” to fix their recent connectivity issues.

I may be off the air for some time overnight. They claim “up to 45 minutes of loss of network connectivity”, but I’ve done enough upgrades to know that means it will either take 45 seconds, or 5 hours.

I needed that like I needed a hole in my head

Monday, March 5th, 2007

As I was reading my email this morning, I noticed that 3 or 4 trackback spams had gotten through SpamKarma2, all from an IP in the UAE. I went to the SpamKarma2 page and found that as well as the 3 or 4 that had gotten through, there were also a few hundred that hadn’t gotten through. I took care of that, and was reading the rest of my email when 3 more got through SpamKarma2. All still from this IP in the UAE. Ok, this calls for bigger guns than SK2. I went to the terminal window that was tailing my logs from the colo box, all ready to “iptables” this IP out of my hair, when suddenly my terminal window stopped responding. So did my other terminal window on the dom0 of the colo box. So did all my web sites. So did my mailing lists.
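
The “bigger guns” would have looked something like the sketch below. The address is a placeholder from the documentation range (203.0.113.0/24), not the real offender, and the helper name is made up. It only prints the rule so it can be eyeballed before being run as root.

```shell
# Placeholder address from the documentation range, not the actual spammer.
offender="203.0.113.45"

# Build the rule as text: insert at the top of the INPUT chain, drop everything from the source.
block_rule() {
    printf 'iptables -I INPUT -s %s -j DROP\n' "$1"
}

# Print the rule for review.
block_rule "$offender"

# To apply it for real, as root:
#   block_rule "$offender" | sh
```

Using -I rather than -A puts the rule ahead of any existing ACCEPT rules, so the drop takes effect immediately.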

I went off to work wondering if this was just a DDOS and it would come back up when they got bored of me, or if the box was truly locked up and would need a power cycle. If it was locked, I was seriously considering throwing in the towel on colo, because obviously I can’t keep the sort of uptime I demand. Even Linode was better than this, and they were getting hit by DDOSes all the time. The only thing I didn’t like about the Linode was the piss-poor amount of memory I got - 128MB versus the 1000MB I have on my domU.

On my way to work, I got an email from Vicki saying my blog was back up, and at the next traffic light I was able to verify that some of my other web sites were still running. Looks like I weathered the storm.

Twofer update

Monday, March 5th, 2007

1. Rebooting my dom0 seems to have fixed the network stuttering problem, as evidenced by the munin graphs not containing gaps.
2. Removing ~/Library/Preferences/Adobe\ Photoshop\ CS2\ Settings/ has cured my Photoshop problems.

Yay, me.

Photoshop is hogging my disk! Help!

Monday, March 5th, 2007

Yesterday I was editing a gigantic Photoshop file (100,000 pixels by 2500 pixels) that I’d put 48 shots from my 8 megapixel camera into, by opening the shots 10-20 at a time, going into each one and doing a select all (splat A) and copy (splat C), closing the file in question, then going into the big file and doing a paste (splat V). Along the way I’d saved the big file a few times. Along the way I’d also done some experimenting with cropping the small jpegs, and the big file, although I’d ended up rolling back all the changes.

This morning I decided I needed to crop some files and overlay them on the big file. First thing I did was flatten the existing 48 layers on the big file. Then I opened up 10 of the jpegs, just as I had before. But when I attempted to crop one of the jpegs, I got a message that I was out of swap space. Actually, I got two popup messages. The first looks like an OS message:


Your startup disk is almost full.
You need to make more space available on your
startup disk by deleting files.
[ ] Do not warn me about this disk again

The second came from Photoshop:


Adobe Photoshop
Could not complete your request
because the scratch disks are full.

At this point, I tried a bunch of things. I exited Photoshop, I rebooted, and I started up Photoshop. I opened only one jpeg. I verified that Photoshop said that the file took up 21 megabytes in memory and there were 30+ gigabytes of disk space free. Then I tried the crop tool. And I got the same popups. Before I dismissed them, sure enough “df” and “Activity Monitor” both verified that all 30+ gigabytes of disk were gone. Other tests with other files have given exactly the same results. Even if I resize the file to half the size (and it says it’s only taking up 6 megabytes in memory) it still consumes all the disk space when I attempt to crop it.

Can anybody tell me what would make Photoshop suddenly change so that cropping a file should cause it to use over 1000 times as much disk space as the size of the original file?
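
A quick back-of-the-envelope check of that “over 1000 times” figure, using the numbers above rounded down (30 gigabytes free, 21 and 6 megabyte images). The variable names are just for illustration.

```shell
# Figures from above, rounded down: 30 GB of free disk, expressed in MB.
free_disk_mb=$(( 30 * 1024 ))
image_mb=21
resized_mb=6

# Integer ratio of disk consumed to image size, for both test files.
ratio=$(( free_disk_mb / image_mb ))
resized_ratio=$(( free_disk_mb / resized_mb ))

echo "full-size image: scratch use is roughly ${ratio}x the image size"
echo "resized image:   scratch use is roughly ${resized_ratio}x the image size"
```

So “over 1000 times” is, if anything, an understatement, and it gets worse for the smaller file, which suggests the scratch usage has nothing to do with the size of the image being cropped.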

That can’t be good

Sunday, March 4th, 2007

My dom0 is only responding to the network some of the time, as evidenced by how stuttery its munin graphs are for the last 36 hours or so, and the fact it took 4 tries before I could ssh into it. Meanwhile, the domU, which relies on the dom0 for network bridging, is going just fine with no evidence of network problems.

Just in case, I’ve set up an “at” job to reboot the dom0 at 9:45. I’ll kill the job if I can figure out what is making it flaky in the mean time, but because it’s an at job it will continue even if I manage to totally pooch the network while I’m working. Cross your fingers and hope it comes up again.

Can’t always be the hero

Friday, March 2nd, 2007

I am mostly responsible for the scripts that upgrade our software from one version to the next. For the most part, it’s pretty straightforward, since we use apt-rpm to take care of upgrading rpms and making sure their dependencies are fulfilled. There is a bit of a hitch in that some of our rpms were made by people who don’t really get rpm, so the rpm just installs a tar file and the %post script unpacks the tar. Trust me, that’s more bizarre than you think, even if you think it’s pretty bizarre. Because of that and because of dependencies, we can’t just do a “dist-upgrade”, but have to upgrade our rpms one at a time.
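
The one-at-a-time approach boils down to a loop like the sketch below. It is not our actual upgrade script: the function name and package names are hypothetical, and the demo call uses echo as a stand-in installer so it only prints the commands it would run.

```shell
# Run the given installer command once per package, in order, stopping at the first failure.
upgrade_one_at_a_time() {
    installer="$1"
    shift
    for pkg in "$@"; do
        # $installer is deliberately unquoted so "cmd arg arg" splits into words.
        if ! $installer "$pkg"; then
            echo "stopped at $pkg" >&2
            return 1
        fi
    done
}

# Dry run: echo stands in for the installer, and the package names are made up.
upgrade_one_at_a_time "echo apt-get -y install" example-base example-server

# For real, as root, it would be something like:
#   upgrade_one_at_a_time "apt-get -y install" example-base example-server
```

Stopping at the first failure matters here because the later packages depend on the earlier ones.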

But the biggest twist on the upgrade is that going from 3.3 to 3.6 of our software involves going from RedHat 7.3 to CentOS 3.4. (Don’t ask what happened to versions 3.4 and 3.5 of our software - it’s too painful.) I am not quite proud of the horrible hack I put together to do that, but it took a bunch of work to get it working so that they can just put a DVD in the drive and type /mnt/cdrom/upgrade now and it mostly works. It uses grub on the hard disk to boot the DVD with a custom kickstart file, formats all the partitions but one, installs the new CentOS, and then uses a custom finish script to reinstall our software and restore the backed up configuration off that one partition that wasn’t formatted.

Actually, as an aside, all of the upgrades only “mostly work”. Partly that’s because we were stupid enough to put Dell computers on the customer sites, so it’s hit or miss whether they’ll even reboot after a year or two’s continuous use. But mostly it’s because rpm has an intermittent bug in the locking code that RedHat was told about at least 3 years ago, and still hasn’t fixed. Which means that sometimes apt-get fires up rpm -U, and rpm just hangs. And of course I get blamed because the upgrade doesn’t entirely work, although I think they’re starting to realize it’s not my fault.

When the upgrades don’t work, I usually get dragged in by the customer support people to log into the customer site to fix it. And usually it’s pretty straightforward - reboot the machine with the locked rpm database, and then manually step through the steps the upgrade script would have taken on that machine. Although the customer support people have recently learned that they can just re-run the whole upgrade script, because the apt-get install portion won’t do anything if it was already done.

Today I got dragged in because of a bigger problem - a customer site that is running version 6.2 of our software had a hardware crash on the main server. They sent the customer a new server, but for some stupid reason the fulfillment house that sent out the new server had version 3.3 of our software installed on it. I guess customer support sent them a 3.3->3.6 upgrade DVD, and tried to upgrade it to 3.6 remotely, and then upgrade from 3.6 to 6.2. But at that point they noticed nothing was working.

I investigated, and discovered that most of the rpms weren’t installed. Also, nothing was backed up properly in the partition that doesn’t get reformatted, so it hadn’t been restored correctly. So I decided to go back to the 3.6 version to see if I could get it working there and then go forward. Fortunately they’d left the 3.3->3.6 upgrade DVD in the drive. I ran the /mnt/cdrom/upgrade script and came back a few hours later. And sure enough, only 3 of the 8 rpms that are normally installed were installed. I tried to manually install them, but the first one failed because it had a dependency on the mozilla rpms, and the mozilla rpm was corrupt. It didn’t matter whether I tried the one in the apt repository, or on the DVD itself. It was hosed.

At this point, I gave up and said that they’d have to ship out another replacement unit with the proper version of our software installed. So much for being the hero.

But as I was leaving work, I had a few ideas on how I could have repaired that mozilla rpm and gotten it working. But I was too late. I figured that even if I went back, everybody would be gone, and besides they’d already ordered the replacement - my reputation as a hero is ruined. Sigh.