What a day so far

First off, our power went out. A quick survey of the neighborhood showed it was out all up and down the street, and a call to RG&E revealed that it was a tree down over a power line.

Then while I was taking a break in the back yard, unable to work because of the lack of power (although in retrospect I probably should have mowed the grass since I’m going to have to work tomorrow to make up lost time), I got an email from a member of the flying club asking why my email address (and a non-functioning email address at that) was listed as a technical contact for their domain, and whether I could help them transfer the domain over to their control. What googling I could do on my phone showed that the current registrar are notorious domain hijackers. Uh oh.

Once the power came on, the main router was flashing a green “power light” and not connecting. Again, what limited web searching I could do on a tiny smartphone screen showed that this means the firmware is corrupt, and it can happen if the router loses power (which seems like a pretty shitty failure mode – you’d think the firmware could only be corrupted if the router were in the process of updating it, otherwise it’s not exactly what you’d call “firm”, now is it?). The solution is to download the latest firmware and reflash the ROMs, which is difficult if you don’t have an internet connection. Fortunately I have two of these routers, one at the other end of the house to act as a wireless repeater. So I grabbed that one and did a factory reset, and then reconfigured it as best I could. That was a bit of a hassle because at some time in the past I changed the name of our wifi from either Robinson_Tomblin to Tomblin_Robinson or vice versa, and I couldn’t recall which, and so when I got it wrong the iPad and iPhone happily connected to it, but the printer, the TiVos and the Nexus 7 wouldn’t.

With network connections re-established (sort of – every router configuration change seemed to involve losing it again for up to a minute or so), it was time to download the new firmware, enable tftp on the Windows laptop, and flash it. Amazingly enough, it actually worked. Then I reconfigured that router, and everything was back in business.

Except now my security camera isn’t working. Down to the basement to unplug the PoE cable, plug it back in, and it’s working.

Now it’s time to look into the flying club business. Thank goodness for searchable mail archives – the club asked me to transfer the domain to them in February 2011, and I did. And they were using that infamous domain thief as their registrar. And at the time I pointed out that they’d need to reset all the various contact email addresses. I also gave them a list of email forwards I had set up for their domain, and they decided to turn them all off. So phew, it’s not my problem and not my fault and if they can’t remember how to log into their registrar account and change the email address, too bad for them. I feel sorry for them, and I don’t wish them ill will, but the relief of it not being something I have to help fix is overpowering all that.

Why I want to punch Microsoft in the face

Ok, not the company, just anybody who was ever involved in their web browsers.

I’m writing a web application. I’m trying to make it modern with good UX (User Experience). Sometimes my boss’s decisions go against that desire, but I do what I can. Real world requirements aren’t always as straightforward as the stuff you read in “Design For Hackers”.

So this week, I did a new part of the app. It was finally working the way I wanted it to on real browsers, so then I turned to IE testing. It didn’t work right on anything older than IE 10. After two days of screwing around, I had a workaround that worked ok on IE 8 and 9 – it didn’t look too much worse than it does on real browsers, just different. That’s good, because the boss says that IE 8, because it comes on Windows 7, is the corporate standard and I don’t have to support IE 6 or IE 7. So I uploaded my test code to their server and clicked on the link, and it looked like a dog’s breakfast. Turns out that Microsoft, in their infinite wisdom, have decided that when something is on your intranet, it should run in “compatibility mode”, which basically means it acts like IE 7.

IE is supposed to recognize a header, “X-UA-Compatible”, which is there so the web developer can tell the browser which version of IE it’s written for, but because Microsoft are a bunch of idiots, they decided that the “use compatibility mode on the intranet” setting should override this. I can’t think of a single reason for this, other than sheer idiocy.
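For what it’s worth, sending the header is the easy part. Here is a minimal sketch in Python, assuming the app is served through WSGI; the names add_ua_compatible and hello_app are made up for illustration and are not the actual application code. The value IE=edge asks IE to use its newest rendering mode, which is exactly the request the intranet setting ignores.

```python
def add_ua_compatible(app, value="IE=edge"):
    """Wrap a WSGI app so every response carries an X-UA-Compatible header.

    Hypothetical helper for illustration only.
    """
    def wrapped(environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            # Drop any existing copy of the header, then add ours.
            headers = [(k, v) for (k, v) in headers
                       if k.lower() != "x-ua-compatible"]
            headers.append(("X-UA-Compatible", value))
            return start_response(status, headers, exc_info)
        return app(environ, patched_start_response)
    return wrapped


def hello_app(environ, start_response):
    # Stand-in for the real application.
    start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
    return [b"<!DOCTYPE html><title>test</title><p>hello</p>"]


# Any WSGI server can serve this object.
application = add_ua_compatible(hello_app)
```

The same header can also be set in the web server’s own configuration instead of in application code; either way, IE on the intranet still gets the last word.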

On StackOverflow, a user offered up a “simple” workaround – all you need to do is get every web server on the corporate intranet except yours to change to serve up an “X-UA-Compatible” header that specifies compatibility mode, and then get the sysadmins to change the default setting on the Active Directory servers (and probably Citrix as well) to make sure people’s logins allow the setting from the web server to take precedence over their login settings. That of course presupposes that you can even find every web server on the corporate intranet. And find their owners. And get those owners to sign anything without 12 years of running around making business cases and getting manager approvals. And then get the web servers actually configured that way.

I think it would just be faster to wait for every computer in the company to be replaced by one running a better OS. Or the heat death of the universe.

So off I go to try to find a work-around that works on IE 7 as well.

Upgrades are never easy

Debian stable just updated. Usually when Debian drops a new “stable”, it means it’s bombproof as hell and tested out the wazoo. This time, I’m not so sure that is true.

First candidate is a VirtualBox VM that I use to keep some client data on an encrypted partition, which is safer than just leaving it on my desktop machine.

First attempt threw some errors about problems with “default-jre” and “openjdk-6-jre”, but I don’t use java on this virtualbox so I just removed them.

Second attempt gave a huge problem because of some conflict between CPAN installed Perl modules in /usr/local/share/perl/5.10… and the new 5.14 modules. It seems to me that the installer should just remove /usr/local from the Perl paths and ignore any locally installed stuff.

I tried removing that directory manually, but by that time the install was so screwed up that I actually went back to a clone I’d made of the virtualbox and tried again. This time I removed the JRE stuff and moved /usr/local/share/perl out of the way. The upgrade went much more smoothly, except the screen goes totally blank for a long time during the upgrade, and when it’s done the reboot prompt is showing empty boxes instead of letters. Fortunately I guessed correctly as to which box was the “ok” button.

After it upgraded, I discovered that Postgres 8 was marked as deprecated, so I did a pg_dumpall, removed it, imported the dump into Postgres 9, and all was well, no problems. Then I had to get RT working again, so I used aptitude to install as many of the packages as I could that formerly had been in /usr/local/share/perl. The only one I couldn’t find a deb for was Plack::Handler::Starlet, so I let CPAN install it.

Once that was up and running to my satisfaction, I figured it was time to move on to my Linode. The Linode hosts my navaid.com databases and a bunch of mailman mailing lists, and not much else. Remembering the Postgres 8 to 9 thing, I made sure to run pg_dumpall before I started. There were no files in any local perl directories, and no JRE, so I was good to go.

As it was updating, I saw it removing the Postgres 8 version of postgis. Uh oh, I thought, that’s not good. I’ve discovered in the past that you can’t simply recreate a postgis database using a pg_dumpall dump. So after the upgrade, I of course tried to install postgis for PostgreSQL 9, and once again panicked as it dragged in a ton of X11 crap I don’t need. Then I tried and failed to do a restore of the dump file. What I ended up doing was

  1. creating the database user for that site
  2. creating the databases for that site
  3. running the scripts that come with postgis for creating the spatial functions
  4. copying the pg dump file, and cutting out anything related to other DBs, and cutting out the drop and creation of these DBs.
  5. running this cut down version of the dump file
  6. making another copy of the dump file that includes all the other DBs, including the drop and create commands, and running it.
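
The cutting in steps 4 and 6 was done by hand, but it could be roughed out in a script. Below is a minimal sketch in Python, assuming the dump switches databases with plain `\connect name` lines; the function name split_dump and the names spatial_db and other_db are hypothetical, and the DROP DATABASE and CREATE DATABASE statements would still need to be edited out separately.

```python
def split_dump(lines, wanted):
    r"""Partition dump lines into (selected, rest) by \connect section.

    Assumes each database's section starts with a line of the form
    '\connect name'. Anything before the first such line (roles and
    other globals) ends up in 'rest'.
    """
    selected, rest = [], []
    current = None  # database named by the most recent \connect line
    for line in lines:
        if line.startswith("\\connect "):
            parts = line.split()
            if len(parts) > 1:
                current = parts[1].strip('"')
        # The \connect line itself travels with the section it opens.
        (selected if current in wanted else rest).append(line)
    return selected, rest


# Tiny made-up dump to show the behaviour.
sample = [
    "CREATE ROLE someuser;\n",
    "\\connect spatial_db\n",
    "CREATE TABLE waypoint (id integer);\n",
    "\\connect other_db\n",
    "CREATE TABLE lists (id integer);\n",
]
selected, rest = split_dump(sample, {"spatial_db"})
```

Here `selected` would be loaded into the databases that already have the postgis functions installed, and `rest` is the starting point for the second, everything-else file.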

Everything seems to be running now.

Sometime I’ve got to go on and upgrade my xen host and guest OSes on my colo box, but I’m really reluctant to do that one because if something goes wrong, I’ll have to drive in and try to fix it while standing in a freezing cold server farm.

Something strange is going on…

There is something strange going on with my colo box. I tried to reboot it last month and it didn’t come up – I had to call my provider and get them to power cycle it. Nothing useful in the logs.

Yesterday I had to install a security update to the xen hypervisor, but I didn’t reboot. This morning, I discovered that the websites running on the xen guest (the domU in xen parlance) were not working. So I tried to log in, or ping, and discovered it wasn’t talking to the network. Fortunately the xen host (aka dom0) was working – I could log into it, then use xm console xen1 to log into the guest. Couldn’t find anything wrong, except that it wasn’t talking to the network. Even “ifdown eth0; ifup eth0” didn’t cure it. So I tried to reboot the guest, but it didn’t seem to come back up. I wondered if the hypervisor update I installed yesterday was the problem, so then I rebooted the whole computer, and it didn’t come back up either.

I drove down to the colo facility, and connected a monitor and keyboard, but nothing showed up. On the front panel, there are a couple of blinking lights. I power cycled. It came up just fine. Logged into the host, xm consoled into the guest, verified that I could ssh out, and from my home computer I could wget a few web pages from it. Issued a reboot command, and it booted just fine. Poked around the BIOS settings to see if there was something about not booting if there wasn’t a keyboard or something stupid like that, but couldn’t find anything. Booted, verified once more, and came home.

Until the next time, I guess.

A month with ownCloud, and I’m out

I really wanted to like ownCloud, the “Dropbox you host yourself” (my description, not theirs). It seemed so promising – I could have as much space as I wanted, it would be more private, etc etc etc. But I’ve had it installed for over a month now, and seen numerous upgrades to both the client and the server, and I’m ready to uninstall.

  • The 64 bit Linux client crashes and dies constantly. I’m actually somewhat surprised on the rare mornings when I wake up and discover it hasn’t crashed overnight. Yes, it crashes when nobody is modifying any of the files on any of the systems that share those files, sometimes multiple times a day when I can be bothered to restart it.
  • All the clients continually put up an error indicator at random times and then clear up at other random times. Again, usually when nothing is happening.
  • While it offers CalDAV sync and I was able to add the calendar to my phone, iPad, and laptop calendars, I wasn’t able to use it to break free of the Google Calendar hegemony because I share calendars with other people and I couldn’t very well ask them to convert as well. ownCloud has a web-based calendar of its own, but it is incredibly basic compared to, say, Google Calendar or iCloud calendar.
  • Similar problem with the contact syncing.

Frankly I would have put up with all the other problems if the 64 bit Linux client weren’t such a flaky piece of shit. But I see no reason to keep going with this if I’m getting no synchronization between my main desktop and my other computers.