Hey, Google

If you’re going to send a guy a survey to find out his impressions of your recruiting/interviewing methodology, you might not want to send it to somebody who has been waiting two months for you to pay his travel expense claim. Because he might just think that you’re a bunch of disorganized fuckwads. Just sayin’.

And a bit of further advice – you might want to make sure your expense spreadsheet actually prints out correctly using Google Docs and doesn’t require one to steal a copy of Microsoft Office in order to use it.

Darn it!

Some years ago I got a set of Shape Files (much as I hated ESRI at the time, they did beat us (GeoVision) fair and square with an inferior product and superior marketing) for the provincial and territorial boundaries of Canada and wrote a short little C program to do a “point in polygon” to determine what province a particular point is in. It’s set up so that it parses the shapes, then sits there waiting for lat/long pairs to come over a pipe, writing the province code back over the pipe for each one. I write a special lat/long pair (-999/-999) when I want it to exit. I use it all the time when I’m loading waypoints. The program has continued to work while my waypoint generator moved from being hosted at home to a webhost (Gradwell.com) to a virtual private server (Linode) to a colocation box. Unfortunately, I just discovered that somewhere along the way I lost the source code. Right now it has a small bug in that when the program that opened the pipe to it dies, it starts consuming all the CPU instead of shutting down. I can live with that, but it’s annoying while I’m doing all this testing with buggy load scripts. That’s why I went looking for the source code.

Maybe I should write a new one just in case?
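
If I do, the core of it would look something like the rough sketch below — not the original program. It assumes the shapefile polygons have already been parsed into memory (the loading code isn’t shown, and the Province structure and in_polygon helper are just names I made up for the sketch), and it fixes the annoying bug by exiting on EOF as well as on the -999/-999 sentinel.

```c
/* Rough sketch of a replacement, not the lost original.
 * Assumes the shapefile polygons have already been parsed into the
 * (hypothetical) structures below.  Reads "lat long" pairs on stdin,
 * writes a province code per line on stdout, and exits on EOF or on
 * the -999/-999 sentinel -- the EOF check is what the old binary is
 * missing, hence the CPU-spinning bug when the caller dies.
 */
#include <stdio.h>

typedef struct {
    const char *code;   /* e.g. "ON", "QC" */
    int n;              /* number of vertices */
    const double *lat;  /* vertex latitudes */
    const double *lon;  /* vertex longitudes */
} Province;

/* Filled in from the shapefiles at startup (not shown here). */
static Province provinces[64];
static int nprovinces;

/* Classic ray-casting point-in-polygon test. */
static int in_polygon(const Province *p, double lat, double lon)
{
    int i, j, inside = 0;
    for (i = 0, j = p->n - 1; i < p->n; j = i++) {
        if (((p->lat[i] > lat) != (p->lat[j] > lat)) &&
            (lon < (p->lon[j] - p->lon[i]) * (lat - p->lat[i]) /
                   (p->lat[j] - p->lat[i]) + p->lon[i]))
            inside = !inside;
    }
    return inside;
}

int main(void)
{
    double lat, lon;

    /* scanf returns EOF when the other end of the pipe goes away,
     * so a dead parent no longer leaves us spinning. */
    while (scanf("%lf %lf", &lat, &lon) == 2) {
        const char *code = "??";
        int i;

        if (lat == -999 && lon == -999)   /* explicit shutdown request */
            break;
        for (i = 0; i < nprovinces; i++) {
            if (in_polygon(&provinces[i], lat, lon)) {
                code = provinces[i].code;
                break;
            }
        }
        printf("%s\n", code);
        fflush(stdout);                   /* don't leave the caller waiting */
    }
    return 0;
}
```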

Hoist by my own petard

Yesterday I thought I’d try running my loading scripts on my home machine because it’s so much faster than my colo box. And it was running faster, and had gotten further than the still-running script on my colo box, when I got hoist by my own petard. I like to partition my systems up with different sized partitions for different major directories, and unfortunately, on this home machine, I only gave 100MB for /tmp, not realizing that MySQL uses a temporary directory when you’re doing a long transaction. According to Munin, the script filled up /tmp and died just a few minutes before I woke up this morning. So I’ve gone into /etc/mysql/my.cnf, changed the temporary directory to /var/tmp, and restarted it. /var/tmp has 7.6GB free, so it should be safe.
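
For anyone who hits the same thing, the change is just the stock tmpdir option in the [mysqld] section of my.cnf (the path is whatever big filesystem you have handy — /var/tmp in my case), followed by a restart:

```
[mysqld]
tmpdir = /var/tmp
```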

Sigh

I made some major changes to the way data gets loaded into my navaid.com waypoint generator database, mostly in the processing of the “combined user data”. The main goal was to make sure that if “Bob” provides me some data on Canadian airfields that includes communications frequencies but no runway data, it doesn’t wipe out the runway data from the dataset of Ontario airfields that “Alice” provided me last year, but only updates the data that has changed in the overlapping part of those two datasets. Add in the possibility that a waypoint might have changed identifier or been resurveyed so the location has changed a bit, and you can see that there are a lot of possibilities to consider.
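
The core rule is easy to state even if the bookkeeping isn’t: when an incoming waypoint matches an existing one, only overwrite the fields the incoming dataset actually supplies. The real scripts are Perl working against MySQL; the little C sketch below, with made-up field names and values, is only meant to illustrate that per-record rule (the matching across changed identifiers and resurveyed positions is the hard part, and isn’t shown).

```c
/* Illustrative only -- the real loader is a set of Perl scripts against
 * MySQL.  Field names and values are invented; the point is the merge
 * rule: a field is only overwritten when the incoming record supplies
 * it, so Bob's frequencies never blank out Alice's runway data.
 */
#include <stdio.h>
#include <string.h>

typedef struct {
    char id[8];          /* waypoint identifier */
    double lat, lon;
    char runway[32];     /* "" means "not supplied" */
    char comm_freq[32];  /* "" means "not supplied" */
} Waypoint;

/* Copy a string field only if the incoming record provides it. */
static void merge_field(char *dst, const char *src, size_t n)
{
    if (src[0] != '\0') {
        strncpy(dst, src, n - 1);
        dst[n - 1] = '\0';
    }
}

static void merge_waypoint(Waypoint *existing, const Waypoint *incoming)
{
    /* One possible policy: the newer survey position always wins... */
    existing->lat = incoming->lat;
    existing->lon = incoming->lon;
    /* ...but descriptive fields only change when actually supplied. */
    merge_field(existing->runway, incoming->runway, sizeof existing->runway);
    merge_field(existing->comm_freq, incoming->comm_freq, sizeof existing->comm_freq);
}

int main(void)
{
    Waypoint alice = { "CXXX", 43.46, -80.38, "08/26 7000x150", "" };
    Waypoint bob   = { "CXXX", 43.46, -80.38, "", "122.175" };

    merge_waypoint(&alice, &bob);
    printf("%s runway=[%s] freq=[%s]\n", alice.id, alice.runway, alice.comm_freq);
    return 0;
}
```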

Unfortunately, considering all these possibilities is time consuming. I’ve been testing these new scripts with a dataset from one person that covers the entire UK and some nearby locations in varying levels of detail, and another that covers Ireland in great detail, but which is unfortunately no longer being updated because the person who provided it moved. Running both datasets would be an overnight job. But now that I’m satisfied with the results of that, I decided it was time to reload the old DAFIF data through these scripts to get the combined user data exactly the way I want it. But this has caught a couple of bugs in the scripts, one of which only manifested itself after 36 hours or so of running. That one didn’t even give me enough information as to why it failed, so I had to add some “use Carp” and “use Data::Dumper” magic to my scripts and re-run it, finding the actual cause after another 36-hour run. I’ve been almost continually running load scripts all week. I’m hoping this run will be it, but I’m not sure.

Since my new home box is so fast, I’m thinking one possibility might be to do the load processing on it, and then just mysqldump it and bring the dump file up to the colo.
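
That part, at least, is a one-liner sort of job — something along these lines, assuming the database is called “navaid” (it isn’t necessarily) and glossing over credentials and hostnames:

```
mysqldump navaid | gzip > navaid.sql.gz
scp navaid.sql.gz colo:
ssh colo 'zcat navaid.sql.gz | mysql navaid'
```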