I made some changes to my FAA data loader script and ran it. Four days later it had finished running, I discovered a few bugs, and was getting ready to run it again, but I decided to see if I could improve the speed any. I’d already run the perl profiler and discovered that 95% of the time was spent in mysql. So I tried doing an “EXPLAIN” on all the queries. That’s when I discovered that one very common query was doing the dreaded “ALL” query on a 12,000 row table as step 1. Hmmm. That table isn’t even an important one, it was just table joined to the main “waypoint” table to get one field that was semi-useful. The query has a “waypoint.datasource_key = ?” in it, why isn’t it doing that first? “DESCRIBE TABLE waypoint;” showed me the error of my ways – I’d forgotten to put an index on “datasource_key”.
So I created the index and started the script before going to bed. I was astonished to discover that the script was done this morning. According to my Munin graphs, it had only taken about 6 hours to run. 96 hours down to 6 hours. Yeah, that’s a worthwhile optimization.