Optimize this?

The main loop query of my waypoint generator app is kind of hairy. And trying to do an “explain” on a typical query shows why it’s so slow.

explain SELECT a.id, c.pdb_id, internalid, a.type, name,
address, state, country, latitude, longitude,
declination, main_frequency, elevation,
b.category, chart_map, tpa, ispublic
FROM waypoint a, type_categories b,
id_mapping c
WHERE a.type = b.type AND
a.id = c.id AND
country in (‘US’, ‘CA’) AND
(a.type in (‘AIRPORT’, ‘VOR’, ‘NDB’) or
(category = 3 and (chart_map & 7) != 0)) AND
deletedon is null;
QUERY PLAN
——————————————————————————————————————————————————————–
Hash Join (cost=4442.82..19955.46 rows=34752 width=111)
Hash Cond: ((a.id)::text = (c.id)::text)
-> Hash Join (cost=4.83..12938.32 rows=34752 width=107)
Hash Cond: ((a.”type”)::text = (b.”type”)::text)
Join Filter: (((a.”type”)::text = ANY ((‘{AIRPORT,VOR,NDB}’::character varying[])::text[])) OR ((b.category = 3) AND (((a.chart_map)::integer & 7) <> 0)))
-> Seq Scan on waypoint a (cost=0.00..10759.48 rows=72467 width=103)
Filter: ((country = ANY (‘{US,CA}’::bpchar[])) AND (deletedon IS NULL))
-> Hash (cost=4.37..4.37 rows=37 width=15)
-> Seq Scan on type_categories b (cost=0.00..4.37 rows=37 width=15)
-> Hash (cost=2091.77..2091.77 rows=127777 width=12)
-> Seq Scan on id_mapping c (cost=0.00..2091.77 rows=127777 width=12)
(11 rows)

Adding indexes on country and type doesn’t help. There is still that nasty looking “Seq Scan on waypoint a” line. And also, another “Seq Scan on id_mapping c”, which I don’t understand at all because the joining column, c.id, is a primary key, so shouldn’t there be an index involved?

I’ve got a few ideas on how to use the spatial capability of PostGIS to improve that query, so I’m going to have to run a few tests. The first few ideas I’ve had aren’t showing major improvements in “explain”. It looks like the whole “type in … or ((chart_map & NN) != 0)” is going to force a sequential scan on waypoint no matter what I do. Hmmm.

Back to the drawing board

In order to support a new product development I mentioned in an earlier blog post, I re-did my existing waypoint database as a PostGIS geographical database. I also added some foreign keys and some other cleanup that I’ve been meaning to do for a while. But obviously, I don’t want to support two databases, so today I’ve been converting one of my existing waypoint generator perl scripts to use the new PostGIS database instead of the MySQL database it was on before, but without any actual GIS functionality. And Houston? We have a problem. Doing a full US + Canada data load on the MySQL version takes about a minute and a half. Doing the same load on the PostGIS version takes twenty five minutes. Something tells me that I need to make some adjustments here.