I wrote about some work I’ve been doing on the Waypoint Generator in Rants and Revelations » Getting there, still some collateral damage. In that post, I said I wanted to do some more testing. Well, I did. I reloaded the entire DAFIF dataset. The test took 4 straight days to run, and that’s not counting the day or so I lost when my router lost its mind. And what this test told me is that the new algorithm for eliminating duplicate points is overzealous.
For instance, it classified two Canadian airports, CYEE Midland/Huronia and CNL8 Wyevale/Boker Field, as being the same. They’re actually nearly two nautical miles apart.
I was calling points the same if the types matched and they were within 0.05 degrees latitude and 0.05 degrees longitude of each other. Unfortunately, at 60 nautical miles per degree of latitude, that works out to about 3 nautical miles in the north/south direction, which this test has shown is too wide a net.
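To put numbers on that tolerance, here is a rough sketch in Python. It assumes the usual rule of thumb of 60 nautical miles per degree of latitude, with the east/west distance shrinking by the cosine of the latitude. The function name and the 45 degree latitude (a round figure for that part of Ontario) are illustrative assumptions, not anything taken from the Waypoint Generator.

```python
import math

# Rule of thumb: one minute of latitude is roughly one nautical mile.
NM_PER_DEGREE_LAT = 60.0


def tolerance_in_nm(delta_deg, latitude_deg):
    """Return the (north/south, east/west) size in nautical miles of a
    tolerance of delta_deg degrees at the given latitude.

    Hypothetical helper for illustration only.
    """
    north_south = delta_deg * NM_PER_DEGREE_LAT
    # Meridians converge toward the poles, so a degree of longitude
    # covers less ground the further you are from the equator.
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


if __name__ == "__main__":
    for delta in (0.05, 0.01):
        ns, ew = tolerance_in_nm(delta, 45.0)
        print(f"{delta} deg: {ns:.2f} nm north/south, {ew:.2f} nm east/west")
```

Under those assumptions, the 0.05 degree net is comfortably wider than the gap of nearly two nautical miles between CYEE and CNL8, which is consistent with the two being lumped together.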
The problem is that I want to spot duplicates when a waypoint changes id, AND when they update the coordinates. I’ve seen places where they’ve updated the coordinates by half a degree, especially in the case of user-entered data.
I think what I’m going to have to do is trust that the coordinates aren’t going to change a whole bunch at the same time the id changes. So what I’ll do is call something a duplicate if it’s within 0.05 degrees when the ids match, but only if it’s within 0.01 degrees when the ids don’t match. That’s about 0.6 nautical miles north/south, and it would be pretty odd to find two airports within a nautical mile of each other. (A lot less odd to find heliports or reporting points, unfortunately.)
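The proposed rule can be sketched in a few lines of Python. This is only a minimal illustration, not the Waypoint Generator's actual code: the `Waypoint` class, its field names, and the function name are all hypothetical, and it assumes the type-match requirement from the earlier rule still applies and that "within" means less than or equal to the tolerance on both latitude and longitude.

```python
from dataclasses import dataclass

# Tolerances in degrees, taken from the rule described above.
SAME_ID_TOLERANCE = 0.05
DIFFERENT_ID_TOLERANCE = 0.01


@dataclass
class Waypoint:
    """Hypothetical record; field names are assumptions for illustration."""
    ident: str
    kind: str   # e.g. "airport", "heliport", "reporting point"
    lat: float  # degrees
    lon: float  # degrees


def is_duplicate(a, b):
    """Return True if a and b look like the same point under the two-tier rule."""
    # Assumption: points of different types are never duplicates.
    if a.kind != b.kind:
        return False
    # Trust matching ids more: allow a wider coordinate drift for them.
    if a.ident == b.ident:
        tolerance = SAME_ID_TOLERANCE
    else:
        tolerance = DIFFERENT_ID_TOLERANCE
    return (abs(a.lat - b.lat) <= tolerance
            and abs(a.lon - b.lon) <= tolerance)


if __name__ == "__main__":
    # Made-up coordinates, not the real positions of any airport.
    first = Waypoint("AAA", "airport", 44.68, -79.93)
    second = Waypoint("BBB", "airport", 44.71, -79.92)
    print(is_duplicate(first, second))
```

With the made-up coordinates in the example, the two points are 0.03 degrees apart in latitude and have different ids, so the tighter 0.01 degree tolerance applies and they are kept as separate points. The same pair would have been merged under the old flat 0.05 degree rule.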
Damn, this means another multi-day test run, unfortunately.