Ever told yourself “oh, I don’t have to sanitize the input because I’m the only person using it” only to have it bite you in the ass?
When I’m loading waypoint data into my database, I calculate the magnetic declination of each point using a program that I got years ago and hacked the hell out of. I call it magvar, because declination is sometimes (incorrectly) called “magnetic variation” and I didn’t know any better when I did it. The program as it was written prompted for input (latitude, longitude, elevation and date) one number at a time, validated it, parsed the World Magnetic Model (WMM) data file, and told you the declination and a bunch of other stuff about the location. Well, I needed it to be faster than that, so I had it pre-parse the WMM file, then it sat there in a loop where it did a "sscanf" of the four input numbers (C programmers are now shuddering in horror), and printed the output, and then my perl script did an "open2" to open a pipe to write the four numbers on, and another pipe to read the result. And that’s worked pretty well up until today.
Today I was loading a new datasource, and I noticed that about 75% of the way through, it was hanging. And it appeared to be hanging in the write to the magvar program’s input pipe. I tried commenting out the call, and it ran fine, but of course it didn’t have any declinations. So I put the call back in and ran it again, and then attached to the magvar executable with gdb (some old atrophied skills suddenly got refreshed in memory). That’s where I discovered that magvar itself seemed to be stuck in a write. Going up a few levels into the code that I’d touched and dumping the local variables, the input latitude and longitude seemed to be those of a waypoint that was one of the first ones input. That’s when I had another look at the data I was feeding the program. And that’s where I discovered that instead of writing 4 doubles that scanf could happily read using "%lf %lf %lf %lf", I was sending it something else: I hadn’t noticed that on some of the waypoints in this new datasource, the elevation was given as "apprx 123". I didn’t bother to look in detail at what happened at this point, but I assume my unchecked input caused the magvar program to go into an infinite loop, spewing out the same declination value over and over onto the perl program’s input pipe until the pipe filled up.
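For the C programmers who are still shuddering: here is a minimal sketch of the kind of check that would catch this, not the actual magvar code (which isn't shown here). It assumes each request arrives as one line of text, and the names `waypoint_input` and `parse_waypoint_line` are made up for illustration. The idea is simply to look at the return value of "sscanf", which is the number of fields it actually converted. A plausible reason for the endless loop, assuming the original reads straight from standard input with "scanf" and ignores the result, is that the unparseable "apprx" is never consumed, so every later call fails at the same spot and the old values get reused.

```c
#include <stdio.h>

/* Hypothetical record type for illustration; not taken from magvar. */
struct waypoint_input {
    double lat, lon, elev, date;
};

/*
 * Parse one line of the form "lat lon elev date".
 * Returns 1 if the line holds exactly four numbers, 0 otherwise.
 * On failure, *out may be partially filled in and should be ignored.
 */
int parse_waypoint_line(const char *line, struct waypoint_input *out)
{
    char trailing;

    /* sscanf returns how many conversions succeeded (or EOF on empty input).
     * The extra " %c" only matches if non-whitespace junk follows the fourth
     * number, in which case the count comes back as 5 and the line is
     * rejected as well. */
    int matched = sscanf(line, "%lf %lf %lf %lf %c",
                         &out->lat, &out->lon, &out->elev, &out->date,
                         &trailing);

    return matched == 4;
}
```

With this, a line like "44.5 -77.3 apprx 123 2010.5" is rejected because only two fields convert before "sscanf" hits "apprx", and the caller can print an error and move on to the next line instead of looping on stale values.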
And I haven’t learned my lesson – I have no plans to fix magvar to validate its input. I’m just going to make sure this particular data loader program does a
$elev =~ s/\s*apprx\s*//;
before calling it.