Spammers must die, Wiki Spammers doubly so.

A little while ago I set up a Wiki to discuss the imminent loss of the Digital Aeronautical Flight Information File (DAFIF). Yesterday, the spammers found it. For two days now, some asshole has registered a new account and made his “user page” a spammer link farm. Now, I could continue going in and ripping these pages out, but that could get really boring.

I think what I need is to replace the simple Wiki software I’m using (TWiki) with something with a few more features. Specifically, I’d like to make it so that you can’t start editing until you give the software your real email address, the way my mailman mailing lists require you to get and respond to a confirmation mail.

Any suggestions? (And yes, I know I asked for suggestions before and then ignored them all in favour of one that was easier to install. I’ll go back and read your suggestions from that time as well.)

SQLite again

I wrote a few days ago about a problem I was having with SQLite, or rather with DBD::SQLite. Turns out that one should never assume that the version you installed on one machine is the same as the version you installed on the other. After making sure the machine I was testing on was up to DBD::SQLite version 1.11, that part of it worked fine.

I’ve been doing some timing tests on a generator task that generates 26915 waypoints. With SQLite, running one at a time takes about 45-50 seconds and running two at a time takes twice as long (about 1:30-1:40), as opposed to MySQL, which takes 3:40 for one at a time and 4:45 for two at a time. The fact that the SQLite one takes twice as long when there are two running makes me think it’s probably CPU-bound. The fact that it’s way, way faster than the MySQL alternative makes me think this is definitely worth pursuing.

But there’s a wrinkle. According to a post on the SQLite mailing list, one program can’t commit a write while another one is doing a query, even if the write and the query don’t involve the same tables. I guess SQLite’s database-level locking is pretty stupid. But that’s the problem – there are three different types of things going on:

  • Database reloads – these only happen about once a month, only one at a time, and involve reloading FAA and DAFIF data into the waypoint, comm_freqs, runways and other similar tables. The reload scripts can take an hour or more to run.
  • Database generations – these run in the background, and just query those same tables that the reloads are loading. Lots of these can run at once, and lots are run every day. As mentioned above, they only take a few minutes to run.
  • Choosing generation options on the web site. These tasks run after you click through a form page on the navaid.com generator, and they generate the response page. They mostly do some queries, but they also track what options you’ve chosen so that those will be the defaults next time you come back. They do that by updating some tables which are not involved in the database generations.
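
To make the wrinkle concrete, here is a rough sketch of it using Python's built-in sqlite3 module rather than DBD::SQLite. It assumes the default rollback-journal mode, and the user_options table and its contents are made up for illustration; only the waypoint table name comes from the real database. One connection plays a generation that is partway through a query, the other plays a web page trying to save options to a completely different table in the same file.

```python
import os
import sqlite3
import tempfile


def demo_lock_conflict(db_path):
    """Return (commit_was_blocked, rows_saved_after_retry)."""
    # One database file holding two unrelated tables.
    # user_options is a hypothetical stand-in for the option-tracking tables.
    setup = sqlite3.connect(db_path)
    setup.execute("PRAGMA journal_mode=DELETE")  # classic rollback-journal locking
    setup.execute("CREATE TABLE waypoint (id INTEGER PRIMARY KEY, name TEXT)")
    setup.execute("CREATE TABLE user_options (user TEXT, opt TEXT)")
    setup.executemany("INSERT INTO waypoint (name) VALUES (?)",
                      [("WP%d" % i,) for i in range(1000)])
    setup.commit()
    setup.close()

    # isolation_level=None means we issue BEGIN/COMMIT ourselves.
    # timeout=0 makes the writer fail at once instead of waiting for the lock.
    reader = sqlite3.connect(db_path, isolation_level=None)
    writer = sqlite3.connect(db_path, isolation_level=None, timeout=0)

    # "Generation": start a query and leave it half-read, so the
    # reader keeps its shared lock on the whole file.
    cur = reader.execute("SELECT name FROM waypoint")
    cur.fetchone()

    # "Web page": write to a table the reader never touches.
    writer.execute("BEGIN")
    writer.execute("INSERT INTO user_options VALUES ('alice', 'gps=garmin')")
    try:
        writer.execute("COMMIT")
        blocked = False
    except sqlite3.OperationalError:  # "database is locked"
        blocked = True

    # Let the query finish, which releases the shared lock.
    cur.fetchall()
    cur.close()
    if blocked:
        writer.execute("COMMIT")  # the transaction is still open; retry it

    saved = writer.execute("SELECT count(*) FROM user_options").fetchone()[0]
    reader.close()
    writer.close()
    return blocked, saved


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(demo_lock_conflict(os.path.join(tmp, "navdata.db")))
```

With a zero timeout the writer gives up immediately; a real page would more likely set a timeout and sit waiting, which is exactly the delay I'm worried about.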

Obviously a user doesn’t want to be sitting there waiting for their page to load for as long as it takes somebody else’s generation script to run. I’m going to have to try putting the tables that are used for storing these options in a different database to see if that will enable the pages to update in a reasonable time.
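
Here is a sketch of what that split might look like, again in Python's sqlite3 module and not the site's actual code. The file names navdata.db and options.db, the opts alias, and the user_options table are all assumptions for illustration. The idea being tested is that SQLite locks each database file separately, so a write that only touches the attached options file shouldn't have to wait for a query running against the main file.

```python
import os
import sqlite3
import tempfile


def demo_split_databases(directory):
    """Return (waypoints_seen_by_page, option_rows_saved)."""
    main_path = os.path.join(directory, "navdata.db")   # hypothetical file name
    opts_path = os.path.join(directory, "options.db")   # hypothetical file name

    # Main database: the big, mostly read-only navigation tables.
    setup = sqlite3.connect(main_path)
    setup.execute("CREATE TABLE waypoint (id INTEGER PRIMARY KEY, name TEXT)")
    setup.executemany("INSERT INTO waypoint (name) VALUES (?)",
                      [("WP%d" % i,) for i in range(100)])
    setup.commit()
    setup.close()

    # Separate database file: just the per-user option tracking.
    setup = sqlite3.connect(opts_path)
    setup.execute("CREATE TABLE user_options (user TEXT, opt TEXT)")
    setup.commit()
    setup.close()

    # "Generation": a query left in progress against the main file.
    generator = sqlite3.connect(main_path, isolation_level=None)
    cur = generator.execute("SELECT name FROM waypoint")
    cur.fetchone()

    # "Web page": reads the main file, writes only to the attached one.
    page = sqlite3.connect(main_path, isolation_level=None, timeout=0)
    page.execute("ATTACH DATABASE ? AS opts", (opts_path,))
    seen = page.execute("SELECT count(*) FROM waypoint").fetchone()[0]
    page.execute("BEGIN")
    page.execute("INSERT INTO opts.user_options VALUES (?, ?)",
                 ("alice", "gps=garmin"))
    page.execute("COMMIT")  # only options.db needs an exclusive lock here

    saved = page.execute("SELECT count(*) FROM opts.user_options").fetchone()[0]
    cur.close()
    generator.close()
    page.close()
    return seen, saved


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(demo_split_databases(tmp))
```

This says nothing about the monthly reloads, which write to the main file for an hour or more and would still collide with anything reading it.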