Bad news on the Linode front

I’ve started to port my navaid.com applications to run on the Linode. And I’ve got trouble. If I have three or four simultaneous generator processes running, they all will stall out. And I don’t mean that one will keep running while the rest stall, I mean that none of them will make any progress for 10-15 minutes, and then suddenly they’ll all start running again. I’m seeing load averages go up over 10 during the stall, and then come down to between two and three while they’re running, and then go back up and stall.

I think the problem is a lack of RAM. The only explanation I can find for such high load averages and the stalls is that the (virtual) machine is doing a lot of swapping – certainly while the load average is in the stratosphere, “top” is reporting almost no CPU usage. And I can’t really see that paying $5/month for an extra 16MB (bringing me up to 112MB) is going to help a lot. What I really need is the sort of RAM I have on my home server (1024MB) or a way to keep these processes from getting so big.

The individual “postmaster” processes are quite big – I wonder if turning on autocommit might shrink that. I’d turned it off, hoping that having the generator processes happening in a transaction would mean that if I load the data in a separate process while somebody is generating, they won’t get an inconsistent view.

And the CreateCoPilotDB processes are huge – at least part of that is because the Palm::PDB code just puts everything into hash tables in memory until it’s all collected, and then writes it out en masse at the end. There’s a reason for that – near the beginning of the PDB file is an index with the file offset of each individual record – and you can’t tell the offset to the first record until you know how big the index is going to be. But I had a thought about that last night – maybe I can write the actual records off into a temp file, and only store the relative offsets from the start of the first record in memory. Then at the end I can write the header and the index using (sizeof index + stored offset), and then append the temporary record file onto that file. Might be worth a try.
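Here is a rough sketch of that temp-file idea, in Python rather than Perl just to keep it short. The layout is made up for illustration and is not the real PDB format: a 4-byte tag (the hypothetical "HDR0"), a 4-byte record count, and one 4-byte absolute offset per record, followed by the records. The function name write_indexed_file is also my own invention. The point is only that the records get streamed to disk as they arrive and the only thing held in memory is one integer per record.

```python
import io
import shutil
import struct
import tempfile


def write_indexed_file(records, out, header=b"HDR0"):
    """Write header + offset index + records without buffering the records.

    Simplified, hypothetical layout (NOT the real PDB format):
    header tag, 4-byte record count, one 4-byte absolute offset per record.
    """
    offsets = []  # offsets relative to the start of the first record
    with tempfile.TemporaryFile() as tmp:
        for rec in records:
            offsets.append(tmp.tell())
            tmp.write(rec)

        # Only now do we know how big the header and index will be.
        index_size = len(header) + 4 + 4 * len(offsets)

        out.write(header)
        out.write(struct.pack(">I", len(offsets)))
        for rel in offsets:
            # absolute offset = sizeof index + stored relative offset
            out.write(struct.pack(">I", index_size + rel))

        # Append the temporary record file after the index.
        tmp.seek(0)
        shutil.copyfileobj(tmp, out)
```

Because records can be any iterable, a generator that pulls rows from the database one at a time would work, so the full set of records never has to sit in memory at once.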

I have another problem where sometimes the web page progress meter will time out and show a server error instead, but I can just raise the Apache TimeOut parameter from 30 seconds to 120 seconds like I have it at home.

I’m not sure what I’m going to do if I can’t fix the performance issues. Possibly bring the navaid site, or at least the Postgres database part of it, back onto my home server. I don’t like that idea at all.

Being a news administrator would be so much easier…

…if it wasn’t for the damn users.

There is a guy who emails a complaint about anything remotely spammy that has any connection to NCF – either posted through NCF, or mentioning an NCF email address, or whatever. Half the time his complaints are pretty trivial – one of my lusers followed up to an off-topic post without putting it back on topic or something. But the other half of the time, he’s the only warning I have that one of my lusers is doing something wrong.

Today I got a spam complaint from him – somebody posted an ad for his “online pizza ordering” service to ott.events. Yeah, I saw it before he sent me the complaint. Sure it’s off-topic, but frankly if you got rid of all the off-topic posts in ott.events it would have only had two posts in the couple of years since it was created. So I didn’t pay any attention.

And then a few hours later, he sent me a bunch more complaints – the same guy was evidently spamming every newsgroup in Ottawa. So I do a search through the news spool:

find ../spool/overview/ -type f -print |xargs grep <idiots email> | sed -e 's?^../spool/overview?/usr/lib/news/spool/articles?' -e 's?.overview:??' -e 's/[ \t].*//' > list
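For anyone squinting at the sed part, here is roughly what those three substitutions do to each line of grep output, sketched in Python. I am assuming the shape the sed expressions imply: each newsgroup directory has a file named .overview, grep prints "path:line", and each overview line starts with the article number followed by a tab. The function name and the sample address are made up for illustration.

```python
import re


def overview_hit_to_article(line,
                            overview_root="../spool/overview",
                            article_root="/usr/lib/news/spool/articles"):
    """Turn one line of grep output into the path of the matching article."""
    # 1. Swap the overview root for the article spool root.
    if line.startswith(overview_root):
        line = article_root + line[len(overview_root):]
    # 2. Drop ".overview:" so the article number joins the group directory.
    line = line.replace(".overview:", "", 1)
    # 3. Throw away everything from the first space or tab onward.
    return re.split(r"[ \t]", line, maxsplit=1)[0]


hit = "../spool/overview/ott/events/.overview:123\tFree pizza\tspammer@example.invalid"
print(overview_hit_to_article(hit))
```

Each resulting line in "list" is then the full path of one article file in the spool.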

And sure enough there are 29 of his spams, all in ott.* newsgroups.

Time to get tough. And I just happened to have a copy of Dick Depew’s infamous ARMM bot which Dick himself sent me about a year after the infamous ARMM debacle (where a bug in ARMM caused it to issue cancels for its own cancels, in an ever increasing cascade). I dusted it off, fixed the paths in it, and turned it loose. It generated the 29 cancels in no time.

And just to make sure it doesn’t happen again, I fixed bin/filter/nnrpd_auth.pl to reject any connection attempts from this guy. Of course all that means is that he’ll start spamming from Google, but at least then it’s not my problem.

I’m a bad bad man…

I noticed a bunch of hits directly to pictures in my bird gallery, not going through the gallery interface. And they all had the same referrer string. So I went to the referrer, an MSN group, and was assaulted by the sort of schmaltzy, cornball, overwrought pretension that makes the people on alt.cuddle look sane and well adjusted in contrast. Pictures of fairies and unicorns and that sort of crap. Anyway, somebody put up a page on this group which just leeched my bird pictures, and other people’s as well, and titled it “My Beautiful Birds”.

First of all, they aren’t your beautiful birds, they’re mine. And secondly, didn’t your momma ever teach you about copyright?

So I went into my Apache configuration file, and a quick

RewriteEngine On
RewriteBase /albums
RewriteCond %{HTTP_REFERER} ^http://groups.msn.com/<cornball site>.*$ [NC]
RewriteRule .*\.(gif|jpg) http://xcski.com/~ptomblin/pork.jpg [R,L]

and the picture leechers are going to get a big surprise.

Paypal idiocy

As somebody who gets more than his fair share of spam (see this post for the gory details), I see several attempts a day to phish my Paypal account details. So a few days ago I was a little disconcerted to see something that met every criterion for being legitimate, telling me that somebody had requested a password change on my Paypal account. There were no fake and hidden URLs, the email came from an IP that belonged to Paypal, it used my full name, etc.

And it said that if this request didn’t come from me, I should go to the Paypal page to get the phone number for their fraud contact people. So I did, using my own login bookmark rather than the URL they gave me, in spite of me not being able to see anything wrong with the URL. In a fit of extra paranoia, I even looked at the security certificate on the site.

After making me step through a bunch of voice mail options relating to phishing rather than password changes, I finally got to talk to somebody, who said that a glitch in their system sent out a bunch of these and I have nothing to worry about. Ok, fine, why didn’t you save us all some time and effort and put information about that system glitch on your web site?

Today, I got an email asking me to fill out a survey based on my call to Paypal customer support. The only problem is it came from a domain other than Paypal. I’m sure there are quite legitimate reasons why Paypal/eBay would decide not to run their own survey, but in this day and age there is no way in hell I’m going to give *any* sort of information about my interactions with Paypal to a third party. (Ok, this blog post is giving information about my interactions with Paypal to lots of third parties, but that’s different – this is “push”, not “pull”.) Paypal, if you want to survey me about your customer support, you’re going to have to do it from your own email servers and your own web servers.