UX fail

Last week, in an effort to broaden my horizons, I joined a bunch of groups on Meetup, including one for “UX” (User Experience). There was a meeting scheduled for today (Wednesday) at 6:00pm, and I clicked the button to indicate that I’m planning to go.

Almost immediately, they announced that they were moving all future communications from Meetup to another similar service which appears to be mostly oriented towards start-ups, something which I’m not at all interested in. (Been there, done that, lost my t-shirt and 17 SAN.)

Then they announced that they were changing the location, but they weren’t sure where they were changing it to.

Then they announced it was at the “Center for Student Innovation” at RIT, but with no further details of where in this building.

I got there at about 5:40. There were no signs indicating where it was. I went on-line and discovered that they’d announced a room number at around noon today. The room was in use, and it looked like a class or a seminar was going on. I sat down to wait for 6:00. 6:00pm came and went, and whatever was going on in the room never broke up. Nobody entered and one person left, but the door remained closed. Nobody else appeared to come up to the door to try it or ask where the meeting was. I decided that either the group I was attempting to meet was in that room, but nobody had told them that an open door is more welcoming than a closed one, or the regulars saw the closed door and decided to go somewhere else without bothering to put up a sign or troll the lounge area looking to see if anybody was waiting to join the meeting. Either way, I felt unwelcome so I left.

So the User Experience experts managed to give me a lousy User Experience and wasted my evening. Thanks, guys.

Let’s get realistic here for a moment.

The fact of the matter is that my shoulder is not getting better. My pain level is actually worse than it was before my first surgery, and has not been getting any better for two years. Basically every time I do my physiotherapy exercises, which I’m supposed to do every other day, I’m in pain for 3 or 4 days afterwards so they don’t get done as often as they should.

So barring some miracle happening in the next couple of months, I’m facing either not kayaking, or kayaking in pain. Judging by the way it’s gone in the past when I’ve tried to continue a sport with pain, if I’m really lucky I’ll get maybe one year to recover my fitness, and another year to race, and then the pain will be too great to continue – if I’m unlucky I’ll wimp out of the pain in March, sell all my boats and go back to being a limpet. So I guess the realistic thing to do is to prepare myself to train and race in pain, and hope for a miracle. And the best training for training in pain is to start doing my physiotherapy exercises in spite of the pain that they cause me. Who knows, maybe they’ll actually start doing me some good?

Design iteration

I have a web page that shows a bunch of data (trouble tickets) on different tabs.

In the first iteration, I was doing a ton of queries at load time, and building up the content of each tab using Perl Mason. Tabs that had no data on them were not even created on the page. There were three problems with this:

  1. It took a while to load the page.
  2. Any time you did anything that would change the tickets that are on the page (like resolve or reassign one of them), I’d have to totally reload the page.
  3. You’d probably have to occasionally reload the page to see if anybody else had done anything that would cause more or fewer tickets to appear on one of your tabs.

In the second iteration, I changed it so each tab would only load itself when you clicked on it, by making an AJAX call. This did wonders for the speed of the initial page load. You’d see the latest and greatest information on the tab when you clicked on it, and if you resolved or reassigned a ticket, it would repopulate just that one tab. As an added bonus, I added paging and sorting to each tab. I was happy about the paging and sorting. The biggest problem was that I couldn’t hide the tabs with nothing in them, because I didn’t know there was nothing in them until you clicked on them. I didn’t like that.

In the third iteration, I added an argument to the AJAX call that would allow it to just return the count of tickets in the tab, instead of actually returning a page’s worth of tickets. This is fairly fast. So now when it goes to refresh tab “B”, it makes simultaneous AJAX calls to get the count for tabs “A”, “C”, “D”, etc. This means that tabs that have no tickets are disabled, giving you a good visual indication of which tabs you need to look at. Also, any time you interact with the page, in the background it’s checking to see if any of the other tabs need to be enabled or disabled. I’ve checked in Firebug and it’s apparent that it does all these other tab count AJAX calls while the “repopulate the currently selected tab” AJAX call is processing, so it’s nice and fast. I’m pretty happy with this.
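For anybody who wants to see the shape of that third iteration without a browser, here is a minimal sketch in Python rather than the page's real JavaScript. Everything in it is an assumption made up for illustration: the FAKE_TICKETS data, the fetch_tab function standing in for the AJAX call, and the count_only argument name. The point is only that the full fetch for the selected tab and the count-only fetches for the other tabs are all started before any of them is waited on.

```python
import asyncio

# Hypothetical stand-in for the server: tab name -> tickets on that tab.
FAKE_TICKETS = {
    "A": ["T-101", "T-102"],
    "B": ["T-201"],
    "C": [],
    "D": ["T-401", "T-402", "T-403"],
}


async def fetch_tab(tab, count_only=False):
    """Pretend AJAX call: return the tab's tickets, or just how many there are."""
    await asyncio.sleep(0.01)  # simulated network latency
    tickets = FAKE_TICKETS[tab]
    return len(tickets) if count_only else list(tickets)


async def refresh(selected):
    """Repopulate the selected tab and get counts for the other tabs concurrently."""
    others = [tab for tab in FAKE_TICKETS if tab != selected]
    # gather() schedules every request before waiting on any of them,
    # so the count calls overlap with the full fetch.
    results = await asyncio.gather(
        fetch_tab(selected),
        *(fetch_tab(tab, count_only=True) for tab in others),
    )
    tickets, counts = results[0], results[1:]
    # A tab is enabled only if it has at least one ticket.
    enabled = {selected: bool(tickets)}
    enabled.update({tab: count > 0 for tab, count in zip(others, counts)})
    return tickets, enabled


tickets, enabled = asyncio.run(refresh("B"))
print(tickets)
print(enabled)
```

Because the count-only calls return a single number each, they stay cheap no matter how many tickets a tab holds, which is what makes it reasonable to fire them on every interaction.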

Next iteration would probably be to add some searching or filtering.

After that, maybe using a WebSocket or periodic polling to see if anything has changed instead of only refreshing when you interact with it.

Upgraded to Kubuntu 12.10

A few weeks ago I got seriously pissed off about all the things that were broken on my Linux box. Not least was the fact that ever since the last time I upgraded Ubuntu, the program “aptitude” kept telling me that I had to uninstall several hundred packages, including some that looked like majorly important ones. So I bit the bullet and did a fresh install of Kubuntu 12.04. The fresh install went ok, with the usual few glitches and things that needed to be reconfigured. But then almost as soon as I got all that sorted out, I got a notification that Kubuntu 12.10 was out. And I figured that since I hadn’t done all that much since installing it, an upgrade would probably be no sweat.

My first indication of trouble was after it rebooted – I got a “grub rescue” prompt, and bugger all else. I tried a few things that are supposed to allow you to put in your boot partition and boot, but none of them worked. So I hauled out the CD I’d used to install Kubuntu 12.04 and booted into rescue mode. I mounted all the partitions, did a grub-install /dev/sda and rebooted, and I was back in business.

The second problem was that none of our laptops could print to the print queue that is shared out by the Linux box. I had made sure that the CUPS config files hadn’t changed, but evidently that wasn’t enough. I got the two Mac laptops printing to it by “changing” them from ipp to ipps print queues. (I should mention that neither Macs nor Windows boxes actually let you look at the existing print queue and change things like the URL). On the Windows box, I think what I had to do was change the print queue from using the name “PSC_1500_series” to “PSC-1500-series”. No idea what else I changed (because of the aforementioned problem seeing how you defined it already) but I think that was it.

The third problem was worse – this morning I got an email from somebody who reads his email on my box saying he hadn’t gotten any email since the upgrade. I looked in the mail log, and what I could see is that the local deliver program had been changed from procmail to /usr/lib/dovecot/deliver -c /etc/dovecot/conf.d/01-mail-stack-delivery.conf -m "${EXTENSION}". That was an extreme WTF moment. Further investigation revealed that this config file specified maildir instead of mbox. I tried changing it to mbox, but then it complained that it didn’t have permission to write to /var/mail/ptomblin. I couldn’t find an option to tell this deliver program to run setgid to mail. I also discovered that something had screwed up my postfix configuration to add this local delivery option, and had also removed a bunch of my spam protection checks. So I removed the mail-stack-delivery package and the postfix-dovecot package, and restored all the config files. Things seem to be working again. And I used the formail command to process all the files in the various people’s “maildirs” and put them back in their mboxes.

My next trial and tribulation is that my hourly backup program, which uses lvm snapshots and rsync, is intermittently screwing up. Sometimes it can’t unmount the snapshot partition, sometimes it can’t remove it (with the message Unable to deactivate open lvm2-home-real (252:12)), and sometimes it just fails for no reason. I know there are a ton of race conditions in lvm snapshot stuff, so I already had a “sleep 10” after the lvremove. I added another one after the umount that precedes the lvremove, in case umount suddenly got lazy and the reason it’s failing is that it hasn’t finished unmounting the partition. That seems to have quelled the major problems, but the lvremove command is spitting out the message /sbin/dmeventd: stat failed: No such file or directory and I need to figure out how to suppress that so I don’t get emailed every hour.
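One way to make the "sleep and hope" approach a bit more deliberate is to retry each step a few times and to drop only the one message that is known to be noise. This is a rough sketch in Python, not my actual backup script. The run_with_retries and filter_noise helpers are hypothetical names, attempts is assumed to be at least 1, and the mount point and volume names in the usage comment are made up. Filtering the dmeventd line hides the symptom rather than fixing whatever causes it.

```python
import subprocess
import time

# Substrings of stderr lines that are assumed to be harmless noise.
KNOWN_NOISE = ("/sbin/dmeventd: stat failed",)


def filter_noise(stderr_text, noise=KNOWN_NOISE):
    """Return stderr_text without the lines that match a known-noise substring."""
    kept = [
        line for line in stderr_text.splitlines()
        if not any(pattern in line for pattern in noise)
    ]
    return "\n".join(kept)


def run_with_retries(command, attempts=3, delay=10.0, runner=subprocess.run):
    """Run command until it exits 0, sleeping `delay` seconds between tries.

    Returns the number of attempts used and raises RuntimeError if every
    attempt fails. `runner` is injectable so the logic can be tried out
    without touching real volumes.
    """
    for attempt in range(1, attempts + 1):
        result = runner(command, capture_output=True, text=True)
        if result.returncode == 0:
            return attempt
        if attempt < attempts:
            time.sleep(delay)
    raise RuntimeError(
        f"{command!r} still failing after {attempts} attempts: "
        f"{filter_noise(result.stderr).strip()}"
    )


# Hypothetical usage (paths and volume names are placeholders):
#   run_with_retries(["umount", "/mnt/home-snapshot"])
#   run_with_retries(["lvremove", "-f", "/dev/vg0/home-snapshot"])

sample = "/sbin/dmeventd: stat failed: No such file or directory\nreal problem here"
print(filter_noise(sample))
```

Since cron only sends mail when a job produces output, printing just the filtered stderr would keep the hourly email quiet unless something other than the dmeventd line shows up.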

How to debug

I see an awful lot of posts on StackOverflow that show that the person asking the question hasn’t got the slightest clue how to go about debugging their problems. So here are a few specifics for a few extremely common situations:

1. It’s not a bug in the compiler
It’s not a bug in the compiler, it’s never a bug in the compiler. Stop making that your default assumption. I’ve been a programmer for over 25 years, and the only time I saw a bug in the compiler was in the early versions of cfront, which was AT&T’s way to convert C++ programs into C programs so you could compile and link them with C tools. If you think there is even the slightest possibility that it’s a bug in the compiler, you’re going to stop looking before you see what you did wrong. And yes, you did something wrong. Similarly…

2. It’s probably not a bug in the library routines
The probability of a bug in the library routines depends a lot on the number of people using it. If it’s a core part of Java, chances are you’re not the first person to notice something the other 25 million Java developers somehow overlooked. If it’s a project that you found on SourceForge that hasn’t been updated in 4 years and only had one developer, it’s a possibility, but one you should discount until you’ve made sure you’re calling it right.

3. Null Pointer Exceptions happen for a reason
If you got a NullPointerException in your code, or any type of exception in library code, you did something wrong. Look at the stack trace. Look for the top-most entry that is your code. Look at the line there. Think about what you see on that line. Can one of those variables be null? Did you initialize all the variables in every possible way through the code to that point? Are you giving the correct arguments to whatever library code you’re calling? If necessary, put a breakpoint there or throw in some debugging statements and print out what you’re using in that line to make sure they’re what you expected.
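The "find the entry that is your code" step can even be done mechanically. Here is a small sketch of the idea in Python instead of Java, where the closest thing to a NullPointerException is a TypeError or AttributeError caused by None. The innermost_own_frame helper and the parse_settings function are hypothetical, and "library code" is assumed to mean anything under the standard library directory. Note that Java prints the most recent call first, while Python lists it last, so the search walks the frames backwards.

```python
import json
import sysconfig
import traceback

# Assumption: anything under the standard library directory is "library code".
STDLIB_DIR = sysconfig.get_paths()["stdlib"]


def innermost_own_frame(exc, library_dirs=(STDLIB_DIR,)):
    """Return the most recent traceback frame that is not in library code."""
    frames = traceback.extract_tb(exc.__traceback__)
    for frame in reversed(frames):  # Python lists the innermost call last
        if not frame.filename.startswith(tuple(library_dirs)):
            return frame
    return None


def parse_settings(raw):
    # Hypothetical application code: raw is None because nobody initialized it.
    return json.loads(raw)


try:
    parse_settings(None)
except TypeError as exc:
    frame = innermost_own_frame(exc)
    # The exception was raised inside the json module, but the line to
    # stare at is the one in parse_settings that passed it a None.
    print(f"look at {frame.name}, line {frame.lineno}")
    print(f"raised: {exc}")
```

The exception comes out of the library, but the frame the helper picks is the caller that handed it a bad argument, which is exactly the line worth thinking about.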

4. Debugging AJAX calls is hard, but it’s easier than trying to explain it on StackOverflow
A large number of questions are of the nature “I do this ajax call, and it doesn’t work”. What doesn’t work? Are you making the call? Is the server receiving it? Is the server doing the right thing with it? Is it passing back what you’re expecting?

The first thing you need to do is use a good debugger. If the problem happens on Firefox, then you’re in luck because you can use Firebug + Firequery.

If you’re unlucky, and the problem only happens in IE (and face it, those are your only two alternatives because if a problem happens in any non-IE browser, it happens in all of them, whereas if your code works in IE8 you’re not 100% sure it works in IE7 or IE9), then you need to use whatever debugger options are available to you. I found some useful information here and I end up using a combination of Firebug Lite and IE Developer Toolbar. Fortunately most of the IE8 and IE7 problems I’ve encountered happen in IE9 with the Browser Mode and Document Mode set appropriately.

Once you’ve got your debugger up, you want to set a breakpoint on the actual ajax call (to verify that you’re actually getting to the call and not missing it for some other reason), on the success callback (to verify that the server has sent a response) and on the failure callback (to verify that the server didn’t throw up its hands and give up). It also helps if you’ve got access to the server side logs and can see what’s going on there as well, but that’s often not possible, for example when you’re calling somebody else’s web API. In the IE debugger, you need to go to the Network tab and “Start Capturing”, and in Firebug you just need to look at the “Console” tab. After the ajax call returns, you can look at the appropriate tab and see what was sent to the server and what came back. And in the success callback you can look at the returned response and single step through the logic to see if you’re doing the right thing with it. And you can do all that in less time than it would take to write a question to StackOverflow. If you’re still stumped, you also have a lot more information you can put in your question, which will help all the eager question answerers out there who don’t have the ability to step through your code.

5. A question and answer site is not the place to learn the syntax of a language
If your code doesn’t even compile, then you don’t know enough to ask a question that is useful to either you or the other users of StackOverflow. Pick up a book and learn the basics.