Design iteration

I have a web page that shows a bunch of data (trouble tickets) on different tabs.

In the first iteration, I was doing a ton of queries at load time, and building up the content of each tab using Perl Mason. Tabs that had no data on them were not even created on the page. There were three problems with this:

  1. It took a while to load the page
  2. Any time you did anything that would change the tickets that are on the page (like resolve or reassign one of them), I’d have to totally reload the page.
  3. You’d probably have to occasionally reload the page to see if anybody else had done anything that would cause more or fewer tickets to appear on one of your tabs.

In the second iteration, I changed it so each tab would only load itself when you clicked on it by making an AJAX call. This did wonders for the speed of the initial page load, you’d see the latest and greatest information on the tab when you clicked on it, and if you resolved or reassigned one, it would repopulate just that one tab. As an added bonus, I added paging and sorting to each tab. I was happy about the paging and sorting. The biggest problem is that I couldn’t hide the tabs with nothing in them, because I didn’t know there was nothing in them until you clicked on them. I didn’t like that.

In the third iteration, I added an argument to the AJAX call that would allow it to just return the count of tickets in the tab, instead of actually returning a page’s worth of tickets. This is fairly fast. So now when it goes to refresh tab “B”, it makes simultaneous ajax calls to get the count for tabs “A”, “C”, “D”, etc. This means that tabs that have no tickets are disabled, giving you a good visual indication of which tabs you need to look at. Also, any time you interact with the page, in the background it’s checking to see if any of the other tabs need to be enabled or disabled. I’ve checked in Firebug and it’s apparent that it does all these other tab count AJAX calls while the “repopulate the currently selected tab” AJAX call is processing, so it’s nice and fast. I’m pretty happy with this.
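A rough sketch of that third iteration, written in Python rather than the Perl and JavaScript the page actually uses. Everything named here (`tab_request`, `refresh`, the `count_only` argument, the in-memory `TICKETS` table) is made up for illustration; the point is only the shape: one call that can return either a page of tickets or just a count, with the count calls for the other tabs fired off while the selected tab reloads.

```python
from concurrent.futures import ThreadPoolExecutor

PAGE_SIZE = 2

# Hypothetical in-memory stand-in for the ticket database, keyed by tab name.
TICKETS = {"A": ["t1", "t2", "t3"], "B": ["t4"], "C": [], "D": ["t5"]}


def tab_request(tab, count_only=False, page=0):
    """Stand-in for the server side of the AJAX call (assumed interface)."""
    tickets = TICKETS[tab]
    if count_only:
        # Cheap path: no ticket rows are returned, only how many there are.
        return {"count": len(tickets)}
    start = page * PAGE_SIZE
    return {"count": len(tickets), "tickets": tickets[start:start + PAGE_SIZE]}


def refresh(selected, tabs):
    """Reload the selected tab and fetch counts for the other tabs concurrently."""
    others = [t for t in tabs if t != selected]
    with ThreadPoolExecutor(max_workers=len(tabs)) as pool:
        page_future = pool.submit(tab_request, selected)
        count_futures = {
            t: pool.submit(tab_request, t, count_only=True) for t in others
        }
        page = page_future.result()
        # A tab is enabled only if it has at least one ticket.
        enabled = {t: f.result()["count"] > 0 for t, f in count_futures.items()}
    enabled[selected] = page["count"] > 0
    return page, enabled
```

Calling `refresh("B", list(TICKETS))` gives back the first page of tab B plus an enabled/disabled flag for every tab, which is all the page needs to grey out the empty ones.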

Next iteration would probably be to add some searching or filtering.

After that, maybe using a WebSocket or periodic polling to see if anything has changed instead of only refreshing when you interact with it.

Today in technology

A week or so ago, I bought an Auria EQ276W 27″ IPS Monitor – this is one of those very high resolution IPS monitors you read about, but instead of having to buy them from dodgy unknown companies on eBay like the “Catleap” monitor that everybody is raving about, it actually comes from a brick and mortar store with branches all over the country. It is the brightest, sharpest, most beautiful monitor I’ve ever owned in my life. I haven’t been this excited about a monitor since I bought a then top-of-the-line Iiyama 17″ semi-flat CRT monitor.

The monitor needs either a dual-link DVI or DisplayPort to drive it at the full 2560×1440 resolution. The nVidia GeForce GT430 video card I bought when I bought this motherboard has dual-link DVI and HDMI output, and the on-motherboard video has DVI, DisplayPort and HDMI outputs. I was sort of hoping that I could hook up both of my existing 21″ monitors to the onboard video, and maybe buy another one of these monitors in the new year and drive it off the DisplayPort, but it appears that when you have a video card, the onboard video is disabled and can’t be turned back on. However, I was pleasantly surprised that the HDMI port on the GeForce card still works, so I can still have one of my two 21″ monitors working. There is a second PCI Express slot on the motherboard, so there is still a possibility of having a second one of these lovely 27″ monitors some day.

There are still a few flies in the ointment. Every time I have to reboot the computer or just log off and log on again, I have to re-tell Kubuntu that I want the left monitor to display its own stuff, not a rehash of the first 1920×1080 pixels of the big monitor. The weird thing is that when I check deep in the bowels of the .kde directory, I can see that it has saved that xrandr setting, but then when I log in it doesn’t seem to get applied. Also, every so often when the screens go to sleep, only the small screen will wake up again. Which is inconvenient, because the menu bar with the log out button is on the big screen. So far, every time it’s happened I’ve managed to wake it up again by hitting shift-control-F1 to switch to one of the virtual terminal logins, and then hitting control-F7 to switch back to the graphical screen, although last time I had to do this several times before it woke up. Oh, and I should mention that the monitor’s own on-screen menu won’t show up if it doesn’t think there is any input signal, which is stupid because one of the functions of the on-screen menu is to switch which input signal it should be looking for if it doesn’t manage to auto-detect it.

Another tech thing comes with living in an old plaster and lath house. My wifi base station puts out two signals, the conventional “Tomblin-Robinson” (TR) on the normal frequency band (2.4GHz?) that 802.11b, g and n all share, and the higher speed “Tomblin-Robinson-5G” (5G) which uses the 5GHz(?) band that only 802.11n uses. I prefer to use the “5G” one because it’s noticeably faster, but down in the dining room, both networks show one or two bars of signal strength while actually being unusably slow. And Vicki uses her laptop mostly in the dining room. So I installed a “wireless repeater”, which picks up the signal in the library, and repeats it. It creates a new network called “Tomblin-Robinson-EXT” (EXT). I’d prefer not to use it, because every packet it sends out has to be picked up by the repeater, and sent out again on the TR network, and then the base station sends it out to the internet, and then the response comes in, and gets sent out on TR, then picked up and repeated on EXT. So the presence of EXT slows down TR and EXT, and using it slows them both down even further. If only there were a way to run a wire from the base station to the repeater, the slowdown would be less, but there isn’t any way to do that in this house. But seriously, none of this would be a problem if in the dining room, the TR and 5G networks would just show 0 bars instead of 2 bars, because then my laptop and iPad would automatically switch to EXT when they needed to. But instead, I have to manually switch between them, and I usually only remember to switch when I’m sitting at the dining room table wondering why my Weight Watchers app is stuck in “Updating Points Tracker”.

My work project, which is currently being done in Perl, looks like it’s transitioning to Python and Django. That’s great, because I’ve been looking for an excuse to learn Python. A few weeks ago O’Reilly Books was having an ebook sale and I bought a couple of Python books. I was amazed that O’Reilly has an option now where for delivery they’ll just drop it in your Dropbox, because that means that they can update it when they want. Unfortunately, it turns out that the Python books I bought are all for Python 3, and Django only supports Python 2. So it’s off to find older versions of those books or other books that still cover Python 2. (And I wonder how long it will be before somebody writes a book about the Django framework with the word “Unchained” somewhere on the cover?)

Upgraded to Kubuntu 12.10

A few weeks ago I got seriously pissed off about all the things that were broken on my Linux box, not least the fact that since the last time I upgraded Ubuntu the program “aptitude” kept telling me that I had to uninstall several hundred packages, including some that looked like majorly important ones, so I bit the bullet and did a fresh install of Kubuntu 12.04. The fresh install went ok, the usual few glitches and things that needed to be reconfigured. But then almost as soon as I got all that sorted out, I got a notification that Kubuntu 12.10 was out. And I figured that since I hadn’t done all that much since installing it, an upgrade would probably be no sweat.

My first indication of trouble was after it rebooted – I got a “grub rescue” prompt, and bugger all else. I tried a few things that are supposed to allow you to put in your boot partition and boot, but none of them worked. So I hauled out the CD I’d used to install Kubuntu 12.04 and booted into rescue mode. I mounted all the partitions, did a grub-install /dev/sda and rebooted, and I was back in business.

The second problem was that none of our laptops could print to the print queue that is shared out by the Linux box. I had made sure that the CUPS config files hadn’t changed, but evidently that wasn’t enough. I got the two Mac laptops printing to it by “changing” them from ipp to ipps print queues. (I should mention that neither Macs nor Windows boxes actually let you look at the existing print queue and change things like the URL). On the Windows box, I think what I had to do was change the print queue from using the name “PSC_1500_series” to “PSC-1500-series”. No idea what else I changed (because of the aforementioned problem seeing how you defined it already) but I think that was it.

The third problem was worse – this morning I got an email from somebody who reads his email on my box saying he hadn’t gotten any email since the upgrade. I looked in the mail log, and what I could see is that the local delivery program had been changed from procmail to /usr/lib/dovecot/deliver -c /etc/dovecot/conf.d/01-mail-stack-delivery.conf -m "${EXTENSION}". That was an extreme WTF moment. Further investigation revealed that this config file specified maildir instead of mbox. I tried just changing it to mbox, but then it complained that it didn’t have permission to write to /var/mail/ptomblin. I couldn’t find an option to tell this deliver program to run setgid to mail. I also discovered that something had screwed up my postfix configuration to add this local delivery option, and also to remove a bunch of my spam protection checks. So I removed the mail-stack-delivery package and the postfix-dovecot package, and restored all the config files. Things seem to be working again. And I used the formail command to process all the files in the various people’s “maildirs” and put them back in their mboxes.
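For the record, the cleanup itself was done with formail, not with the code below. This is just a sketch of the same maildir-to-mbox move using Python's standard mailbox module, for anybody who'd rather not pipe files through formail one at a time. The function name and the paths in the usage note are hypothetical, and it assumes nothing else is delivering to the mbox while it runs beyond what the module's own locking covers.

```python
import mailbox


def maildir_to_mbox(maildir_path, mbox_path):
    """Append every message in a Maildir to an mbox file; return the count."""
    # create=False: fail loudly if the Maildir does not exist.
    src = mailbox.Maildir(maildir_path, factory=None, create=False)
    dest = mailbox.mbox(mbox_path)  # created if it does not exist yet
    copied = 0
    dest.lock()
    try:
        for msg in src:
            # mboxMessage carries over the Maildir flags and adds a From_ line.
            dest.add(mailbox.mboxMessage(msg))
            copied += 1
        dest.flush()
    finally:
        dest.unlock()
        dest.close()
        src.close()
    return copied
```

Something like `maildir_to_mbox("/home/someuser/Maildir", "/var/mail/someuser")` would be the per-user call, run as a user that can actually write to the mbox.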

My next trial and tribulation is that my hourly backup program, which uses lvm snapshots and rsync, is intermittently screwing up. Sometimes it can’t unmount the snapshot partition, sometimes it can’t remove it (with the message Unable to deactivate open lvm2-home-real (252:12)), and sometimes it just fails for no reason. I know there are a ton of race conditions in lvm snapshot stuff, so I already had a “sleep 10” after the lvremove. I added another one after the umount that precedes the lvremove, in case umount suddenly got lazy and the reason it’s failing is that it hasn’t finished unmounting the partition. That seems to have quelled the major problems, but the lvremove command is spitting out the message /sbin/dmeventd: stat failed: No such file or directory and I need to figure out how to suppress that so I don’t get emailed every hour.
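One alternative to sprinkling fixed “sleep 10” calls around is to retry the flaky command a few times and throw away only the one known-harmless line of stderr, so that anything else still lands in the hourly email. The sketch below is not what the backup script does today; `run_with_retries` is a made-up helper, and the mount point and volume names in the usage note are placeholders.

```python
import subprocess
import sys
import time

# The one stderr line we consider harmless noise (taken from the lvremove output).
NOISE = "/sbin/dmeventd: stat failed: No such file or directory"


def run_with_retries(cmd, attempts=5, delay=2.0):
    """Run cmd until it exits 0 or attempts run out.

    Returns (returncode, stderr) from the last run, with the known
    noise line filtered out of stderr.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        result = subprocess.run(cmd, capture_output=True, text=True)
        stderr = "\n".join(
            line for line in result.stderr.splitlines() if NOISE not in line
        )
        if result.returncode == 0:
            return 0, stderr
        if attempt < attempts:
            time.sleep(delay)  # give umount/LVM a moment before trying again
    return result.returncode, stderr


if __name__ == "__main__":
    # Harmless demonstration; a real script would pass something like
    # ["umount", "/mnt/snapshot"] or ["lvremove", "-f", "vg/snapshot"] (placeholders).
    code, err = run_with_retries([sys.executable, "-c", "pass"], attempts=1)
    if err:
        print(err, file=sys.stderr)
    sys.exit(code)
```

If cron only sends mail when there is output, printing just the filtered stderr keeps the dmeventd line from triggering an email while real failures still get through.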

How to debug

I see an awful lot of posts on StackOverflow that show that the person asking the question hasn’t got the slightest clue how to go about debugging their problems. So here’s a few specifics for a few extremely common situations:

1. It’s not a bug in the compiler
It’s not a bug in the compiler, it’s never a bug in the compiler. Stop making that your default assumption. I’ve been a programmer for over 25 years, and the only time I saw a bug in the compiler was in the early versions of cfront, which was AT&T’s way to convert C++ programs into C programs so you could compile and link them with C tools. If you think there is even the slightest possibility that it’s a bug in the compiler, you’re going to stop looking before you see what you did wrong. And yes, you did something wrong. Similarly…

2. It’s probably not a bug in the library routines
The probability of a bug in the library routines depends a lot on the number of people using it. If it’s a core part of Java, chances are you’re not the first person to notice something the other 25 million Java developers somehow overlooked. If it’s a project that you found on SourceForge that hasn’t been updated in 4 years and only had one developer, it’s a possibility, but one you should discount until you’ve made sure you’re calling it right.

3. Null Pointer Exceptions happen for a reason
If you got a NullPointerException in your code, or any type of exception in library code, you did something wrong. Look at the stack trace. Look for the top-most entry that is your code. Look at the line there. Think about what you see on that line. Can one of those variables be null? Did you initialize all the variables in every possible way through the code to that point? Are you giving the correct arguments to whatever library code you’re calling? If necessary, put a breakpoint there or throw in some debugging statements and print out what you’re using in that line to make sure they’re what you expected.

4. Debugging AJAX calls is hard, but it’s easier than trying to explain it on StackOverflow
A large number of questions are of the nature “I do this ajax call, and it doesn’t work”. What doesn’t work? Are you making the call? Is the server receiving it? Is the server doing the right thing with it? Is it passing back what you’re expecting?

The first thing you need to do is use a good debugger. If the problem happens on Firefox, then you’re in luck because you can use Firebug + Firequery.

If you’re unlucky, and the problem only happens in IE (and face it, those are your only two alternatives because if a problem happens in any non-IE browser, it happens in all of them, whereas if your code works in IE8 you’re not 100% sure it works in IE7 or IE9), then you need to use whatever debugger options are available to you. I found some useful information here and I end up using a combination of Firebug Lite and IE Developer Toolbar. Fortunately most of the IE8 and IE7 problems I’ve encountered happen in IE9 with the Browser Mode and Document Mode set appropriately.

Once you’ve got your debugger up, you want to set a breakpoint on the actual ajax call (to verify that you’re actually getting to the call and not missing it for some other reason), on the success callback (to verify that the server has sent a response) and on the failure callback (to verify that the server didn’t throw up its hands and give up). It also helps if you’ve got access to the server side logs and can see what’s going on there as well, but that’s often not possible like when you’re calling somebody else’s web API. In the IE debugger, you need to go to the Network tab and “Start Capturing”, and in Firebug you just need to look at the “Console” tab. After the ajax call returns, you can look at the appropriate tab and see what was sent to the server and what came back. And in the success callback you can look at the returned response and single step through the logic to see if you’re doing the right thing with it. And you can do all that in less time than it would take to write a question to StackOverflow. If you’re still stumped, you also have a lot more information you can put in your question, which will help all the eager question answerers out there who don’t have the ability to step through your code.

5. A question and answer site is not the place to learn the syntax of a language
If your code doesn’t even compile, then you don’t know enough to ask a question that is useful to either you or the other users of StackOverflow. Pick up a book and learn the basics.

Internet Exploder, I hate you so much

Yesterday was a fun day in the continuing struggle against IE brokenness.

First problem: the form submit button used to work on IE, but now it doesn’t. Well, no matter, because the form had an onsubmit that did some AJAXy stuff and then cancelled the form submit. Rather than wasting time trying to figure out why it works on real browsers and not on IE, I just changed the submit button into an ordinary button that invoked my function. Problem solved.

Second problem: My form is very dynamic, allowing you to add, delete or clone table rows, each of which contains several select, checkbox, and textarea input fields, all with associated onchange or onclick callbacks. The problem was that when you cloned a row, the callbacks on the new row would apply to the original row. All the callbacks had the row id in the arguments list, and when I clone I use the jquery attr command and a regular expression to change the row id. That works for real browsers, and it apparently works in IE (if you examine the code in Firebug you see the new id), but apparently the actual callback data is stored internally somewhere. It didn’t seem to matter whether I called clone with true or false in the copyData argument. So I restructured all my callbacks so they were activated by the jquery on command, and grabbed the row id and other arguments from the enclosing row, found with jQuery(this).parents('tr').

It was annoying to have to do all this stuff because IE is so different from real browsers, but the code is probably better for it.