Archive for the ‘Job Experiences’ Category

Well, it’s been approximately two years since my last HSE (Health, Safety and Environment) orientation for contract employees, so it was time to renew.

They’ve moved it even earlier, to 7:00am. And it is just as boring and inapplicable to my type of contracting as I remember it. But at least the guy giving it this time was a much more positive person.

I had to get up much earlier than usual, and get out of the house promptly without checking my morning email or anything. I had hoped that I’d be able to catch up with my Treo in the HSE training, but of course they hold it in a basement with no cell phone coverage.

But that wasn’t the worst part. The worst part was that I had anxiety about getting up earlier than usual so I kept waking up in the middle of the night to check the clock to make sure I hadn’t missed the alarm. I swear, I remember checking the clock about 3 times between 2:00am and 2:30am, and then again a couple of times between 5:00am and 5:30am. But of course when the alarm did go off at 6:00am I hit snooze, forgetting for the moment that I didn’t have enough buffer time in my schedule to allow for snoozing. Fortunately I remembered a minute or two later.

You know, I love this job. I bitch about my cow orkers and management every now and then, but that’s true of any job. But the work is challenging, it’s interesting, and it’s in a field I like (ok, it’s not GIS or aviation, but it’s close), on a language and OS I like developing on. And it pays more than my previous job, which means I don’t have to ask myself “can I afford to go flying this month” or ask my friends for money to help run the server that does our shared mailing lists, but most importantly it means we didn’t have to ask “how are we going to afford to have three kids in college at the same time?”. Being bored for 45 minutes at a god-awful hour of the morning once every two years is a small price to pay for that sort of freedom.

So after a titanic two day struggle, we’ve got my home account moved to a server with a slightly newer version of NFS, and I seem to be running again. Except I don’t have Lotus Notes or Microsoft Office. Which, unfortunately, I really need in spite of the horror of having to use them on a daily basis. It seems that when I decided to blow the machine away and re-install, I didn’t save a precious little “id” file that allows me to log into the Lotus Notes server. The help desk form for requesting help for Notes requires you to specify what your Notes server is, and shows you how you can find it on your Notes screen - which of course I can’t do because I can’t get into Notes without this id file. It also promised that they’ll get back to you within three business days. Rob has warned me that there will be two more hurdles:

  • First they will refuse to help me because they don’t support Linux, and/or because Notes doesn’t run under Linux. Evidently the fact that everybody in our office runs it under Crossover Office under Linux is just a figment of our imagination. Pointing out that this is just an authentication issue and not an OS issue evidently isn’t enough to get them to cough up this file without a fight.
  • Even after they relent and send you the file, they don’t actually send it to you, they send it to your boss. And since my boss never reads his email and his secretary can never be bothered to send me the information when I try to recover my Windows network password (too busy with eBay and Solitaire), I’m not holding out great hope of getting this either.

(Apologies to Flogging Molly)

Spent the whole frigging day dealing with the problems I was having with my computer at work. I tried reinstalling the OS (CentOS 4.2) again. I tried replacing the hard drive that was giving errors. I tried installing it on a Dell 700 I had on my desk for testing purposes. Every time I got the same problem - I’d set up a Thunderbird or Evolution account, exit the program and come back in, and the program couldn’t find the Inbox file any more. Rob suggested the problem might be due to me having SELinux enabled, so I installed it without it enabled. Mike suggested that CentOS 4.3 just came out this weekend, so I tried that. Still no luck.

After a few attempts, I was looking for the Evolution configuration, but I got a weird error when I tried a “find . -type f -print” on my home directory. I didn’t get these errors on a local file system, or when doing it on my home directory on the NFS server. After a bit of messing around, we realized that the problem might be that CentOS is now using NFS v4, and might have obsoleted NFS v2. The server that has our home directories on it is an old SGI running IRIX 6.4 - it’s possible that it only does NFS v2. Tomorrow, Rob is going to look at moving our home directories to a machine running Solaris 9. Hopefully that will fix it.

A couple of weeks ago, my work computer froze up hard while I was copying the source code tree to my thumb drive so I could do some work at home. (Yes, probably a gross violation of security rules, but it’s either that or do a lot less work.) Since then, I’ve had nothing but problems with ClearCase - I had to toss out a couple of the views I had, and make new ones, and even with those ones, about once every two days I’ll get some sort of I/O error and have to do a “ct recoverview -f -tag tomblin_DCOS6.0” to get it working again. There were other problems on that machine. Plus it’s running on a 2.4 Linux kernel and RedHat 8 and all our new stuff is being developed on 2.6 and CentOS 4.2.

I decided the time had come to reformat and reinstall CentOS. The install went relatively smoothly, except at one point in the sequence I saw a message about a problem on one of my hard disks flash by too fast for me to read. Of course now that I’ve got the OS installed, I have to find and install ClearCase, Java, Jikes, Eclipse, Crossover Office, Microsoft Office, and Notes. But first I want to test that drive. I told smartctl to start a long test on both drives. I got ClearCase, Java and Jikes installed (the others can wait) and tried to do some work. And found I couldn’t, because one of my cow-orkers, who loves to “refactor”, managed to refactor a couple of files out of existence, so I can’t do a top level build successfully.

While that was going on, I tried Firefox. The Firefox that CentOS installed was 1.0.7, rather than the 1.5 I had been using, and I got a strange thick gray bar at the bottom of the screen below the status bar. It’s about as thick as the navigation toolbar up top, with a tiny red caret on the left side, but nothing else. I can’t seem to get it to go away, even by switching themes.

Ok, next up was Thunderbird. It opened up, and for my normal mail account, it showed “Drafts”, “Sent” and “Trash”, but no “Inbox”. I checked in the directory, and there was definitely an Inbox there. I sent myself a test message, it sent, but still no Inbox. I tried “Create a new folder”, but it wouldn’t let me create an Inbox because one already existed. Ok, I said to myself, obviously Thunderbird is hosed. How about Evolution. I opened up Evolution, and set up an account. It showed a couple of folders, but no Inbox! So I said “to hell with this”, and exited Evolution. But when I tried to blow away my .evolution directory, I got a bunch of NFS errors and some of the files wouldn’t go away.

That’s when I chucked it all and went home. I figured somebody on Monday can help me, or get me new hardware.

In my continuing attempts to keep from killing the people around me, I’m trying another way to blot out the noise around me.

At the Apple Store before Christmas, I tried out the Bose noise cancelling headphones.  Just like their aviation headphones, they were awesome, light, and way too freaking expensive.  They’re around $300, which I suppose is a bargain compared to the Series X aviation headphones which are around $1000, and have been since they invented the concept of ANR (Active Noise Reduction) in aviation headphones.  The aviation ones haven’t budged in price in 10 years, so I don’t expect to see the music ones getting drastically reduced either.
So I compromised and bought myself a pair of Sony MDR-NC6 headphones.  These are semi-open like the original Walkman headphones, but with a battery compartment in the bow just above the right ear.

You put them on your head and flick the switch.  The first thing you notice is that you can no longer hear the air noise in the overhead HVAC system, nor the three computers sitting right behind your head on the desk behind you.  Then you turn on the iPod and find you can use a much lower volume setting.  Not sure if that’s because everything got quieter, or because the iPod got louder.  Even near-by conversations are muted.  Hey, I don’t feel like punching the guy using his speaker phone to check his voice mail.  Much.  This is good!  And no sore ear canal from ear buds that don’t fit very well.
There are a couple of downsides, though:

  • I don’t think the bass response is very good.
  • The battery compartment presses into my head annoyingly after a lot of hours of continuous use.
  • I don’t know how long the battery lasts yet - that might be an issue.
  • When you’re walking around, if you don’t turn off the noise cancelling you get a very loud wind noise in your ears.  I have no idea why.

…it’s because I’ve had to commit Seppuku to appease our Japanese customers.

A few days ago, based on a code review (which I unfortunately did on our 5.0 code base instead of the 3.6 code that they are using) and an examination of the customer logs, I confidently said that this mysterious changing value that they are seeing is due to one of them mucking around with changing values in Webmin. I found at least one case where that had happened, and like House my default assumption is that the user is always lying, because that’s usually the case. My confidence was reported up the line by my boss, and from him to the Japanese support people, and from them to the customer.

So yesterday I was taking another look at the logs, and I found that as well as the case where they had messed around with the values themselves, there was another case where the values had changed “spontaneously”. Uh oh. And then I remembered the cache of these values I’d put in in 3.1, and how hard it had been to get everybody who used the cache to understand that if they used the cache they had to listen for a particular message, and when they got that message they had to call a method to flush and reload the cache, and how some of the other developers don’t seem to get the concept of Singletons and how something they call in one thread can affect something that happens after that thread is dead and another one spawned off, and because of that in 4.0 I’d gotten rid of the cache entirely.
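For anyone who hasn’t been bitten by this, here is a minimal sketch of the trap in Java. It is not the real code: the class names (SettingsCache, StaleCacheDemo), the “volume” key and the map standing in for the real settings store are all made up for illustration. The point it shows is that a singleton cache outlives the thread that filled it, so a fresh thread keeps seeing the old value until somebody calls flush.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical singleton cache; names are illustrative, not from the product.
final class SettingsCache {
    private static final SettingsCache INSTANCE = new SettingsCache();
    private final Map<String, String> values = new ConcurrentHashMap<>();

    private SettingsCache() {}

    static SettingsCache getInstance() {
        return INSTANCE;
    }

    // Returns the cached value, asking the loader only on a cache miss.
    String get(String key, Function<String, String> loader) {
        return values.computeIfAbsent(key, loader);
    }

    // What every user of the cache has to call when the "values changed" message arrives.
    void flush() {
        values.clear();
    }
}

final class StaleCacheDemo {
    // Runs a task in a brand new thread and waits for that thread to die.
    private static void runInNewThread(Runnable task) {
        Thread t = new Thread(task);
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Returns what was seen: first thread, second thread, and after a flush.
    static String[] run() {
        Map<String, String> store = new ConcurrentHashMap<>(); // stand-in for the real settings store
        store.put("volume", "5");
        SettingsCache cache = SettingsCache.getInstance();
        cache.flush(); // start from an empty cache
        String[] seen = new String[3];

        runInNewThread(() -> seen[0] = cache.get("volume", store::get));
        store.put("volume", "7"); // the value changes behind the cache's back
        runInNewThread(() -> seen[1] = cache.get("volume", store::get));

        cache.flush(); // the step people forget
        seen[2] = cache.get("volume", store::get);
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(String.join(",", run())); // prints 5,5,7
    }
}
```

The second thread never touched the cache before, yet it still reads the stale 5, because the cache belongs to the process and not to the thread.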

After apologizing, I’m going to have some backporting to do.

Why is it that when called on to apologize to the touchy Japanese customers I feel this Basil Fawlty voice in the back of my mind saying “Don’t mention the war” over and over again? I have trouble reconciling the delicate sensibilities of the Japanese with the brutal butchers who bayonetted Canadian nurses in Hong Kong at Christmas 1941 or who casually beheaded surrendered prisoners being force marched in Bataan. I guess that’s not very culturally sensitive of me.

In my part of the big project I’m on, I have a class called a Playlist, and a corresponding database table. Based on my analysis of how many Playlists are likely to be used in the lifetime of a system, I decided that an int would be more than adequate storage space for the sequential internal id number. Actually, a short would probably be adequate, but there isn’t any compelling reason to use shorts on modern systems since they don’t save much storage and they’re slower to process (is that true in Java? I know it is in C/C++.) And so I happily used this id all over the code.
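On the Java question, the one thing I’m sure of is that the language promotes short operands to int before doing arithmetic, so a short id needs a cast every time you compute with it. Whether that makes shorts measurably slower is not something this little sketch shows. The class and method names (IdWidthDemo, nextShortId, nextIntId) are made up for illustration and are not from the Playlist code.

```java
// Hypothetical sketch comparing short and int id counters; not the real Playlist code.
final class IdWidthDemo {
    // current + 1 is computed as an int, so it has to be cast back down to short.
    static short nextShortId(short current) {
        return (short) (current + 1);
    }

    // With int there is no promotion and no cast.
    static int nextIntId(int current) {
        return current + 1;
    }

    public static void main(String[] args) {
        System.out.println(Short.MAX_VALUE);              // 32767
        System.out.println(Integer.MAX_VALUE);            // 2147483647
        System.out.println(nextShortId(Short.MAX_VALUE)); // -32768, the cast silently wraps
        System.out.println(nextIntId(32767));             // 32768
    }
}
```

So a short would run out after 32767 sequential ids and then wrap to a negative number without complaint, while an int leaves room for about two billion.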
Continue reading ‘When does a unit test become a system test?’ »

I hate where I sit at work. It’s at the corner of two heavily trafficked hallways, and right across from the largest conference room in our area. So consequently at least once a day there are people hanging around right outside my cube talking. Plus the guy kitty corner from me uses his speaker phone way too fucking much (i.e. ever).

So when they did a massive reorganization of our space, which involved 70 people moving from one cubicle to another, guess who didn’t move? Yup. As far as I can tell, I’m the only one in our development group who didn’t move.

  1. Got the UPS software working again, after I converted from using the mge-utalk driver to the mge-shut driver. No idea why the other driver, which has supported this UPS just fine for years suddenly started having trouble. Oh well. Such is the world of open source software.
  2. Our company photo contest results were announced today. I didn’t win anything. I guess at least part of the problem was that I ignored all the advice I got from my friends and submitted the pictures I liked best. But because of my wrist problems I didn’t have as much time to work on them as I would have liked. Oh well. Not to sound like sour grapes or anything, but the guy who cleaned up in several of the novice categories takes pictures of bicycle races and sells them at the race, which makes him more than a novice in my book. Most of his pictures, though, were really good and would have won in the advanced categories as well, and my favourite one of his didn’t win anything. Actually, all the competition was really good. Of course it doesn’t help that there were three other people submitting pictures from Alaska cruises, and one person who went to Antarctica.
  3. My SafeType keyboard was acting a bit weird. Every now and then I’ll be typing away and suddenly all 4 LEDs in the middle (caps lock, num lock, scroll lock and one other labelled “W”) come on for a second or so, and a whole bunch of typing gets missed. I’ve seen this about two or three times a day at work, but when I brought the keyboard home for the weekend, I was seeing it about once a minute. Mildly annoying. I moved it from my powered USB hub to plugged in directly to the Powerbook, though, and I haven’t seen the problem since. Must be some sort of timing thing.
  4. I worked hard this week to provide a new architecture for dealing with encryption keys for our digital cinema product. Today the guy who has to use these keys comes over and starts talking about unresolved issues and use cases. My thought was why didn’t he think through these issues and use cases before he asked me for this new architecture? The upshot is that I have to totally redesign the architecture again, back to something a little more complicated than the original, but much less complicated than the one I did this week. And since development has to be finished by the end of next week, I guess I’m going to be billing some hours this weekend. Normally I’d be really annoyed at the wasted effort, but I enjoyed the intellectual challenge of that code I wrote this week.

I work at a company that employs thousands of people. Early this morning, Corporate IT sent out a message to everybody at the company with a Lotus Notes (bleargh) account warning them that a couple of servers are going to be upgraded this weekend and Notes might not be available during the upgrade. That was immediately followed by a deluge of idiots using “Reply All With History” to ask why they were getting this mail, followed by a few dozen people saying “Stop using Reply All”, followed by more people saying “You’re doing it too, you idiot”, followed by the original few dozen saying “if I hadn’t done a Reply All, I’d only be talking to people who’d already done it”, followed by several petty flame wars, followed by still more people saying “take me off this list”. It’s the most traffic I’ve seen on Lotus Notes since I came to the company - as a matter of fact, it probably outnumbers all the Lotus Notes I’ve had in total since I came to the company.

Fortunately my Unix mail account is still working.

I’ve spent the last couple of months doing an all-singing, all-dancing automatic upgrader for our customers’ sites. This process is designed to be totally hands-off - you stick the DVD in the drive and type “upgrade”, and at the end of the theatre day it will convert the main “cms” computer (one per site) and all the “cp” computers (one per projector) from Redhat 7.3 to CentOS 3.4, upgrade from version 3.3 to 3.6 of our software, and magically preserve all your settings and configuration. You should come in the next day to find everything ready for the day’s schedule.

For the very first one at a customer site, though, they sent out a technician to babysit it. Unfortunately they sent a technician who’d never done or witnessed one of our many test upgrades in-house.

You probably guessed what happened - she saw the cms come up, didn’t realize that the cps start after the cms is done, and rebooted the cms at the worst possible time - right when all 18 cps were attempting PXE (network) boots and expecting the cms to be there to send them what they needed. And the cms doesn’t start dhcpd by default, so the cps had nothing to talk to all night. And of course everybody was screaming for me when I got in!

I just got a phone call from my boss. “So with everybody else I should multiply their estimates by four, but with yours I should divide them by four?”

How did this happen? He had called me a bit earlier and said that they’d changed their mind about postponing a “PCR” (problem report) that had been assigned to me, and wanted to know how long it would take to fix it. Since I hadn’t really had a good look at it before they’d postponed it, I said “Maybe four hours.” “Can you have it before you leave for the day?”, he asked. I said that I’d do my best. I looked, and found that 90% of what I needed had already been done elsewhere, so I needed to cut and paste some code, do a tiny bit of tweaking, and Bob’s your uncle. I did a very rudimentary test, and it worked, so I checked it in and marked the PCR as “Resolved” about an hour after I’d given him the four hour estimate.

As Scotty said, “how else would I keep my reputation as a Miracle Worker?”

Our “feature players” are computers with no keyboard, no screen, just a tiny little 4 line dot matrix display with a few touch buttons. One of our sales and demo sites, in China, had a little problem - due to a bug in the version of software they’re still running they desperately need to rename a file before tomorrow morning. And in order to connect a keyboard and screen, or even a network cable so they could ssh in, they’d have to remove it from a rack and move it out of the projection booth. Don’t ask me why.

There were around 5 people at work trying to solve this problem, including my supervisor, the guy who signs the POs to pay us contractors, and two other developers. They called me into the meeting to discuss it at 4:00pm. At 4:15 I’d finished explaining how I’d solve it. At 4:30 I had a first burn of a CD that would solve the problem. At 5:00pm I’d fixed a couple of strange little problems with the first two cuts at it, and had an ISO that they could email to the site in China, have them burn it, and put it in the DVD drive and fix their problem.

(Technical details follow, you might want to skip this part)
Continue reading ‘I’m da man, baby’ »

You know what’s even more annoying than having to reburn a DVD and spend two hours preparing a test? When you go to burn the DVD and it hangs up your entire computer, flashing the caps lock and scroll lock keys on your keyboard, forcing you to power cycle the computer. And it happening not just once, but four times with your last two DVD blanks. THAT is annoying.

You know what’s annoying? Having to rebuild two systems back to the previous version, burn a new upgrade DVD (when you’ve only got 2 DVD blanks left) and upgrade the two test systems back to the new version with the upgrade DVD, all because you put “||” where you should have put “-o”. That’s annoying.