Oh, that isn’t good.

I was trying to tar a bunch of stuff off a USB backup disk onto the new machine, and it suddenly started throwing all sorts of errors and couldn’t read any drives, not even the root drive to find the shutdown command.

First thing I’m going to check is moving the drives around, because I accidentally put the two new drives in the third and fourth slots instead of one and two, so I’m going to fix that. If that doesn’t help, then I’m going to just turn off the Adaptec RAID controller and try a software RAID. If that doesn’t work, I don’t know what I’m going to do. Probably return the hardware and start again.

Setting up new server checklist

Don’t mind me, I’m just recording what I’ve done so far in setting up my new box.

  • Ordered new server
  • Ordered new rails for server
  • Ordered two 1Tb drives for server
  • Installed drives in server
  • Discovered rails were the wrong kind for this server
  • Grovelled around the net and found the right type of rails, ordered them.
  • Installed Debian on the server.
  • Tried just blasting the entire backup of the old server onto the new one. That was a disaster, so went to Plan B.
  • Discovered that i386 Debian works fine, except neither the Xen nor the Bigmem kernels boot.
  • Downloaded and tried to install ia64 Debian, only to discover that’s the one for Itanic.
  • Downloaded and installed amd64 Debian. Xen kernel working fine.
  • Installed and configured munin. Discovered smartd doesn’t work because I’m using an Adaptec RAID controller. Tried to install dpt-i2o-raidutils, but they don’t seem to work either. Copied some third party munin Xen nodes from old box backup.
  • Installed sshd. Copied “authorized_keys” from old backup, configured it to only allow public key authentication.
  • Configured the dom0 to take less memory. 96M was plenty on the old box, but this one didn’t boot until I increased it to 128M.
  • Made LVM disks for the domUs.
  • Copied one of the backups. Had to change the sxp file to specify the amd64 kernel, and copy the /lib/modules/*-xen-amd64 to the disk space. It boots, but for some reason it won’t start up the network.
  • Copied another backup. This time it booted the amd64 kernel just fine, but got a lot of errors on start up. But it did connect to the internet and stuff, so I’m not sure how critical the things that didn’t start up were. May have to try installing an i686 kernel and booting the xen instances with that.
  • The box rebooted spontaneously while trying to copy a lot of files over at once. Will have to try again without the memory restrictions (and maybe with the non-xen kernel). Will also have to make sure that it doesn’t do anything bad if one of the domUs is doing heavy i/o.
  • Tried again copying everything over with the non-xen kernel with 4Gb, and it still died.
  • Tried to disable the RAID controller, which didn’t work. So made 4 separate 1-disk “Volumes”, and went back to installing Debian amd64 again.
  • Configured /dev/sda with 2Gb /boot, 1Gb swap, rest available, and /dev/sdb with 2Gb /, 1Gb swap, and rest available. Made the “available” parts of the two disks into an md0 software RAID 1, then made that into a PV for LVM.
  • Overnight untarring of backups of mp3s and xen1 didn’t crash it. Woo hoo!
  • Installed sshd, copied config from old dom0, tested sshing in with a public key.
  • Installed xen stuff, and munin-node.
  • Untarred backups of xen2-3.
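For the record, the RAID-plus-LVM step in the checklist above boils down to roughly the following. This is a minimal sketch, not a transcript of what I typed: the partition names (/dev/sda3 and /dev/sdb3) and the volume group name (vg0) are assumptions made for illustration, and the script only prints the commands rather than running them.

```python
def raid1_lvm_commands(members, md_device="/dev/md0", vg_name="vg0"):
    """Return the commands that mirror two partitions and put LVM on top.

    `members` are the two "rest available" partitions; the names passed
    in below are assumed, not taken from the real box.
    """
    if len(members) != 2:
        raise ValueError("this sketch only handles a two-disk mirror")
    return [
        # Build the RAID 1 mirror out of the two partitions.
        ["mdadm", "--create", md_device, "--level=1",
         "--raid-devices=2", *members],
        # Turn the mirror into an LVM physical volume.
        ["pvcreate", md_device],
        # Create a volume group on it (vg0 is a made-up name).
        ["vgcreate", vg_name, md_device],
    ]


if __name__ == "__main__":
    for cmd in raid1_lvm_commands(["/dev/sda3", "/dev/sdb3"]):
        print(" ".join(cmd))
```

Printing the commands instead of executing them means I can eyeball the device names before anything destructive happens.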

Next steps:

  • Copy the backups verbatim onto those disks, and hope like hell that Xen can boot them.

IT’S ALIVE!

The GoogleBox lives!


Yes, after 4 days of downtime, my illustrious yellow 1U server has been revived from the dead. After it died, I asked the colo provider to power cycle it, and they said it didn’t come back up. I asked them to yank the box, and I picked it up and brought it home. Expecting a power problem, I first tried yanking all the hard drives and memory, but even then didn’t get any beeps or other activity. So then, I tried yanking one of the CPUs. I must have gotten lucky, because removing the #2 CPU got me a couple of POST beeps, and when I put back the memory and the hard drives, it booted just fine.

I’ve had this box since January 2007, and the CPUs are dated from 2001, so I guess it’s time to replace it. I ordered a slightly newer box off eBay that has twice as much memory and 4 SATA drive bays. With two 1Tb drives, it will have much more disk space, but more importantly the empty drive bays mean that if I need to expand, I can add newer bigger drives when they become available. I’m considering using software RAID to mirror the two drives, because even 1Tb is bigger than what I have now and I’m not hurting for space. And with lvm, I can plonk in two new 2Tb drives when the time comes, migrate the volume groups to the new drives, and yank the old ones. All that remains now is to decide whether to build a new OS and get everything working on it, or just restore from a backup and continue the upgrade path.
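The "plonk in new drives and migrate" plan would look something like the sketch below. It assumes the old mirror is /dev/md0, the new pair has already been mirrored as /dev/md1, and the volume group is called vg0; all three names are hypothetical. As written it only prints the LVM commands in order, it doesn't run them.

```python
def migration_commands(old_pv, new_pv, vg_name="vg0"):
    """Return the LVM commands that move a volume group to a new PV."""
    return [
        ["pvcreate", new_pv],           # make the new mirror a physical volume
        ["vgextend", vg_name, new_pv],  # add it to the existing volume group
        ["pvmove", old_pv, new_pv],     # shift all extents onto the new PV
        ["vgreduce", vg_name, old_pv],  # drop the old PV from the group
        ["pvremove", old_pv],           # wipe the LVM label off the old PV
    ]


if __name__ == "__main__":
    for cmd in migration_commands("/dev/md0", "/dev/md1"):
        print(" ".join(cmd))
```

The order matters: the old physical volume can only be taken out of the volume group once pvmove has emptied it.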

While I had the box home for a few days, I took the opportunity to do a long-delayed upgrade from Debian “etch” to Debian “lenny”. I didn’t want to tackle that remotely because there was a significant chance (and it happened several times) that I was going to get it into the situation where I needed to intervene at the Grub stage, and I couldn’t do that remotely (because the cheap colo facilities don’t give you remote boot consoles like the expensive ones do.) The biggest hassle of this upgrade was that I had to do some messing around to get a console to appear, changing the boot options on the box itself, and also the getty lines in inittab of each of the guest “domU”s. The second biggest hassle was that I had to install “udev” on all the guests so that ssh could work. Also while the box was home I took the opportunity to make a backup of the whole thing, including the guests that don’t belong to me. Normally I just back up my own. That should make setting up a new box a lot easier.

I got all the fixing and upgrading and backing up done early this morning. I brought my box over to the colo company office at 10:15. And waited. And waited. And waited. I had a “ping -a” running so I’d know as soon as it came back. And I waited some more. The business office is the other side of town from the colo facility, so I figured there would be some delay. The company that used to own those racks would let me go out to the facility with them, but these guys bought the business from that company and they’re anal about security and won’t let me go. Well, it turns out that their scheduled visit to the colo facility was at 10pm – nobody told me that, of course, until after I’d started panicking that they’d all gone home for the weekend without racking my box. But here it is – they racked the box, phoned me to say it was powering up, and now I’m connected again. Hallelujah!

Chrome – still not great

Update: A few hours after I wrote that, I decided to quit and restart Chrome to free up some memory, and now none of the extensions I installed are showing up.

A few weeks ago, I wrote about my experiences with Google Chrome on the Mac. At the time, Chrome on the Mac was lagging quite far behind the Windows version. Supposedly now it’s all caught up, and so I’m going to revisit my previous complaints:

  • It frequently lost the text cursor in text input fields, especially on GMail.
    • Still happens.
  • It seemed much slower and more likely to corrupt the display compared to Safari in Google Wave.
    • I haven’t been using Wave, so no comment.
  • It had a bad habit of undocking a tab on the slightest provocation.
    • Still happens.
  • The fact that the tabs take up space in the window frame means that you’d frequently undock a tab when you were trying to move the whole window.
    • Still happens. There is a tiny bit of real-estate near the “+” to open a new tab that is still available, but it’s a pain to grab.
  • It doesn’t have a “Reload all tabs” option. Supposedly there is an extension for that, but in order to use extensions I’d have to upgrade to the latest development build. That’s more work than I’m willing to do when it has all these other problems.
    • I found an extension that will reload individual tabs on a schedule rather than the whole window on demand. That’s actually nicer than having to reload everything manually. It’s not bad, except when your computer goes to sleep you have to restart it by reloading all the tabs individually. Plus when it reloads on a tab that is on a different Space than the one you’re on, it will switch back to that Space, but that’s a Spaces problem not a Chrome problem.
  • It doesn’t recognize or tell you about RSS feeds. In Safari or Firefox, any page that has an RSS feed displays an icon, and if you click it, the OS opens the feed in the currently configured RSS reader. The functionality is so ingrained in browsers that many pages don’t seem to have any other indication that they have RSS feeds. Once again, I’m told that Chrome has a plug in for that. Once again, too much trouble.
    • The only RSS plug ins I could find will add the RSS feed to a web based RSS reader like Google Reader. There is no support I can find anywhere for the OS-defined RSS reader. So I’m experimentally putting NetNewsWireLite out to pasture in favour of Google Reader. Not bad, but not great.

So overall, it’s got a few user interface annoyances, but the really big sticking points have been taken care of by plugins. And I was happy. Until today. And that’s when I discovered that Google Chrome is utterly useless for a web developer – there appears to be no way to make it reload your javascript file that you’ve just changed unless you go to “File->Clear Browsing Data”, uncheck everything except “Empty the cache”, click “Clear Browsing Data”, and wait, and wait, and wait. In normal web browsers, you just have to hit shift-reload on your page and it will reload that page and all the attendant files, including CSS and JavaScript files. That’s it, I’m switching back to Safari (or maybe Firefox) for the page I’m developing.

Oh, plus the built in “Developer Tools” in Chrome suck in comparison with Firebug, but that’s apples to oranges since Firebug is a plugin.

Spot the irony

Update: It turns out that the way I’ve been creating the smart playlist, with “Genre = Podcast”, which worked for years now, suddenly stopped working. Changing it to “Media Kind is Podcast” and making it sync under the Podcast tab worked.

Thanks to the latest iPod and iTunes updates from Apple, the iPod, the very device that “Podcasts” are named after, has become useless for listening to podcasts the way I want to listen to them.

The way I like to listen to podcasts is in the car, while driving, a time when I probably shouldn’t be poking around the screen of my iPod instead of watching the road. But Apple, in its infinite wisdom, made podcasts different from music or audiobooks in that you can’t (by default) click “Play” on them and listen to them one after the other. Instead, you have to pick one, hit play, and when it’s done, find another one, hit play, and lather, rinse and repeat. Until a few days ago, I had a very nice work-around: I made a Smart Playlist that contained “Genre = Podcast + Playcount = 0”. It worked great.
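In case the Smart Playlist rule isn't clear, it's just a filter with two conditions. Here's a minimal sketch of the logic in Python. The Track class and the sample titles are made up for illustration and have nothing to do with how iTunes actually stores its library.

```python
from dataclasses import dataclass


@dataclass
class Track:
    title: str       # hypothetical sample data, not real library entries
    genre: str
    play_count: int


def unplayed_podcasts(tracks):
    """Keep tracks matching 'Genre = Podcast' and 'Playcount = 0'."""
    return [t for t in tracks if t.genre == "Podcast" and t.play_count == 0]


if __name__ == "__main__":
    library = [
        Track("Tech news, episode 12", "Podcast", 0),
        Track("Tech news, episode 11", "Podcast", 1),
        Track("Some song", "Rock", 0),
    ]
    for track in unplayed_podcasts(library):
        print(track.title)
```

Once an episode's play count goes above zero it no longer matches, which is why the ones you've listened to drop off the playlist on their own.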

But now there is a new update for the iPod and iTunes, and they’ve broken it. The playlist still shows, and I can still play it in iTunes and it plays all the way through and the ones you listen to remove themselves from the playlist. Beautiful. But even though that playlist is still checked to sync to the iPod, the playlist doesn’t show up anywhere on the iPod. So how the fuck am I supposed to listen to an hour and forty five minutes of podcasts, some of which are only 3 or 4 minutes long, without spending time poking around on my screen instead of watching where I’m driving?

Maybe it’s time to find a podcast app for my Palm Pre.