Activities – Page 38 – Rants and Revelations

Dammit, Seagate

Back in January, I wrote about how two drives had failed, one brand new Western Digital and one 10 or 11 month old Seagate. I mentioned how the Seagate replacement was a refurb, and I wasn’t thrilled about that.

Well, one of my Seagate drives just started throwing dozens of SMART errors, and freezing up the computer. I’ve filled out an RMA, but I’m ~~90%~~ 100% sure the drive that failed was the damn refurb that they sent me two months ago. I’m tempted to rip them both out and buy some new Western Digitals or Hitachi drives.

Update: I pulled out the bad drive, and sure enough it has the Seagate label that says “Certified Repaired HDD” and my label that says “Installed 23 Jan 2014”. So way to go, Seagate, you repaired the drive well enough to last almost 46 whole days. Better luck next time.

Man, I hate Garmin right now.

I’ve been a huge Garmin fan, right from my first aviation GPS (a Garmin GPSMap 195), through other aviation GPSs, a nuvi in the car, a StreetMap app on my phone, right through to the Forerunner 301 that I liked so much I bought another one when it started acting strangely because they’d been discontinued and I knew I had to act fast before they disappeared. But when I went to Tarifa, I discovered that having an open USB port and salt water don’t go together well. When that Forerunner 301 died, I bought a Forerunner 310XT, which is the nearest equivalent.

The big plus for the 310XT is that it has a completely sealed case, and the big minus is that it follows the trend to a more “watch-like” form factor, so it doesn’t work as well mounted on the front of the kayak. The lack of a USB port means that to sync it to your computer you need to put a little wireless device in one of the USB ports on your computer. It also uses a more modern heart rate strap that uses the semi-standard ANT+ instead of the completely proprietary protocol they used in the strap for the 301. And thus my hate starts.

So far, I’ve discovered the following things that make me hate this horrible thing.

It takes forever to start showing my heart rate. With the 301, I could take the strap, lick the contacts, put it on, and then turn on the GPS, and it would sync and start showing my resting heart rate within a few minutes. With the 310XT, I can repeat those steps and wait literally half an hour, and it won’t show my heart rate. Start exercising, though, and almost exactly 5 minutes later it will start showing it.

Syncing with my computer is a pain in the ass. I don’t know if this is because I’m using a Mac, but every single time I want to sync with my computer, I have to:

Restart Garmin ANT Agent
Choose “Pair New Device” on Garmin ANT Agent
Turn pairing back on on the 310XT because it will have turned itself off
Wait several minutes for the wireless things to sync up

If I do any of these steps wrong, it will get into a state where Garmin ANT Agent is still searching for a new device and the 310XT is in “Data Transferring” mode, in which case you have to shut down both the device and the program and try again. Also, sometimes you’ll need to accept the pairing request on the device. Also, I can’t sync with both Garmin Connect (their web based tracking system) and Garmin Training Assistant (the desktop tracking system). Compare and contrast with the 301, where all I had to do was plug in the USB cable and then I could sync with both the web app and the desktop app in about 30 seconds flat.

Their tech support SUUUUUUCKS. Ok, so a while ago Garmin Connect said I should download a new program to replace Garmin ANT Agent called Garmin Express. So I did – but because I’m not an idiot, I didn’t remove Garmin ANT Agent. And good thing, because you go through Garmin Express and it pairs with your device (which is every bit as painful as with Garmin ANT Agent), starts downloading your data, and says “While we’re downloading, connect to Garmin Connect” and wants your userid and password. But then it gives an error:
Note that it says it’s “temporary”, but it’s been happening for a week. That’s not “temporary” in my book.

I wrote to Garmin’s tech support about this “temporary” problem, and the response I got back tells me how to pair my device (note that the message happens after you’ve paired), and tells me to click on an icon and choose something from a menu, but neither the icon nor the menu exists in this software. Which makes me wonder if the person answering my question read the part where I said I was on a Macintosh, or the part where I gave the full text of the response I got, or the part where I said I got that *after* it paired and started downloading. So Garmin’s tech support isn’t exactly a reason to keep using Garmin.

So at this point, I’m looking to find either a device or an add on for my iPhone that does the following:

Displays heart rate, speed, time, distance
Has some training assistance like being able to set up interval workouts, but Garmin’s virtual training partner thing was kind of cool too.
Mounts on the front of my kayak or the footstrap area of my surf ski
Doesn’t die in salt water
Syncs with some sort of training tracker, mapper, etc.
Has a heart rate strap that doesn’t wait until you’re 5 minutes into your workout to start showing your heart rate.

Defeating the popup blocker

I’m doing a WordPress site for a local business that involves a lot of custom PHP programming, which is interesting because I’ve never done PHP before. But heck, a language is a language and you can learn anything by googling these days.

So one of the things this site does is collect a bunch of information, and then submit it to a third party who then returns a URL for the specific payment page for that specific reservation, and you’re supposed to redirect the end user to that page to pay. I had that going where there was a WordPress shortcode that generated the form, and another WordPress shortcode on the destination page that did all that stuff, and then used a cheesy Javascript window.location=$url; thing to redirect it. That worked.

But the client had a look at it and didn’t like the fact that the end user ended up on a different site, and wanted it to pop up the payment page on a different tab or page. So I changed the Javascript to do window.open($url, "_blank");, but I found out that this causes every browser in the world to see that as a popup and block it. Asking end users to disable their popup blockers is probably a no-no.

Fortunately I discovered this post. He specifically talks about Chrome, but it also seems to work on Firefox, Mobile Safari, and even IE8. So I changed the form submit button into an ordinary button. Then I added a click handler to it that immediately opens a new window with some hopefully quick-loading bogus content. The window has to be opened right there in the click handler: if you delay it, say by single stepping with a Javascript debugger, it triggers the popup blocker. The handler then makes an AJAX call to get the URL, and in the “done” handler for the call it does a w.location = data.url; to redirect the new window to the correct URL, and then does a “submit” on the form to take you to the correct new page on the original site. The Javascript code ended up looking like:


    
    /* When you submit the booking, make a popup window! */
    $('#info-form-fake-submit').on('click', function(eventObject) {
        // Open the window right away, inside the click handler, so the
        // popup blocker treats it as user-initiated.
        var w = window.open(ajaxurl + "?action=pt_fake_page");
        $('body').addClass('loading');
        var $form = $('form.pt-form');
        $.ajax({
            url: ajaxurl,
            type: 'POST',
            dataType: 'json',  // jQuery option names are case-sensitive
            data: $form.serialize() + '&action=pt_complete_reservation'
        }).done(function(data, textStatus, jqXHR) {
            if (data.status == 'good') {
                // Point the already-open window at the payment page,
                // then move the original page along.
                w.location = data.url;
                $form.submit();
            } else {
                alert(data.msg);
            }
        }).fail(function(jqXHR, textStatus, errorThrown) {
            alert(textStatus + ': ' + errorThrown);
        }).always(function() {
            $('body').removeClass('loading');
        });
    });

Drive replacements…

So after my last post, I discovered that one of the two new 3Tb Western Digital drives is throwing SMART errors, as is one of the older 2Tb Seagate drives. Well, the new one is brand new, just a few weeks old, and the older 2Tb one is just under a year old, so it’s still under warranty, so time to test the two RMA processes side by side.

I put in both RMAs on the same day. I had some problems with the Seagate web site, but I didn’t make careful note of the details and I forget exactly what the problem was. In both cases, I opted for the “advanced replacement” service where they send you the replacement drive first, and then you use that box to send back the defective one. I don’t recall either of them offering a more expedited version of the service.

The WDC drive took a few days to arrive. When the WDC one arrived, there was an option on their web site to click a single link and buy a UPS shipping label with the return address and RMA number and stuff all pre-printed. Very nice. When I went to the WDC support site dashboard, it already showed the new drive’s serial number as registered to me, and it had removed the defective one from my list of registered drives. The only problem: the dashboard showed the warranty period on the replacement one as expiring in 5 months. That’s a bit odd. I put in a ticket to ask about it and they said that when the defective one arrives back, they’ll update the warranty period back out to three years. We’ll see.

The Seagate one took over a week to arrive. The replacement has a big “REFURBISHED” label on it. I guess it’s unreasonable to expect a new drive to replace a year old drive, but one can live in hope, right? They sent me an email with instructions for returning it, including helpfully putting the return address and order number on page 4 of a 7 page email and suggesting I print it out and use that as a “mailing label”. That email also told me that I’d opted for “Ground Advanced Replacement” and if I’d opted for “Advanced Replacement” instead I would have gotten 2 day shipping on my replacement and a pre-paid shipping label for the return, all for $9.95. I don’t recall ever being offered this, or if I was, I wasn’t told how it differed from the free service. Still, the order confirmation is probably the wrong time to tell you what you should have ordered instead. Anyway, I guess I’ll be trudging off to the UPS store to get this shipped tomorrow.

Ok, now that I’ve told you why Western Digital rules and Seagate drools, I’ll tell you about my replacement experiences.

When the first drive arrived, I shut down my computer, yanked the bad drive, put in the new drive, and rebooted. I got a message asking me if I wanted to start the RAID in degraded mode, and I did. Everything started up perfectly. I did the parted and mdadm magic to make the partitions on the new /dev/sdb and get it into the RAID, and after everything rebuilt it was right as rain. The number of odd DMA errors appearing in /var/log/kern.log went down to zero.

When the second drive arrived, I attempted the exact same thing. I shut down, yanked the bad one, put in the new one, and powered it up, and it refused to boot. Uh oh. Carefully checked the serial number on the drive to make sure it was the defective one. Checked in the BIOS to see if it was seeing all the drives. But when I booted, I never saw the message asking me if I wanted to boot with the degraded RAID, it just hung. Put the defective drive on a spare SATA controller and booted, and it booted fine. Hmmmm. Used the appropriate mdadm commands to fail and remove the defective drive, and add the new one to the RAID. Tried grub-install, and it gave a non-fatal error about a device named “null”, but when I attempted to boot without the defective drive, got a grub error about being unable to find bios-i386-pc or something like that. Tried booting from all 4 disks, and got the same error. So I booted with the defective drive still installed, and waited 24+ hours for the RAID to finish rebuilding. Once it finished, I was able to do a grub-install and it didn’t give that strange error, and afterwards I was able to boot with the defective drive safely back in its shipping box. Phew.
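
For anyone repeating this, the order of operations that ended up working can be written down as a dry-run plan. This is only a sketch: the function name plan_raid_swap and the device names in the comments are made up for illustration, and it prints the commands instead of running them so they can be checked against the real layout first. The one thing it tries to capture is the lesson from the second drive: in this case grub-install only behaved once the rebuild had finished.

```shell
#!/bin/sh
# Dry run: print the commands for swapping one member of an mdadm RAID-1.
# Nothing here touches a disk; review the output before running any of it.
plan_raid_swap() {
    md=$1         # the array, e.g. /dev/md0 (placeholder)
    old_part=$2   # failing member, e.g. /dev/sdb1 (placeholder)
    new_disk=$3   # replacement disk, e.g. /dev/sde (placeholder)
    new_part=$4   # partition created on it, e.g. /dev/sde1 (placeholder)

    echo "mdadm --manage $md --fail $old_part"
    echo "mdadm --manage $md --remove $old_part"
    echo "# partition $new_disk with parted to match the surviving member"
    echo "mdadm --manage $md --add $new_part"
    echo "# wait until /proc/mdstat shows the rebuild has finished"
    echo "grub-install $new_disk"
}
```

Calling plan_raid_swap /dev/md0 /dev/sdb1 /dev/sde /dev/sde1 prints six lines, three of them mdadm commands, ending with the grub-install.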

More upgrades

So back when I wrote this post, my system had 2 21″ 1080p monitors, and it had a pair of 500Gb drives and another pair of 1Tb drives. But I didn’t stand still.

Over time, I replaced one of those monitors with a 27″ WQHD IPS LED monitor. That’s a lot of letters, but the important thing is it is very big, and it has a lot of pixels, and it’s beautifully sharp. I had hoped when I got it that I’d be able to keep both of the 21″ monitors as well, because the motherboard has built-in HDMI and DisplayPort, but it appears that when you have an external graphics card the built-in graphics shut down. I may be wrong about that, but I never discovered a way to enable it.

A few weeks ago I decided to treat myself and bought a second video card (this time an nVidia GeForce GT 620) to go into the second PCIe slot. That took a bit of fiddling around until I discovered that I needed to enable “xinerama” on the nVidia Settings in order to get all three monitors so I could drag windows from one to the other and cut and paste from one to the other. Without that setting it was acting like I had the two original monitors that acted like they had before, and a third monitor that would have nothing to do with them. Interestingly enough, though, KDE’s keyboard settings will no longer work – I had to go into the xorg.conf file and manually add a setting to swap the control and caps lock keys. It also doesn’t have the numlock key on by default any more, although I haven’t manually fixed that. There are some weird little graphic glitches, especially on the login screen.
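
For reference, a control/caps lock swap in X configuration usually looks something like the snippet this function prints. This is a sketch and not necessarily the exact setting used here: the function name and the Identifier string are made up, while ctrl:swapcaps is the standard XKB option for the swap.

```shell
#!/bin/sh
# Print an X.Org InputClass section that swaps Control and Caps Lock on
# every keyboard. The Identifier text is arbitrary.
print_swapcaps_snippet() {
    cat <<'EOF'
Section "InputClass"
    Identifier "swap ctrl and caps lock"
    MatchIsKeyboard "on"
    Option "XkbOptions" "ctrl:swapcaps"
EndSection
EOF
}
```

The output can be pasted into xorg.conf or saved as its own file under /etc/X11/xorg.conf.d/.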

I also replaced my lovely and clicky Unicomp keyboard, which was experiencing some odd behaviour after I spilled Diet Coke on it, with a dasKeyboard Professional Model S. It has MX Blue switches, which means it has almost as good a feel as the Unicomp, but it’s not as noisy.

Over the intervening years, I’ve also upgraded the disks. I haven’t really needed more space (although I’m getting more profligate about keeping stuff I formerly would have deleted), but as disks get old I’ve replaced them with bigger ones. So I went from the 2x500Gb + 2x1Tb to

2x1Tb + 2x2Tb
2x2Tb + 2x2Tb

There might have been an intermediate step along the way I left out. In each case, I’ve used fdisk to put a single partition on each new disk (because although I could just add the raw device to a RAID, I’ve found that making a partition allows you to use it for booting later, and as disks age out I’ve made the second set of disks into the first one, etc.) Then I’ve made them into a RAID-1 using mdadm --create /dev/mdNNN --level=1 --raid-devices=2 /dev/sdc1 /dev/sdd1, then migrated everything off the old pair to the new pair using pvcreate /dev/mdNNN; vgextend lvm2 /dev/mdNNN; pvmove /dev/mdMMM; vgreduce lvm2 /dev/mdMMM. Then I’ve made sure grub knows about the new disks using grub-install /dev/sda, and usually I’m good to go.
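
The migration sequence above can be laid out as a dry-run plan, one command per line. This is a sketch under a few assumptions: the function name plan_lvm_migration is made up, the array names are passed in where the post writes mdNNN and mdMMM, and the pvs line is not part of the original sequence. It is added as a cheap way to see how big the new physical volume really is before committing to hours of pvmove.

```shell
#!/bin/sh
# Dry run: print the RAID-1 + LVM migration steps described above, in order.
plan_lvm_migration() {
    vg=$1        # volume group name (the post uses "lvm2")
    new_md=$2    # new array, stands in for /dev/mdNNN
    old_md=$3    # old array, stands in for /dev/mdMMM
    part_a=$4    # first partition of the new pair
    part_b=$5    # second partition of the new pair

    echo "mdadm --create $new_md --level=1 --raid-devices=2 $part_a $part_b"
    echo "pvcreate $new_md"
    echo "pvs $new_md   # added check: is the new PV as big as expected?"
    echo "vgextend $vg $new_md"
    echo "pvmove $old_md"
    echo "vgreduce $vg $old_md"
    echo "# then grub-install on the boot disk (the post uses /dev/sda)"
}
```

For example, plan_lvm_migration lvm2 /dev/md3 /dev/md1 /dev/sdc1 /dev/sdd1 prints the sequence with the new pair going into /dev/md3 and everything moving off /dev/md1 (both array names are examples only).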

A couple of weeks ago, one of my 2Tb drives reported some SMART errors. Nothing bad enough to trigger an email report, but enough to turn an orange warning flag on munin (basically smartctl returns an error code of 192). I took the machine down and ran SEATOOLS and it found a couple of sector errors and offered to repair them. I repaired them, and everything was fine for a week or so and then the same thing happened. At that point I decided to buy new disks. But since 3Tb disks are now as cheap as 2Tb disks back when I bought them, I figure it’s time to upgrade. So I bought the 3Tb drives.
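
The 192 is less mysterious than it looks: smartctl’s exit status is a bit mask, with the meaning of each of the eight bits listed in its man page. A small sketch for decoding it follows; the function name is made up and the bit descriptions are paraphrased from the man page.

```shell
#!/bin/sh
# Decode a smartctl exit status (a bit mask) into one line per set bit.
decode_smartctl_status() {
    status=$1
    bit=0
    for meaning in \
        "command line did not parse" \
        "device open failed" \
        "a SMART or ATA command failed" \
        "SMART status check returned DISK FAILING" \
        "prefail attributes at or below threshold" \
        "attributes were at or below threshold in the past" \
        "device error log contains records of errors" \
        "self-test log contains records of errors"
    do
        # Test bit number $bit of the status value.
        if [ $(( (status >> bit) & 1 )) -eq 1 ]; then
            echo "bit $bit: $meaning"
        fi
        bit=$((bit + 1))
    done
}
```

For 192 (128 + 64) this prints bits 6 and 7: the drive’s error log and its self-test log both contain errors.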

At first when they came, I pulled out one pair of 2Tb drives and checked the model numbers. They agreed with the model numbers that SEATOOLS had reported for the bad drives, so I put the new 3Tb drives in their place, and connected the old ones up to the “other” SATA cables, the ones that don’t correspond to a drive carrier where the disks just sit on the floor with the cables hanging out. I booted everything up, and went through the whole process – the pvmove is the worst part of it, because it takes a few hours. After it was done, I was looking at stuff and realized that the new 3Tb drives had only made a 2Tb RAID! Wish I’d noticed that before I’d done all the time consuming stuff. Turns out fdisk doesn’t support 3Tb drives (presumably because it was writing an old-style MBR partition table, which tops out at about 2Tb), and it doesn’t give you any warning before it makes a 2Tb partition on your 3Tb drive. So I did the same commands in reverse to move all the content back off the new drives onto the old ones. Then I used GNU parted instead of fdisk to create gpt partitioned disks with one 3Tb partition each. Went through the hours of migration, and it wouldn’t boot. It would boot with the old disks hanging off the side, but not if I took them out. A bit of reading revealed that there was a problem with grub and gpt disks – you needed to create a first small partition for grub to install its image to, and then the second big partition to make the big RAID on. So off I went, migrating back, repartitioning, migrating forward. All in all, a lot of wasted hours spent on this.
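
Both of those surprises can be sketched in a few lines. The first function checks whether a disk is too big for an MBR partition table, and the second prints parted commands for the layout that grub wanted: a tiny partition flagged bios_grub, then one big partition for the RAID. The function names, the 1MiB/2MiB boundaries, and /dev/sdX are assumptions for illustration, and the commands are printed, not run.

```shell
#!/bin/sh
# An MBR partition table stores sector counts in 32 bits, so with 512-byte
# sectors it can address roughly 2^32 * 512 bytes (2 TiB).
MBR_MAX_BYTES=$((4294967296 * 512))

# Succeeds (exit 0) if a disk of the given size in bytes is too big for MBR.
needs_gpt() {
    [ "$1" -gt "$MBR_MAX_BYTES" ]
}

# Dry run: print parted commands for a GPT disk with a small partition for
# grub's image followed by one big partition for the RAID.
plan_gpt_layout() {
    disk=$1   # whole-disk device, e.g. /dev/sdX (placeholder)
    echo "parted -s $disk mklabel gpt"
    echo "parted -s $disk mkpart primary 1MiB 2MiB"
    echo "parted -s $disk set 1 bios_grub on"
    echo "parted -s $disk mkpart primary 2MiB 100%"
    echo "parted -s $disk set 2 raid on"
}
```

A nominal 3Tb drive (3000000000000 bytes) is over the limit and a 2Tb one (2000000000000 bytes) is under it, which matches the 2Tb drives having been fine with fdisk.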

But after all that work, the damn thing still wouldn’t boot. I could plug the old disks back in and boot, but as far as I could tell, the raid that included those two disks wasn’t used for anything – it wasn’t in pvdisplay, and I could mdadm --stop it and the system would keep going. But if I unplugged it, it wouldn’t boot – it would show the Grub prompt screen, tell me that it was unable to read /dev/fd0 (which is odd, because I don’t have a floppy), say it was unable to read lvm/lvm2-boot, and then throw me to the grub-rescue prompt which is utterly useless. But as long as I got those two drives plugged in, it was booting so who was I to complain?

I asked on askubuntu.com, and didn’t get any response. I asked on the ubuntu forums, and got a response from a moderator who said “I don’t know much about RAID and lvm”, but who then proceeded to assert about 7 different things that were completely untrue about RAID and lvm. He also demanded that I run this tool, “boot-repair”, which would magically cure everything. Except for a couple of problems:

The documentation for the tool says that you can start it “from the command line”, but what they really mean is that you can start it “from the command line” if you’re running in a graphical environment. It doesn’t work if you’re away from home and sshed in. Minor, but annoying.
It wants to destroy your existing mdadm setup and replace it with dmraid. That’s a big nope.
It sends a lot of information about your system to a pastebin file, without giving you any option to edit or update some of that before it shares it with the world. Hey, look at that, it dumped out some disk sectors that have an email on it!
After gratuitously mucking about with my system, it didn’t actually fix anything.

Oh, and when you write to the authors of the tool, as the tool itself recommends you do if it didn’t fix your problem, you get an email that basically demands you donate some money to them before they’ll look at your pastebin file.

I tried asking on G+ as well, but the only advice I got there was a suggestion that I give up trying to boot from the gpt drive and install an SSD to boot from.

Anyway, in order to diagnose some more, I tried booting from the Kubuntu 13.10 install disk, and trying the “Try” option to get a liveCD environment working. With the “old” disks not installed, I was able to assemble two RAID-1s, one 3Tb and one 2Tb. So far so good. But then I noticed something that made my heart sink – pvdisplay was showing the 3Tb drives, but it was listing the other pv as “missing”. I suddenly realized why I wasn’t able to boot after taking out what I thought was the older failing pair of disks – because I’d taken out the newer, non-failing pair. Because mdadm and lvm successfully insulate you from worrying about getting disks in the right place and the right order, I had assumed that because the pair I was migrating away from was showing up as /dev/sde1 and /dev/sdf1, that they were the ones that I had outside the disk caddies sitting on the ground. But in actual fact, the ones sitting on the ground were /dev/sdb1 and /dev/sdd1. I was fooled because device letters don’t map 1:1 with specific SATA cables on the motherboard: if you put in extra drives they might end up between the ones you had before. With a growing mixture of trepidation and excitement, I checked the part numbers as well as the model numbers on the 4 2Tb disks, and confirmed my mistake. I put the newer 2Tb drives back in the caddies, and removed the older 2Tb drives, and everything booted correctly. And God alone knows how many steps back things would have worked if I’d bothered to check these part numbers earlier. I probably wouldn’t have had to inadvertently post the contents of an old email on pastebin, that’s for sure.
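
One way to avoid this kind of mix-up is to go by serial number instead of drive letter. On Linux, udev keeps symlinks under /dev/disk/by-id whose names include the model and serial number, each pointing at whatever sdX name the kernel handed out on this boot. A minimal sketch that lists them follows; the function name is made up, and it assumes SATA drives, whose links start with ata-.

```shell
#!/bin/sh
# List whole-disk by-id links as "model_serial -> device name".
# The directory can be overridden, which is handy for trying it out.
list_disk_ids() {
    dir=${1:-/dev/disk/by-id}
    for link in "$dir"/ata-*; do
        [ -L "$link" ] || continue                      # no match, or not a symlink
        case $link in *-part[0-9]*) continue ;; esac    # skip partition links
        printf '%s -> %s\n' "${link##*/}" "$(basename "$(readlink "$link")")"
    done
}
```

Running list_disk_ids with no arguments prints one line per drive, and the serial number in each name can be compared with the label on the physical disk before anything gets unplugged. smartctl -i /dev/sdX prints the serial number as well.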

As of right now, /dev/sda and /dev/sdb are 3Tb drives, and /dev/sdc and /dev/sdd are 2Tb drives, and now that I have an extra Tb to play with, I have to figure out how to allot it. I currently have no way of knowing whether the system is booting from /dev/sda or from /dev/sdc, and I’m also not 100% sure that the 2Tb pair that I removed are the ones that had the problems in the first place. I think I’ve got a few extended SEATOOLS sessions ahead of me.