Drive replacements…

So after my last post, I discovered that one of the two new 3Tb Western Digital drives is throwing SMART errors, as is one of the older 2Tb Seagate drives. Well, the new one is brand new, just a few weeks old, and the older 2Tb one is just under a year old, so it’s still under warranty, so time to test the two RMA processes side by side.

I put in both RMAs on the same day. I had some problems with the Seagate web site, but I didn’t make careful note of the details and I forget exactly what the problem was. In both cases, I opted for the “advanced replacement” service where they send you the replacement drive first, and then you use that box to send back the defective one. I don’t recall either of them offering a more expedited version of the service.

The WDC drive took a few days to arrive. When the WDC one arrived, there was an option on their web site to click a single link and buy a UPS shipping label with the return address and RMA number and stuff all pre-printed. Very nice. When I went to the WDC support site dashboard, it already showed the new drive’s serial number as registered to me, and it had removed the defective one from my list of registered drives. The only problem: the dashboard showed the warranty period on the replacement one as expiring in 5 months. That’s a bit odd. I put in a ticket to ask about it and they said that when the defective one arrives back, they’ll update the warranty period back out to three years. We’ll see.

The Seagate one took over a week to arrive. The replacement has a big “REFURBISHED” label on it. I guess it’s unreasonable to expect a new drive to replace a year-old drive, but one can live in hope, right? They sent me an email with instructions for returning it, including helpfully putting the return address and order number on page 4 of a 7-page email and suggesting I print it out and use that as a “mailing label”. That email also told me that I’d opted for “Ground Advanced Replacement” and if I’d opted for “Advanced Replacement” instead I would have gotten 2 day shipping on my replacement and a pre-paid shipping label for the return, all for $9.95. I don’t recall ever being offered this, or if I was, I wasn’t told how it differed from the free service. Still, the order confirmation is probably the wrong time to tell you what you should have ordered instead. Anyway, I guess I’ll be trudging off to the UPS store to get this shipped tomorrow.

OK, now that I’ve told you why Western Digital rules and Seagate drools, I’ll tell you about my replacement experiences.

When the first drive arrived, I shut down my computer, yanked the bad drive, put in the new drive, and rebooted. I got a message asking me if I wanted to start the RAID in degraded mode, and I did. Everything started up perfectly. I did the parted and mdadm magic to make the partitions on the new /dev/sdb and get it into the RAID, and after everything rebuilt it was right as rain. The number of odd DMA errors appearing in /var/log/kern.log went down to zero.

When the second drive arrived, I attempted the exact same thing. I shut down, yanked the bad one, put in the new one, and powered it up, and it refused to boot. Uh oh. Carefully checked the serial number on the drive to make sure it was the defective one. Checked in the BIOS to see if it was seeing all the drives. But when I booted, I never saw the message asking me if I wanted to boot with the degraded RAID, it just hung. Put the defective drive on a spare SATA controller and booted, and it booted fine. Hmmmm. Used the appropriate mdadm commands to fail and remove the defective drive, and add the new one to the RAID. Tried grub-install, and it gave a non-fatal error about a device named “null”, but when I attempted to boot without the defective drive, got a grub error about being unable to find bios-i386-pc or something like that. Tried booting from all 4 disks, and got the same error. So I booted with the defective drive still installed, and waited 24+ hours for the RAID to finish rebuilding. Once it finished, I was able to do a grub-install and it didn’t give that strange error, and afterwards I was able to boot with the defective drive safely back in its shipping box. Phew.
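
Since grub-install only behaved once the rebuild had finished, it may be worth checking the rebuild state before trying it. Here is a minimal Python sketch, assuming the usual Linux /proc/mdstat layout. The function name `rebuilding_arrays` and the sample text are made up for illustration; the sample is not output captured from the machine described here.

```python
import re


def rebuilding_arrays(mdstat_text):
    """Return {array_name: percent_done} for arrays that are still rebuilding."""
    progress = {}
    current = None
    for line in mdstat_text.splitlines():
        # An array stanza starts with a line like "md0 : active raid1 sdb2[2] sda2[0]".
        header = re.match(r"^(md\S+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # Rebuild progress shows up on an indented line such as
        # "[=>....]  recovery =  5.5% (...) finish=312.4min ...".
        match = re.search(r"(recovery|resync)\s*=\s*([\d.]+)%", line)
        if match and current:
            progress[current] = float(match.group(2))
    return progress


# Illustrative sample only (hypothetical device names and numbers).
SAMPLE = """\
Personalities : [raid1]
md1 : active raid1 sdd1[1] sdc1[0]
      1953382336 blocks super 1.2 [2/2] [UU]

md0 : active raid1 sdb2[2] sda2[0]
      2930133824 blocks super 1.2 [2/1] [U_]
      [=>...................]  recovery =  5.5% (161157360/2930133824) finish=312.4min speed=147700K/sec

unused devices: <none>
"""

if __name__ == "__main__":
    try:
        with open("/proc/mdstat") as f:
            text = f.read()
    except OSError:
        text = SAMPLE  # fall back to the sample when not on a Linux md system
    print(rebuilding_arrays(text) or "no rebuild in progress")
```

An empty result means no array reports a recovery or resync line, which is presumably the point at which the second grub-install attempt above succeeded.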

More upgrades

So back when I wrote this post, my system had 2 21″ 1080p monitors, and it had a pair of 500Gb drives and another pair of 1Tb drives. But I didn’t stand still.

Over time, I replaced one of those monitors with a 27″ WQHD IPS LED monitor. That’s a lot of letters, but the important thing is it is very big, and it has a lot of pixels, and it’s beautifully sharp. I had hoped when I got it that I’d be able to keep both of the 21″ monitors as well, because the motherboard has built-in HDMI and DisplayPort, but it appears that when you have an external graphics card the built-in graphics shut down. I may be wrong about that, but I never discovered a way to enable it.

A few weeks ago I decided to treat myself and bought a second video card (this time an nVidia GeForce GT 620) to go into the second PCIe slot. That took a bit of fiddling around until I discovered that I needed to enable “xinerama” on the nVidia Settings in order to get all three monitors so I could drag windows from one to the other and cut and paste from one to the other. Without that setting it was acting like I had the two original monitors that acted like they had before, and a third monitor that would have nothing to do with them. Interestingly enough, though, KDE’s keyboard settings will no longer work – I had to go into the xorg.conf file and manually add a setting to swap the control and caps lock key. It also doesn’t have the numlock key on by default any more, although I haven’t manually fixed that. There are some weird little graphic glitches, especially on the login screen.

I also replaced my lovely and clicky Unicomp keyboard, which I spilled Diet Coke on and was experiencing some odd behaviour, with a dasKeyboard Professional Model S. It has MX Blue switches, which means it has almost as good a feel as the Unicomp, but it’s not as noisy.

Over the intervening years, I’ve also upgraded the disks. I haven’t really needed more space (although I’m getting more profligate about keeping stuff I formerly would have deleted), but as disks get old I’ve replaced them with bigger ones. So I went from the 2x500Gb + 2x1Tb to

  • 2x1Tb + 2x2Tb
  • 2x2Tb + 2x2Tb

There might have been an intermediate step along the way I left out. In each case, I’ve used fdisk to put a single partition on each new disk (because although I could just add the raw device to a RAID, I’ve found that making a partition allows you to use it for booting later, and as disks age out I’ve made the second set of disks into the first one, etc.) Then I’ve made them into a RAID-1 using mdadm --create /dev/mdNNN --level=1 --raid-devices=2 /dev/sdc1 /dev/sdd1, then migrated everything off the old pair to the new pair using pvcreate /dev/mdNNN; vgextend lvm2 /dev/mdNNN; pvmove /dev/mdMMM; vgreduce lvm2 /dev/mdMMM. Then I’ve made sure grub knows about the new disks using grub-install /dev/sda, and usually I’m good to go.
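
The order of those commands matters, so here is a small dry-run sketch in Python that only builds and prints them. The commands and the volume group name lvm2 come from the paragraph above; the function name `migration_commands` and the array names /dev/md3 and /dev/md1 are hypothetical stand-ins for mdNNN and mdMMM.

```python
import shlex


def migration_commands(new_md, old_md, members, vg="lvm2"):
    """Build, but do not run, the commands for moving a volume group to a new RAID-1."""
    return [
        # Create the new mirror from the freshly partitioned disks.
        ["mdadm", "--create", new_md, "--level=1",
         f"--raid-devices={len(members)}", *members],
        # Make the new array an LVM physical volume and add it to the volume group.
        ["pvcreate", new_md],
        ["vgextend", vg, new_md],
        # Move every extent off the old array, then drop it from the volume group.
        ["pvmove", old_md],
        ["vgreduce", vg, old_md],
    ]


if __name__ == "__main__":
    # /dev/md3 and /dev/md1 are placeholder array names.
    plan = migration_commands("/dev/md3", "/dev/md1", ["/dev/sdc1", "/dev/sdd1"])
    for cmd in plan:
        print(shlex.join(cmd))
```

Nothing here touches a disk. Printing the plan first is just a cheap way to confirm which array is being filled and which one is being emptied before the hours-long pvmove starts.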

A couple of weeks ago, one of my 2Tb drives reported some SMART errors. Nothing bad enough to trigger an email report, but enough to turn an orange warning flag on munin (basically smartctl returns an error code of 192). I took the machine down and ran SEATOOLS and it found a couple of sector errors and offered to repair them. I repaired them, and everything was fine for a week or so and then the same thing happened. At that point I decided to buy new disks. But since 3Tb disks are now as cheap as 2Tb disks back when I bought them, I figure it’s time to upgrade. So I bought the 3Tb drives.
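
That 192 is a bitmask rather than a single error number. Below is a rough Python sketch of decoding it, with the bit meanings paraphrased from the smartctl man page as I understand it, so check your own version’s documentation. The names `SMARTCTL_EXIT_BITS` and `decode_smartctl_status` are made up for this example.

```python
# Bit meanings paraphrased from the smartctl(8) "exit status" description.
SMARTCTL_EXIT_BITS = {
    0: "command line did not parse",
    1: "device open failed",
    2: "a SMART command failed or a checksum error was found",
    3: "SMART status check returned DISK FAILING",
    4: "prefail attributes at or below threshold",
    5: "attributes were at or below threshold at some time in the past",
    6: "device error log contains records of errors",
    7: "device self-test log contains records of errors",
}


def decode_smartctl_status(status):
    """Return the descriptions of every bit set in a smartctl exit status."""
    return [SMARTCTL_EXIT_BITS[bit] for bit in range(8) if status & (1 << bit)]


if __name__ == "__main__":
    for reason in decode_smartctl_status(192):
        print(reason)
```

Since 192 is 64 + 128, bits 6 and 7 are set: errors recorded in the drive’s error log and in its self-test log, but no “disk failing” verdict. That fits a warning that wasn’t bad enough to trigger an email.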

At first when they came, I pulled out one pair of 2Tb drives and checked the model numbers. They agreed with the model numbers that SEATOOLS had reported for the bad drives, so I put the new 3Tb drives in their place, and connected the old ones up to the “other” SATA cables, the ones that don’t correspond to a drive carrier, where the disks just sit on the floor with the cables hanging out. I booted everything up, and went through the whole process – the pvmove is the worst part of it, because it takes a few hours. After it was done, I was looking at stuff and realized that the new 3Tb drives had only made a 2Tb RAID! Wish I’d noticed that before I’d done all the time consuming stuff. Turns out fdisk (or at least the old-style MBR partition table it was writing) doesn’t support 3Tb drives: MBR partitions top out at about 2Tb, and fdisk doesn’t give you any warning before it makes a 2Tb partition on your 3Tb drive. So I did the same commands in reverse to move all the content back off the new drives onto the old ones. Then I used GNU parted instead of fdisk to create gpt partitioned disks with one 3Tb partition each. Went through the hours of migration, and it wouldn’t boot. It would boot with the old disks hanging off the side, but not if I took them out. A bit of reading revealed that there was a problem with grub and gpt disks – you needed to create a first small partition for grub to install its image to, and then the second big partition to make the big RAID on. So off I went, migrating back, repartitioning, migrating forward. All in all, a lot of wasted hours spent on this.
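
The 2Tb ceiling is simple arithmetic, assuming fdisk was writing an MBR (DOS) partition table, which stores sector counts in 32-bit fields, and assuming 512-byte logical sectors. A quick Python sketch with illustrative names:

```python
SECTOR_BYTES = 512          # assumed logical sector size
MBR_MAX_SECTORS = 2 ** 32   # MBR keeps partition sizes in 32-bit sector counts


def mbr_max_bytes(sector_bytes=SECTOR_BYTES):
    """Largest partition size an MBR entry can describe, in bytes."""
    return MBR_MAX_SECTORS * sector_bytes


def usable_under_mbr(disk_bytes, sector_bytes=SECTOR_BYTES):
    """How much of a disk a single MBR partition can cover."""
    return min(disk_bytes, mbr_max_bytes(sector_bytes))


if __name__ == "__main__":
    three_tb = 3 * 10 ** 12  # a "3Tb" drive as marketed, in bytes
    print(mbr_max_bytes())                         # 2199023255552, i.e. 2 TiB
    print(three_tb - usable_under_mbr(three_tb))   # 800976744448 bytes left unused
```

So roughly 0.8Tb of each new drive was out of reach until the disks were relabelled with gpt.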

But after all that work, the damn thing still wouldn’t boot. I could plug the old disks back in and boot, but as far as I could tell, the raid that included those two disks wasn’t used for anything – it wasn’t in pvdisplay, and I could mdadm --stop it and the system would keep going. But if I unplugged it, it wouldn’t boot – it would show the Grub prompt screen, tell me that it was unable to read /dev/fd0 (which is odd, because I don’t have a floppy), say it was unable to read lvm/lvm2-boot, and then throw me to the grub-rescue prompt which is utterly useless. But as long as I got those two drives plugged in, it was booting so who was I to complain?

I asked on askubuntu.com, and didn’t get any response. I asked on the ubuntu forums, and got a response from a moderator who said “I don’t know much about RAID and lvm”, but who then proceeded to assert about 7 different things that were completely untrue about RAID and lvm. He also demanded that I run this tool, “boot-repair”, which would magically cure everything. Except for a couple of problems:

  • The documentation for the tool says that you can start it “from the command line”, but what they really mean is that you can start it “from the command line” if you’re running in a graphical environment. It doesn’t work if you’re away from home and sshed in. Minor, but annoying.
  • It wants to destroy your existing mdadm setup and replace it with dmraid. That’s a big nope.
  • It sends a lot of information about your system to a pastebin file, without giving you any option to edit or update some of that before it shares it with the world. Hey, look at that, it dumped out some disk sectors that have an email on it!
  • After gratuitously mucking about with my system, it didn’t actually fix anything.

Oh, and when you write to the authors of the tool, as the tool itself recommends you do if it didn’t fix your problem, you get an email that basically demands you donate some money to them before they’ll look at your pastebin file.

I tried asking on G+ as well, but the only advice I got there was a suggestion that I give up trying to boot from the gpt drive and install an SSD to boot from.

Anyway, in order to diagnose some more, I tried booting from the Kubuntu 13.10 install disk, and trying the “Try” option to get a liveCD environment working. With the “old” disks not installed, I was able to assemble two RAID-1s, one 3Tb and one 2Tb. So far so good. But then I noticed something that made my heart sink – pvdisplay was showing the 3Tb drives, but it was listing the other pv as “missing”. I suddenly realized why I wasn’t able to boot after taking out what I thought was the older failing pair of disks – because I’d taken out the newer, non-failing pair. Because mdadm and lvm successfully insulate you from worrying about getting disks in the right place and the right order, I had assumed that because the pair I was migrating away from was showing up as /dev/sde1 and /dev/sdf1, that they were the ones that I had outside the disk caddies sitting on the ground. But in actual fact, the ones sitting on the ground were actually /dev/sdb1 and /dev/sdd1. I was fooled because device letters don’t map 1:1 with specific SATA cables on the motherboard; if you put in extra drives they might end up between the ones you had before. With a growing mixture of trepidation and excitement, I checked the part numbers as well as the model numbers on the 4 2Tb disks, and confirmed my mistake. I put the newer 2Tb drives back in the caddies, and removed the older 2Tb drives, and everything booted correctly. And God alone knows how many steps back things would have worked if I’d bothered to check these part numbers earlier. I probably wouldn’t have had to inadvertently post the contents of an old email on pastebin, that’s for sure.
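
One way to avoid trusting drive letters is to look at the names udev puts in /dev/disk/by-id, which on most Linux systems embed the model and serial number and link to whichever /dev/sdX the drive landed on this boot. Here is a minimal Python sketch of that lookup; the function name `disks_by_id` is made up for this example, and the directory is a parameter so it can be pointed elsewhere.

```python
import os


def disks_by_id(by_id_dir="/dev/disk/by-id"):
    """Map each device node (e.g. /dev/sdb) to the by-id names that point at it."""
    mapping = {}
    for name in sorted(os.listdir(by_id_dir)):
        link = os.path.join(by_id_dir, name)
        if not os.path.islink(link):
            continue
        # Follow the symlink to the kernel's current name for this drive.
        device = os.path.realpath(link)
        mapping.setdefault(device, []).append(name)
    return mapping


if __name__ == "__main__":
    if os.path.isdir("/dev/disk/by-id"):
        for device, names in sorted(disks_by_id().items()):
            print(device, ", ".join(names))
```

Comparing the serial in those names with the label on the physical drive before pulling it is the check that would have caught the mix-up above.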

As of right now, /dev/sda and /dev/sdb are 3Tb drives, and /dev/sdc and /dev/sdd are 2Tb drives, and now that I have an extra Tb to play with, I have to figure out how to allot it. I currently have no way of knowing whether the system is booting from /dev/sda or from /dev/sdc, and I’m also not 100% sure that the 2Tb pair that I removed are the ones that had the problems in the first place. I think I’ve got a few extended SEATOOLS sessions ahead of me.

A progress meter/progress bar/progress breadcrumb

I was making a web site and there is a set of steps you have to go through. I thought it would be a good idea to have one of those bars across the top that shows you the steps you’ve done and what you have to do. You can see them when you’re checking out on Amazon, and when you’re ordering pizza on the Dominos site, among many others. The first problem I had was that nobody seems to know what to call these damn things, so it’s hard to google for them. If you google for “progress meter” or “progress bar”, you mostly find those ones like you see when you’re downloading files, where it says “20 minutes to go” and then 2 minutes later it says “10 minutes to go” and then 3 hours later it says “5 minutes to go”. If you google for “breadcrumb” you get the navigation type where you click on the items to navigate. So that was my first problem.

I eventually found one that I liked the look of, but it had one minor problem and one major problem. The minor problem was that you had to put in the number of elements at the top, which was a little tacky. The major problem was that when I shrank the screen down (i.e. when I displayed it on an iPhone) things got all wonky and the lines and dots didn’t line up correctly. So I was thinking maybe I could do it using a table with a border along the bottom, but while I was playing around with that option, I discovered this answer in StackOverflow that would allow the divs to act like tables.

So I threw the two ideas into a blender and came up with the following CSS:
div.progtrckr-table {
    width: 100%;
    display: table;
    table-layout: fixed;
    margin-bottom: 20px;
}
/* Each step is a table cell; the bottom border forms the line */
div.progtrckr-table div {
    display: table-cell;
    width: 2%;
    text-align: center;
    padding-bottom: 0.5em;
    vertical-align: bottom;
}
div.progtrckr-table div.progtrckr-done,
div.progtrckr-table div.progtrckr-doing {
    color: black;
    border-bottom: 4px solid yellowgreen;
}
div.progtrckr-table div.progtrckr-todo {
    color: silver;
    border-bottom: 4px solid silver;
}
/* The marker that sits on the line under each step */
div.progtrckr-table div:after {
    content: "\00a0\00a0";
    position: relative;
    float: left;
    left: 50%;
    bottom: -1.3em;
    line-height: 1.2em;
}
div.progtrckr-table div.progtrckr-done:after {
    content: "\2713";
    color: white;
    background-color: yellowgreen;
    height: 1.2em;
    width: 1.2em;
    border: none;
    border-radius: 1.2em;
}
div.progtrckr-table div.progtrckr-doing:after {
    content: "\039F";
    color: yellowgreen;
    background-color: white;
    font-size: 1.5em;
    bottom: -1.1em;
}
div.progtrckr-table div.progtrckr-todo:after {
    content: "\039F";
    color: silver;
    background-color: white;
    font-size: 1.5em;
    bottom: -1.1em;
}

You can use it just by making a div of divs, like so:
<div class="progtrckr-table">
<div class="progtrckr-done">Pickup/Delivery Options</div>
<div class="progtrckr-done">Check Availability</div>
<div class="progtrckr-doing">Reserve Boards</div>
<div class="progtrckr-todo">Checkout</div>
</div>
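
If the page is generated server-side, the same markup can be produced from a list of step names and the index of the current step, so nobody has to hand-edit the class names. This is only a sketch in Python: the class names come from the CSS above, while the function `progress_tracker` and its parameters are made up for illustration.

```python
from html import escape


def progress_tracker(steps, current):
    """Return the div-of-divs markup; steps before `current` are done, later ones todo."""
    lines = ['<div class="progtrckr-table">']
    for index, label in enumerate(steps):
        if index < current:
            state = "done"
        elif index == current:
            state = "doing"
        else:
            state = "todo"
        # escape() keeps characters like & and < in a label from breaking the markup.
        lines.append(f'<div class="progtrckr-{state}">{escape(label)}</div>')
    lines.append("</div>")
    return "\n".join(lines)


if __name__ == "__main__":
    steps = ["Pickup/Delivery Options", "Check Availability", "Reserve Boards", "Checkout"]
    print(progress_tracker(steps, current=2))
```

Called with those four steps and `current=2`, it prints the same five lines as the hand-written example.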

When I first tried it, I still had a bit of a problem where the dots weren’t lining up correctly on a small screen. That’s when I discovered that there was something in the inherited CSS that was changing the line-height. I added a line-height: normal; to the div.progtrckr-table block, and all was well. I’m pretty happy with the results.

If you want to play with it, click “Edit in JSFiddle” in the embed below:
[dciframe]http://jsfiddle.net/ptomblin/y26Xp/embedded/result/,100%,300,0,auto,border:1px solid blue;align:left;[/dciframe]

Epic rant company is epic!

A little while ago, an epic rant from the co-founder of Liberty Bottle Works went viral. You can see stories about it here and here. Now, I like a company that puts their employees’ family time above the “Me Generation”‘s whiny entitlement, and I thought I could use a new water bottle, so I ordered one.

It arrived a few days ago, and it’s a really nice bottle. But the valve that lets air into the bottle when you suck on it makes a funny little squeaking sound. So I wrote to the company, making it clear that I wasn’t complaining, just asking if they had any strategies to mitigate the noise. The response I got has a formatting change in the middle that makes me think this is a semi-canned response, but it’s still pretty awesome:

First off thank you for your support, it really means a lot to us.

I am sorry for the noise you are getting from the straw, here are a few tips to try and help that:
Regarding the cap. We have found that a few caps (when fresh off the line) may make a signature “Liberty” noise. There has been some debate as to which spirit animal this noise is indicative of. Some folks believe it to be the illusive “purr” of the Sasquatch, however for more practical reasons we believe it to be more closely related to the majestic dolphin (who knows what sound a Tuna makes). Albeit, as cute as it is, I agree that the call of the majestic dolphin should be reserved for the ocean as it can be a little disruptive in class, at the office, the gym, or during ones morning Yoga. FYI, Sasquatches are really into Yoga.

Although the most likely cause of this unique audible phenomenon is rare and unintended, it is however by design. The noise may be due to our precision molding along with the strategic placement of the umbrella “flutter” valve as well as the assembly process of our sport caps, each individually built by hand. On the underside of your cap, you’ll see a flat rubber upside down “umbrella” that is about 1cm wide. When using the bottle in “sippa” mode via the spout, air must go through the umbrella valve into the bottle to replace the volume of water leaving the otherwise sealed container. The passage of air through the umbrella valve in association with the inherent back pressure of fluid may cause the tightly placed (hand installed) umbrella valve to flutter and vibrate thus creating the majestic acoustic sounds of the dolphin.

> How can I make it stop?
So, what can be done? First I would try opening the cap and running warm water (please be careful) directly onto both sides of the umbrella valve. Then, while the rubber is softened, use your finger on the top of the opened cap to push the little rubber “bead” (that holds the umbrella valve in place) down through the cap. What we are trying to do is create just the slightest amount of space between the underside of the molded cap and the rubber umbrella valve. That should do the trick. Replace the cap and test to see if we’ve set the dolphin free.

In most of these cases the sound will also go away once the valve relaxes with normal use. However, if your dolphin is stubborn and refuses to swim off into the ocean blue, I’ll be happy to send you a replacement cap of your choice at no charge. After all, we stand behind or products 100%.

Thank you for your support and for the opportunity to provide you with a better product, made by friends right here in Washington State.

If there are any other questions I can answer please let me know!

Thanks,

Denise Fischer
Customer Service Manager

I must say, I love these guys. If you’re in the market for a metal water bottle, you could do a lot worse.

Technical obsolescence? Nah…

So a few years ago I was kind of despairing that all the new languages and frameworks came with utterly huge learning curves. It used to be easy to learn a new language – you’d sit down with the manual, read it over a weekend, and by Tuesday at the latest you’d be an expert. At least that’s how it was for me when a language manual was the size of K&R – 228 pages. These days a language manual is more like this one – 1632 pages, and that doesn’t even get into the web development frameworks, the IDEs, the debuggers, the various object-relational mappers, the continuous integration stuff, the virtual environments, etc etc etc. It’s very daunting. And because of that, I was beginning to think I was going to be stuck in a rut called “Java” until I retired. I believe I used the phrase “I’m too old for this shit” more than once.

But that’s changed now. In the last 3 or 4 years I’ve

  • Become moderately proficient at XSLT
  • Become really proficient at DOM manipulation and AJAX code using JavaScript and jQuery (and lately with CoffeeScript) – haven’t felt the need to learn much about object oriented prototype based stuff yet.
  • Learned HTML::Mason (the Perl templating framework) and made a website using it which makes extensive use of AJAX
  • Learned Python and Django and made a website using them which makes extensive use of AJAX
  • Used the Bootstrap Framework to make the front end of the Django website.
  • This week I started a side project where I’m writing a plugin for WordPress using PHP, another language I’ve never used before.

The difference is that these days I don’t let the enormity of the task get me down. Instead of trying to absorb the whole thing in one weekend, I go incrementally. I cargo-cult some code, write some more, google up the pieces I’m missing, and keep writing code. I don’t have to learn new IDEs because gvim still rocks, and I know how to use print statements to debug the way I’ve been doing for 30 odd years, so why get bogged down learning a million new things? Learn a few as you need them, and worry about the other stuff when you have time. Which, come to think of it, is pretty much how I became proficient at Unix, C, C++, Java, etc.

As an aside, I find it scary that StackOverflow’s SEO is so much better than everybody else’s that if you ask Google “set timezone in PHP”, the first couple of results will be StackOverflow, and you have to look further down the page to find the official documentation. Especially since the StackOverflow hits will all have been closed as duplicates of each other. Much as I love StackOverflow as a resource, it’s usually better to find the official references if you can.