Danube Waltz, Part 1 of ?

We went on another Viking River Cruise this year, the one advertised as the “Danube Waltz”. We loved the first one we did on the Rhine 5 years ago, and so we’ve been really looking forward to this one. Obviously COVID put a bit of a damper on this, but Viking’s COVID protocols looked very strong – daily testing, modified ventilation system on the ships, hand sanitizer stations everywhere, mask rules, etc. As well as the standard Danube Waltz, we signed up for the 4 day “Pre Extension” in Munich and Salzburg.

I should start with a disclaimer: after the last cruise turned into a complete muddle in my mind, leaving me unable to remember which day we saw which city, I vowed to keep proper notes and remember things. Spoiler alert: I didn’t. One of the couples we hung around with on the trip worked as a team, with her taking many photos with an SLR and him making notes just about every time she took photos. We weren’t that couple. I was doing most of my photography/video with a GoPro, and unlike the phone it doesn’t tell you where the photo was taken, and it’s not even 100% consistent about having the right date stamp on the file. So expect some vagueness about what happened when.

The first day was flying out of Rochester to Dulles to Munich. There’d been a huge muddle with our flights – the Viking website “My Viking Journey” continually showed the flight itinerary as “unavailable due to changes from the airline” for months and months at a time. Finally about 90 days before the departure I phoned Viking and they “finalized” the travel arrangements, only to change them again a week or so later. Once that happened, I started going to the United site to buy upgrades with our miles, but all they had was a thing where you could pay some dollars and some miles (I think it was over $500 each) and go on the wait list for the upgrades, and if the upgrades never arrived they’d supposedly refund the money and miles. (Spoiler alert: They haven’t). And then, with a week to go before the flight and still no sign of the upgrades we’d paid for and were waitlisted for, I noticed Economy Plus seats were available on the long leg. I decided flying across the Atlantic in economy is intolerable, so I spent another $250 each to get us Economy Plus. I still have no idea why they didn’t give them to us via this stupid wait list thing.

Our return trip was Budapest to Munich to Chicago to Rochester, and the first two legs were on Lufthansa. So I went to Lufthansa’s web site and bought business class upgrades on the first two legs. It was just a direct payment for upgrade, none of this “we’ll take your money and maybe refund it if you don’t get the upgrade” business. I’ve tried to purge how much that upgrade cost from my mind, because it wasn’t cheap.

I was a little worried because there was basically only an hour to change planes in Dulles. And even more worried when I got an email a day before departure saying there are forecasted thunderstorms on the eastern seaboard and to expect delays. And of course it happened exactly as I expected – our flight out of Rochester was delayed by over 90 minutes, and the departure of our second leg was not marked as delayed.

I got on the phone to United, and after explaining over and over what was happening to the rep with the almost indecipherable Indian accent, she said it was taken care of, we were booked on a Lufthansa flight leaving Dulles at 10:20 pm. So then I called Lufthansa to see if we could get any sort of upgrade on that flight, and they said they couldn’t because the United rep had booked us on both the Lufthansa flight and the United flight that they were code sharing with. So we were essentially double booked on the flight and it was screwing up their ability to assign us seats. I looked at the United App and as well as being double booked, we were still booked on the United flight we were going to miss. I called up United to get them to fix the screw up, and once again got a person with an Indian accent. This one was slightly more understandable, but she acted like she’d never before heard of the concept of missing a flight because your previous leg was delayed. Surely this must be the main thing they do all day? Anyway, after slowly and with many, many, many repeats, I got her to understand that no, I did not want her to cancel our flights from Rochester to Dulles, I did want her to cancel the flight that was leaving Dulles before we got in, I did want her to cancel our United booking on the Lufthansa flight and NOT cancel the Lufthansa flight. It was still screwed up on both the United App and the Lufthansa App (which I had downloaded by this point because I wasn’t trusting anything a rep told me).

When we got to Dulles, the flight we had been supposed to leave on was still on the departure boards showing something like “BLOCKED”, which makes me wonder if we could have made it, but when we got to the Lufthansa desk we found we were booked on the 10:20 flight, and we were way in the back in a middle section in economy. No upgrades were available because they were very full. So basically I had 8 hours of extreme discomfort to start the trip. Thank god for my special seat pillow. I actually think I napped a bit.

We arrived in Munich at our hotel about 24 hours after we’d left home. I’d probably had 2 hours sleep during that time, and I don’t think Vicki had any. I’m not sure if we ever got properly acclimatized to European time after that start, but because my normal sleep schedule is so fucked up by my pain it’s hard to tell.

The hotel was gorgeous, although it was very modern and in the two nights we had there I don’t think we completely figured out the weird light switches and weird shower. I don’t recall if we had time for a nap, but we did meet our Viking tour guide and the other couples on the “pre-extension”. We also nipped out for a bit of a walk around, and found an ATM to get out some Euros. We also had dinner in one of the hotel restaurants. It wasn’t a particularly beautiful part of Munich and it was a bit drizzly so it wasn’t a great intro, but we were here in Europe. So far, so good.

A brief note on COVID precautions at this point. We did a PCR test a few days before we left. I seem to recall the Lufthansa gate agent needed to see our vaccination certificates but not our test results, but the German customs wanted both, or maybe it was the other way around. We had to wear masks on the flights, in the Munich airport, and on the bus ride to the hotel. I think some of the hotel staff weren’t wearing them, but we did except when we were in our own room, walking outside or sitting eating. Viking gave us a spit tube each to do a test the first morning after we arrived. I’ll talk about the protocols and precautions on subsequent days as I talk about those subsequent days.

A weird thought

I had a weird thought the other night. There are a couple of programming tasks on my massive “to do” list that I figured I’d power through in the first few months of retirement before I started spending hours and hours alternating between training in my kayak and touring on bikes with my wife.

Well, life doesn’t always work out the way you intended, and none of my todo items has been checked off. The pain that last year made it uncomfortable to sit for too long, and which was just on the “barely tolerable” end of things by the end of a normal length kayak race, has now progressed to absolutely intolerable for even short stints in a desk chair or kayak. I’ve spent about 15 minutes total all winter on my erg, and haven’t even put my kayaks in the water. Normally by this time of the year I’d have 30 or 40 hours on the erg and about the same on the water. And I limit my sitting at my desk chair to short periods to deal with bills and paying taxes and the like. Even the library easy chairs are uncomfortable verging on painful these days.

But back to my weird thought. I have a new iPad. I can’t afford a new laptop. So I was thinking that for those programming tasks, what I might try is to install “code-server”, which is a hosted version of Visual Studio Code, on my Linux server. This gives you the full power of a pretty extensive IDE available through an iPad’s web browser. I could try coding up one of those projects using that, maybe using the git integration to push the app to a free Heroku instance for testing and debugging. I wonder if that’s doable?

Well, in order to find out, first I’d have to install code-server and make it available through my web server. Oh, what’s this, it appears you need to use nginx as your web server rather than Apache to do that. Well, no problem, I’ve been intending to make that switch to make it easier to use LetsEncrypt to put everything behind https like I should have years ago. Oh wait, one of my sites uses Perl fastcgi. Looks like there are some extra hoops you have to jump through to configure that. And I’d also have to convert all my .htaccess files into clauses in the nginx configuration files.

Sigh, this is going to be a full-on yak shaving exercise, isn’t it? I just wish the pain killers I take to be able to sleep at night didn’t leave me dizzy and disoriented all day, or that they actually killed the pain instead of just knocking me out.

Well that ain’t good

Further to Scraping a third party site:

Woke up this morning to an email from Linode saying my Linode server has been running at 200% CPU. Logged in, and sure enough CPU is high, and there are a bazillion “firefox-esr” processes running. Did a killall on them, and the CPU immediately dropped to reasonable numbers. There was still a “geckodriver” process running, and when I killed that it went into zombie state instead of going away. I had to do an /etc/init.d/apache2 reload to make that one go away.

Did some tests, both from the nightly “scrape the whole damn site” cron job and the web site’s “scrape one flight plan when the user clicks on that flight plan” and I’m not currently seeing any orphaned firefox-esr or geckodriver processes. So it appears that when I scrape, I correctly close the Webdriver connection which correctly stops the firefox and geckodriver processes.
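For anyone wondering what “correctly close” could look like in practice, here is a minimal sketch of one way to guarantee it, not necessarily how my scraper does it. The `managed_driver` helper and its `factory` argument are names made up for this example. The only Selenium call it relies on is `driver.quit()`, which ends the session and shuts down the driver process.

```python
from contextlib import contextmanager


@contextmanager
def managed_driver(factory):
    """Create a driver with factory() and always call quit() on it.

    'factory' is a hypothetical zero-argument callable that returns a
    Selenium WebDriver (or anything else with a quit() method).
    """
    driver = factory()
    try:
        yield driver
    finally:
        # Runs even if the scrape raises, so no browser is left behind
        # by an exception half way through a page.
        driver.quit()


# Intended use (needs selenium and a browser, so it is only a comment here):
#
#   from selenium import webdriver
#   with managed_driver(webdriver.Firefox) as driver:
#       driver.get("https://example.com/")
```

The point of the `finally` is that an exception in the middle of a scrape still shuts the browser down. It would not help if the whole Python process were killed from outside.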

So I guess I need to keep an eye on this and see if I can figure out what the client is doing to make the website fail to close the Webdriver connection. Or maybe I left some turdlets around on the system when I was doing testing? I don’t know.
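One way to keep that eye on it would be a small script run from cron that looks for browser processes that have been alive suspiciously long. This is only a sketch under some assumptions: it expects the Linux procps `ps` with its `etimes` (elapsed seconds) column, the one hour threshold in the usage note is a guess, and the function names are invented for this example.

```python
import subprocess

# Process names taken from what showed up on the server.
WATCHED = ("firefox-esr", "geckodriver")


def find_stale(ps_output, max_age_seconds, watched=WATCHED):
    """Return (pid, age_seconds, name) for watched processes older than the limit.

    ps_output is text in the shape printed by 'ps -eo pid,etimes,comm'.
    """
    stale = []
    for line in ps_output.splitlines():
        fields = line.split(None, 2)
        if len(fields) != 3:
            continue
        pid, age, name = fields
        if not (pid.isdigit() and age.isdigit()):
            continue  # skips the header line
        name = name.strip()
        if name in watched and int(age) > max_age_seconds:
            stale.append((int(pid), int(age), name))
    return stale


def current_ps_output():
    """Run ps and return its output (assumes procps ps on Linux)."""
    result = subprocess.run(
        ["ps", "-eo", "pid,etimes,comm"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Calling `find_stale(current_ps_output(), 3600)` from a cron job and mailing the result when it is non-empty would at least show when the orphans start piling up, which might narrow down what the client was doing at the time.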

Scraping a third party site

For my sins, I wrote a website for a friend’s company that relies on scraping information off another company’s website. The company I’m doing this for does have a paid account on the third party’s website so there’s nothing ethically dubious going on here – I’m basically taking off information my clients had put into the third party site.

I couldn’t figure out the third party site’s authentication system, so instead of pulling in a page and parsing it using BeautifulSoup, I use Selenium to attach to it like a web browser.

The third party site, however, is utterly terribly written. It’s full of tables nested within tables, and missing closing tags and everything else that reminds you of the old “FrontPage” designed sites that only worked on IE. They don’t consistently use ids or names or anything else to help me find the right bits of data, so I’ve had to wing it and parse things out using regular expressions all over the place. But worse is that every now and then they change things around a bit in a way that breaks my scraping.
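To give a flavour of the regular expression approach, here is a sketch that pulls “label: value” pairs out of table cells even when closing tags are missing. The HTML shape, the field names and the `extract_fields` function are all invented for illustration and are not the third party’s real markup or my real code.

```python
import re

# Matches a cell containing "Label:" followed by the next cell's text.
# The closing </td> after the label is optional, because the kind of
# markup this is aimed at often leaves it out.
CELL_PAIR = re.compile(
    r"<td[^>]*>\s*(?P<label>[^<:]+?)\s*:\s*(?:</td>)?\s*"
    r"<td[^>]*>\s*(?P<value>[^<]*?)\s*(?=<|$)",
    re.IGNORECASE,
)


def extract_fields(html):
    """Return {label: value} for every label/value cell pair found."""
    return {m.group("label"): m.group("value") for m in CELL_PAIR.finditer(html)}


# Made-up sample: nested tables, mixed case tags, missing closing tags.
SAMPLE = (
    "<table><tr><td><table>"
    "<tr><td class=x>Departure: <td>KROC</td></tr>"
    "<tr><TD>Route:</TD><TD> V2 SYR </TD>"
    "</table></table>"
)

fields = extract_fields(SAMPLE)
```

A pattern like this is brittle in exactly the way described above: if the site moves the colon or wraps the value in another tag, it silently stops matching, so it is worth checking that the expected labels actually came back.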

In the past I scraped using the Selenium “standalone” jar, attaching to a Java process that pretends to be a web browser without actually being a browser. That matters because I run the scraping process on my web server, which is a headless Linode and, like most web servers, doesn’t even have X11 running on it. (Some components of X11 got installed on it a while back because something needed something that needed something that needed fonts, and voila, suddenly I’ve got bits of X11 installed.)

This method has worked great for several years – when I’m debugging at home I use the version of the Selenium webdriver that fires up a Chrome or a Firefox instance and scrapes, but then when it’s working fine I switch over to the version that connects to a java -jar selenium-standalone.jar process. I don’t know what the official term is, so I’m just going to call it “the selenium process”.

A couple of years ago they (the third party) made a change to their website that caused the selenium process to die with JavaScript errors. Like I said, their website is crap, and I shouldn’t be surprised it has crappy JavaScript. Fortunately at the same time they introduced these JavaScript errors, they put a big “submit” button on the page that would go to the selected page even if you disabled JavaScript, and so that’s what I did with my scraper back then.

Flash forward to now, and they’ve changed the site again. They still have the broken JavaScript that freaks out the selenium process, but now you can’t navigate the site if you turn off JavaScript. So I tried turning on JavaScript in my scraper and the selenium process, and as expected from 2 years ago it failed spectacularly. So I tried updating the selenium process jar, and it doesn’t even connect at all – even though my Python selenium API version number and selenium process jar version number are the same (3.141.59, I had been using selenium jar version 3.12.0 before). I did some googling and found the names of the arguments had changed a bit, so I changed that and I still couldn’t get anything working.

I tried a bunch of different ideas, and followed a bazillion web links and tried a bunch of things from those places. Nothing worked. Eventually I had to give up and install Firefox on my web server, and an optional piece of the selenium api called “geckodriver” that launches Firefox. Fortunately selenium knows how to launch Firefox in a headless manner (although installing it did drag in even more bits of X11 that I don’t actually want or need). That actually worked on the site, after I figured out how to put the geckodriver file somewhere on the path and get the geckodriver.log file put somewhere useful. But I’ve got it done for now. Until the next gratuitous change.
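For the record, the headless Firefox setup can be sketched roughly as below. It assumes the Selenium 3.x Python API (the 3.141.59 mentioned above), where `webdriver.Firefox` accepts `options` and `service_log_path`; newer Selenium versions moved the log setting elsewhere. The function names and the example directories are hypothetical.

```python
import os


def env_with_driver_dir(driver_dir, env=None):
    """Return a copy of env with driver_dir at the front of PATH.

    Selenium looks for the geckodriver executable on PATH by default.
    """
    env = dict(os.environ if env is None else env)
    parts = [p for p in env.get("PATH", "").split(os.pathsep) if p]
    if driver_dir not in parts:
        parts.insert(0, driver_dir)
    env["PATH"] = os.pathsep.join(parts)
    return env


def start_headless_firefox(driver_dir, log_path):
    """Launch Firefox without a display (Selenium 3.x keyword arguments)."""
    # Imported here so the helper above works without selenium installed.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    os.environ.update(env_with_driver_dir(driver_dir))
    options = Options()
    options.add_argument("-headless")  # Firefox's own headless switch
    # service_log_path decides where geckodriver.log gets written.
    return webdriver.Firefox(options=options, service_log_path=log_path)
```

Something like `start_headless_firefox("/opt/geckodriver", "/var/log/scraper/geckodriver.log")` would be the call, with both paths being placeholders for wherever the driver and log actually live.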