Backing up

Unlike some minor internet celebrities, like one of the founders of Stack Overflow, I am a big believer in making sure my data is backed up locally (for quick retrieval) and offsite in case of disaster. For the Mac, this is dead easy – I have a Time Machine disk attached to my iMac, and I back it up offsite using Backblaze Cloud Backup – $50 a year, and it’s unlimited, so it even backs up my external drive full of my iMovie and Final Cut Pro projects. I also use Syncthing to replicate several important directories between my iMac and my Linux computer. Unlike Dropbox or Box or any of the other cloud replication services, Syncthing is free and doesn’t send your data to any third party.

But I also want the “two backups” system for the Linux computer here in my office, and also for my virtual private server box that I have on Linode. And unfortunately Backblaze doesn’t support Linux yet. So for several years now, I’ve been doing an hourly backup of my local Linux computer to an external USB drive. My strategy for that is:

  • Make snapshots of every partition on the box using
    lvcreate -L 10G -s -n ${dir}-snapshot /dev/ssd/${dir}
    (I wish there were a way to snapshot them all at once to get a really consistent view of the file system, but I haven’t found one.)
  • Mount them all under a root snapshot so I’ve basically got a static view of the system.
  • Back that up to the external drive using rsync. I use “--link-dest” so it really only backs up the files that have changed and makes hard links to the ones that haven’t, so it’s pretty fast and doesn’t take a lot of disk space to keep weeks of backups.
    rsync -vaSH --delete --exclude .gvfs /mnt/snapshot/ --link-dest=$PREV/ $DEST/allhats2/hour$HOUR/
  • Unmount and lvremove the snapshot file systems.

As well, once a day I back up one of these snapshots to a third Linux system I haven’t mentioned yet. A while ago I discovered a company called Kimsufi that was advertising VPSes with Atom processors in Ireland or France (I’m actually not 100% sure which). Nothing too fancy in the CPU or RAM department, but with 1TB of disk space for 10 Euros a month. Not as cheap as Backblaze, but better than losing everything if Linode burns down. So once a day, I copy one of my hourly backups there (still using rsync with the --link-dest option, so multiple daily backups don’t take up much space). I also back up my Linode the same way, except there’s no way to take an LVM snapshot inside the VPS, so it’s just a simple rsync --link-dest of everything.

That’s been kicking along nicely, and it’s a good backup. But it’s brittle – if an hourly backup fails part way through and I try to do a --link-dest the next hour against that one, I can end up with two full copies of the backup with no hard links. And there are other problems as well. Plus keeping the Kimsufi VPS just to back up to seems like a waste.

A while ago, a friend recommended Borg Backup as a better solution for regular backups than trying to roll my own. It also encrypts and compresses the backups. So I set up a parallel hourly and daily backup using Borg. The offsite one still goes to the Kimsufi VPS, but it’s faster than rsync. So I was just about to pull the plug on the rsync backups and go to an all-Borg solution when that same friend recommended Restic. Restic resembles Borg in a lot of ways, but it’s not as well documented. One thing it does have, though, is the ability to back up to Amazon S3 and Backblaze B2 cloud storage. Doing some quick calculations with the B2 calculator, I think I can do my backups way cheaper this way than by maintaining a VPS in Ireland – on the order of a dollar or two a month, rather than 10 Euros. The only problem is getting it to work.
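
In case it’s useful, the Borg version of the hourly job boils down to something like this (the repo location and passphrase handling are placeholders, not my actual configuration):

```shell
# Hypothetical Borg hourly backup; repo path and passphrase are placeholders.
export BORG_REPO=/mnt/usbdrive/borg-repo
export BORG_PASSPHRASE=changeme

# One archive per run, named by hostname and timestamp
borg create --compression lz4 --stats \
    ::'{hostname}-{now:%Y-%m-%d_%H:%M}' /mnt/snapshot

# Apply the retention rules on every run -- with Borg this is fast
borg prune --keep-hourly 24 --keep-daily 8 \
    --keep-weekly 4 --keep-monthly 12
```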

And here’s the problem. Doing hourly restic backups to my local machine is pretty quick – about the same time as borg takes. The only problem comes when I want to delete yesterday’s hourly backups (or rather, apply the rules to keep 24 hourlies, 8 dailies, 4 weeklies, and 12 monthlies). With borg, I apply these rules every hour when I back up, because they’re quite fast. But they’re so damn slow with restic that I had to start doing it only once per day – it takes just a few seconds to run the “forget” command to apply the rules and have it forget the snapshots it doesn’t want, but the “prune” command that actually reclaims the space takes over 30 minutes! And that’s on a local file system. I’m afraid to see how long it’s going to take on the B2 file system, and whether it bumps up the storage costs terribly.
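
For reference, the same retention policy in restic terms looks like this (repo path and password are placeholders) – “forget” is the fast part, “prune” is the slow part:

```shell
# Hypothetical restic retention run; repo path and password are placeholders.
export RESTIC_REPOSITORY=/mnt/usbdrive/restic-repo
export RESTIC_PASSWORD=changeme

# Fast: just mark snapshots outside the policy as forgotten
restic forget --keep-hourly 24 --keep-daily 8 \
    --keep-weekly 4 --keep-monthly 12

# Slow: actually reclaim the space -- this is the 30-minute step
restic prune
```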

I also had a bit of a teething problem with my first backup to B2 – I was doing it over a network link, and I forgot to do it in a tmux session, so when the network connection got interrupted, it left my first backup in a strange state. I did another full backup the next day, and then I was using twice the required storage on the B2 bucket. I did a restic prune which reclaimed the space, but it took 35 hours to do it. That’s not going to be useful. I need to do a couple of non-failing B2 backups and see how long prune takes in those cases – but if it’s going to take hours, I’m going to ditch restic and go back to borg to the VPS.
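
For the record, pointing restic at B2 looks like this (the bucket name and credential values are placeholders) – and this time I’m running it inside tmux so a dropped connection doesn’t leave the repo in a strange state:

```shell
# Hypothetical restic-to-B2 backup; bucket name and credentials are placeholders.
export B2_ACCOUNT_ID=your-account-id
export B2_ACCOUNT_KEY=your-account-key
export RESTIC_PASSWORD=changeme

restic -r b2:my-backup-bucket:/hourly init      # first time only
restic -r b2:my-backup-bucket:/hourly backup /mnt/snapshot
```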

Oh, another thing I should mention about restic: it puts a ton of files in ~/.cache. Since I was backing up from the root account, I ended up having to resize my root partition from 4GB to 14GB just to accommodate all the cache files. Very annoying. Borg’s cache is 259MB; restic’s is 7.4GB.
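
One mitigation I haven’t fully tested: restic accepts a --cache-dir flag, so the cache can at least be pushed off the root partition (the target path here is arbitrary):

```shell
# Keep restic's cache somewhere roomier than root's home directory
restic -r /mnt/usbdrive/restic-repo backup \
    --cache-dir /var/cache/restic /mnt/snapshot
```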

AWS Training

So over the Thanksgiving week, Udemy had a sale on video courses. Since my job search is going so slowly, I thought I’d maybe occupy the time by doing some of these courses, and I decided to start with the Amazon Web Services (AWS) certification training, starting with the AWS Certified Developer – Associate. Here are some impressions I’ve had after watching 10+ hours of course video:

  • The AWS offerings change so fast that it’s really easy for the course to fall behind reality. That might be one reason why they were selling the courses so cheap.
  • AWS itself is very inconsistent in the way the UI for each individual web service is structured in the console. Some of them are very “here’s a list of options, pick one, configure it and launch it” and others are “here’s a wizard to walk you through a bunch of steps that hopefully lead to something you can launch”. It’s hard to describe exactly what I mean by that. That’s probably a result of how fast things are changing. Unfortunately, sometimes the course module was made when the console worked the first way, but now it works the second way, and you basically have to watch half the video, then try to remember everything he did on the first screen so you can do it on the Nth screen instead.
  • A couple of times, things in the AWS console have just failed. Sometimes it’s failed silently with no indication of why or even that it saw the “Create [blah]” button press. Other times it’s given very cryptic messages like “You have requested more instances (1) than your current instance limit of 0 allows”. (In that case, changing an obscure parameter from t2.small to t1.micro was the solution). The silent failure happened mostly in the Cloud Formation module when I was attempting to create a “Stack”, but after it appeared to fail silently (and nothing was shown in the list of stacks), and I tried to create it again, it complained that there was already a stack of that name and suddenly it’s there in the list of stacks again.
  • Other than the videos being out of date in some places, my main complaint is that he is obviously reusing modules between the AWS Solutions Architect – Associate course and the AWS Developer – Associate course, so he’ll say “for this module, just use the S3 bucket named [blah] that you created in the last module” when you didn’t create anything of the sort. So then you have to hurriedly pause the video and create an S3 bucket, and hope he didn’t expect any special permissions on it or any content in it.
  • A secondary complaint about that is that he never tells you when you’re done with a resource. I try to clean up all the S3 buckets and EC2 instances and whatever when it appears we’re done with them. I occasionally guess wrong. I wish at the end of a module he’d say “OK, we’re done with that bucket, feel free to delete it.” Sometimes he does, but mostly he doesn’t. I wonder if that’s an artifact of the fact that he’s mixing and matching modules? I’m probably over-paranoid about leaving stuff around and getting charged for it, although when I started doing this course I discovered that a few years back I’d created an EC2 instance, stopped it, but never terminated it, so I guess their “free threshold” is high enough that I’m unlikely to hit it.

Some random thoughts on naming conventions

Something recently made me think about product naming conventions. It seems to me you can start off with a really nifty naming convention, but after a while, it gets so cluttered with exceptions and new products that it doesn’t work anymore, and then you have to throw out the whole thing and start again.

Take, for instance, Epic Kayaks. Now I’m not 100% sure of the history, but I believe their first surf ski was the V10, and their second was the V10 Sport. Calling it “Sport” didn’t make a ton of sense because the V10 Sport is actually a less capable surf ski, but wider and more stable, to appeal to a less elite audience. To me, “Sport” usually implies a faster or more capable model, like the “Sport” model of many cars that maybe has more horsepower and grippier tires, or maybe just go-faster stripes and a manual gearbox. They also have a V10L, which was at the time just a low-volume version of the V10. I believe they’ve redesigned it since then to be more of its own boat specifically for lighter paddlers.

But since that time, they’ve added the V12 and V14, each of which is narrower and less stable (and faster) than the previous, and then the V8, V7, and V5, which are increasingly more stable and slower as the number decreases. Then they made a boat that was sort of intermediate between the V8 and the V10 Sport (which was already intermediate between the V8 and the V10) and found themselves naming it the V8 Pro. Not as bad a decision as the use of “Sport” in the V10 Sport, because it does imply something faster than the V8, and it is. But still an obvious shoe-horn into a naming convention that was already under stress.

Then this year they demoed a boat that had the same width as a V12 but which was shorter (shorter even than the V10 Sport) to handle short period waves. When they were demoing it, they were calling it the V12M. And that wasn’t a horrible name because really I think it was designed to be “like a V12, but only for specific conditions”. But then they announced it officially as the V11. That to me implies something faster than the V10 and slower than the V12, and it probably actually is.

But I think their number system is getting crowded. It mostly works that the higher the number, the narrower, longer and faster the boat is. But there are exceptions. The space between the V8 and the V10 has two boats, neither of which is called the V9. There are three boats that are called “V10” (ignoring the V10 Double for a second), with pretty different characteristics. People confuse the V10 Sport and V10 a lot. There aren’t that many V10Ls around here, so I don’t know if they get confused for V10s a lot.

Epic is going to continue to design new boats. Some of them are going to be brought to market. I think sooner rather than later they’re going to have to throw out the whole “V number” system, and either just bring in new boats with a different designation or maybe even redesignate the whole fleet.

Naming conventions are tricky. I like that a person can broadly tell whether an Epic boat is more elite or less elite just by the name. I can’t tell anything about, say, the Fenn boats because they use proper nouns instead of numbers. But on the other hand, as long as Fenn designers can think up names, they’re never going to have this problem.

At least they aren’t doing stuff like the computer hardware world, where you get horrendous long names with numbers and letters in riotous abandon. I’ve got an HP OfficeJet 6700 Premium printer. That name doesn’t tell me anything about its capabilities or how it stacks up against the OfficeJet J6000 or the OfficeJet L7000 or anything else in the HP printer line.

I’m reminded of the software world. Basically, most software uses monotonically increasing version numbers, usually with a minor and maybe a semi-minor version number as well, and you know that a change in major number probably means something significant and a change in semi-minor is probably invisible. So macOS 10.12.6 is obviously newer than macOS 10.12.5 and possibly just fixes some bugs, but it probably has some feature changes from macOS 10.11.1.

Windows started off with monotonically increasing numbers (Windows 1, 2, 3.11) and then switched to the last two digits of the year (being the only people I know stupid enough to set themselves up for a Y2K problem with only 5 years left to go) with Windows 95 and 98, broke the convention with Windows 98SE and Windows ME, then looked like they were going back to it with Windows 2000. But then they switched to names that meant nothing (XP) and then back to numbers for Windows 7 and 8, but due to problems caused by lazy programmers in the 95 and 98 days, had to skip Windows 9 and go directly to Windows 10. Ugh, what a mess!

One piece of software I used way back in the day was a dBase III compiler called “Clipper”. I used to love the fact that their naming convention was actually the season and year of release, so Winter ’84 was followed by Summer ’85, etc. Good, because it was easy to tell if the version you found on the shelf was newer than the one you were using. But people evidently didn’t like it, because for their 6th release, they switched to calling it “Clipper 5.00” (yes, it was the 6th release – I guess that means they started from 0) and then “5.01”, then “5.01 Rev 129” because who needs consistency? Although looking at Wikipedia, it’s possible that people didn’t like the seasonal names because they lied a lot. “Summer ’87” was released on 21 December 1987.

So I guess what I’m saying is I’m glad I don’t have to name stuff, because my OCD would want the names to tell you something, but I’d also want to leave room for fill-in products without breaking the convention, and at the same time be memorable and not confusing.

Give it a REST

As you might know, I’m currently looking for a job. And one thing you see in job ads is a requirement for experience with “REST APIs” or “RESTful services”. And as far as I can tell, it’s nothing more than a naming convention for your basic CRUD (Create, Read, Update, Delete) web services. If you write four URL handlers for the URLs “/item/create”, “/item/{item id}/read”, “/item/{item id}/update” and “/item/{item id}/delete”, then you’re a filthy normie and unemployable; but if instead you make one URL handler for “/item/{item id}” and do the read, update and delete based on the request method being “GET”, “PUT”, or “DELETE” respectively (creation being done with a POST to the URL “/items”), then you’re a “RESTful” guru and will be showered in money.

Can we just agree that, it being a naming convention, it takes approximately 5 minutes to train somebody how to do this? If my former employer would give me back my login for an hour or so, I could go back and change all my AJAX calls to fit this naming convention and join the ranks of the REST API experienced.
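
For what it’s worth, the entire convention fits in four curl calls against a hypothetical server (the host and item id here are made up):

```shell
# The four CRUD operations, expressed RESTfully. Host and id are made up.
curl -X POST   -d '{"name":"widget"}' http://api.example.com/items     # create
curl -X GET    http://api.example.com/item/42                          # read
curl -X PUT    -d '{"name":"gadget"}' http://api.example.com/item/42   # update
curl -X DELETE http://api.example.com/item/42                          # delete
```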

Thanks, Final Cut Pro

So my new computer has a 2TB “hybrid” drive instead of the 512GB SSD I had in my laptop, so I thought I’d see if doing my video editing on the main drive instead of an external drive would be faster. The last video I did, from last weekend’s Electric City race, worked fine, although I didn’t really see any speed improvements. So yesterday when I went to start a new project the first thing I did was move the Electric City event/project from the Final Cut Pro (FCP) library on the main drive to the one on the external drive, and then start importing clips and editing on the new project. I did some editing and left it in the “transcoding and analysis” state overnight – editing is a lot smoother if you let it just finish those “background” tasks overnight, I’ve found.

But I wake up this morning to dire warnings about how I’ve run out of room on my main drive! So I did a “du” in the Movies folder and discovered that when I told Final Cut Pro to move the project, it did, but it left a copy of the full project, including all the transcoded “optimized” files, in ~/Movies/FCP_Library.fcpbundle/__Trash/Electric\ City\ 2017-9B3Flz/. There doesn’t appear to be a menu item to empty that pseudo-trash, so I just did an rm -rf on it, and now I’m down to 70% used.

After I did that, I discovered that Final Cut Pro automatically empties __Trash when it exits, but it seems to me that cleaning up your old projects is a natural thing to do when starting a new one, so that’s just bad UX. Especially since when you tell it to move a project, it reports success immediately but actually does the move in the background tasks queue. So the move was still going on while I was importing my clips last night – and if I’d moved the project and then exited FCP, it would have had no trash to empty, because the move wouldn’t have finished.