How did Google find that?

Google has a blog post showing how they set up some fake search results, and then a short time later Bing started returning the same fake results, and therefore they suspect IE8’s “Suggested Sites” and/or Bing’s “Customer Experience Improvement Program” is spying on what you click and sending the results off to Microsoft.

But before Google gets all high and mighty, I want to tell you about what happened to me. I did some documentation for a customer I was doing some work for. I did it in the form of a TiddlyWiki and stuck it up on a brand new, never used before subdomain of my main domain. Well, she hated it and asked that I do it as a Word document instead, which I did. But I forgot to take the wiki down. No problem, I thought, after all nothing links to it or mentions it in any public place, so how would a crawler find it?

Imagine my surprise when the customer called me up some time later saying that this old version of the documentation, in a subdirectory on an unlinked site, was showing up in Google searches for her product’s name. How did that happen? Using the advanced search, I couldn’t find anything that linked to it. There was one mention of that domain in a forum post, but in that case I was using the :8080 port because I was referring to the Tomcat server that was also running on that domain.

So as I see it, the choices are:

  • Google saw the mention of the domain in the middle of a forum post, recognized it as a URL (it wasn’t a link), stripped out the :8080 and crawled the site.
  • They saw me mention the URL in a link I sent in a Gmail message to the customer and used that as an excuse to crawl the site.
  • IE reported the link to Bing when the customer clicked on it, and then Google stole it from Bing somehow.
  • Chrome reported the link to Google when I clicked on it.

Whichever it was, they’re crawling things that aren’t public links. Methinks Google doth protest too much.
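To make the first guess concrete, here is a minimal sketch of how a crawler could turn a plain-text mention like the one in my forum post into a site root to crawl. I have no idea whether Google actually works this way. The regular expression, the `crawl_candidates` function and the `docs.example.com` host are all made up for illustration.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Matches bare host[:port][/path] mentions as well as full http(s) URLs.
# This pattern is an illustrative assumption, not any real crawler's rule.
URL_MENTION = re.compile(
    r"(?:https?://)?(?:[a-z0-9-]+\.)+[a-z]{2,}(?::\d+)?(?:/\S*)?",
    re.IGNORECASE,
)


def crawl_candidates(text):
    """Return site-root URLs a crawler might derive from plain-text mentions."""
    candidates = []
    for mention in URL_MENTION.findall(text):
        if "://" not in mention:
            mention = "http://" + mention
        parts = urlsplit(mention)
        # Drop the port and the path, keeping only the bare hostname.
        root = urlunsplit((parts.scheme, parts.hostname, "/", "", ""))
        if root not in candidates:
            candidates.append(root)
    return candidates


# Hypothetical forum post: the URL is plain text, not a link.
post = "The Tomcat server is running at docs.example.com:8080/manager if you want to look."
print(crawl_candidates(post))  # ['http://docs.example.com/']
```

If something like this is going on, an unlinked mention with a port number is enough to expose everything else on the same hostname.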

Back to the Mac. (Sfx: sigh of relief)

My laptop has been in the shop because it couldn’t connect to wireless networks with any sort of consistency. It wasn’t preventing me from doing my work, but considering I have surgery next week and my AppleCare expires in two months, I figured now was the time. So for a week now, I’ve been using Linux as my desktop. I’m extremely glad to be back to the Mac.

But not really because of anything wrong with Linux. As soon as I started using the desktop on the Linux box, it told me I should upgrade from Ubuntu 8.04 LTS to Ubuntu 10.04 LTS, which was time-consuming, but afterwards a persistent problem I’d had booting any kernel newer than the one I’d installed it with went away, and it recognized my Wacom Bamboo, which it hadn’t before. I had to struggle a bit to get my VPN set up, and it was a struggle to get it to treat the “Caps Lock” key as a control key. And because my Linux box is a server, I’d originally set it up with XFCE4 instead of KDE or Gnome, so it wasn’t as functional and beautiful as it could have been.

No, the real reasons I’m glad to be back on the Mac are:

  • The Linux box doesn’t have speakers or a microphone, so I had to set up Skype on a netbook, which made for fun when somebody sent me a file or a url.
  • The cut and paste functionality is quite different in Linux, and required some getting used to. It wasn’t very consistent between apps.
  • The RDC client I was using on Linux didn’t translate the local printer so that it appeared as my default printer on my Windows session like the Mac RDC client does.
  • I couldn’t figure out how to switch desktops with a keystroke, especially not when I was RDC’ed into work.
  • Without my Mac, I couldn’t listen to my podcasts, and I couldn’t pay bills.
  • Most importantly: my laptop has a 17″ 1920×1200 screen, and I also plug it into this 20″ 1080i screen. With the laptop gone I had only the one screen to use for Linux, and I felt very hemmed in.

So I’m glad to be back. But I’m having to retrain my fingers for cutting and pasting with Command instead of Control.

Today’s discovery about Google Chrome

If you use Google Chrome as your web browser, right click on the address bar and choose “Edit Search Engines”. You’ll discover that every web site you’ve ever been to with a search box, including this blog, has been installed as a “Search Engine”. And you can search that site by typing the domain name (like blog.xcski.com) followed by a space followed by your search terms. (I haven’t tested to see if “Clear Browsing History” clears this as well, but if it doesn’t, that might be a surprise if you think you’ve covered your tracks.)

But another interesting use of this is that you can change the shortcut. So if I double click on the entry for Wikipedia and change the “Keyword” from “en.wikipedia.org” to “wiki”, I can search Wikipedia by typing command-L to highlight the current address in the address bar, then typing “wiki Stephen Fry” and hitting return, which takes me directly to the Wikipedia page about Stephen Fry.
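In case it isn’t obvious what the keyword is doing, here is a rough sketch in Python of the expansion as I understand it: each entry pairs a keyword with a URL in which “%s” stands for the search terms. The `SEARCH_ENGINES` table and the `expand` function are my own illustration, not Chrome’s code, and both URL templates are assumptions (the one for this blog guesses at a WordPress-style `?s=` search).

```python
from urllib.parse import quote_plus

# Hypothetical keyword table; "%s" marks where the search terms go.
SEARCH_ENGINES = {
    "wiki": "https://en.wikipedia.org/w/index.php?search=%s",
    "blog.xcski.com": "http://blog.xcski.com/?s=%s",  # assumed search URL
}


def expand(address_bar_text):
    """Return the search URL for 'keyword terms', or None if no keyword matches."""
    keyword, _, terms = address_bar_text.partition(" ")
    template = SEARCH_ENGINES.get(keyword)
    if template is None or not terms:
        return None
    # URL-encode the terms and substitute them for the placeholder.
    return template.replace("%s", quote_plus(terms))


print(expand("wiki Stephen Fry"))
# https://en.wikipedia.org/w/index.php?search=Stephen+Fry
```

Anything typed without a known keyword falls through (`None` here), which is where the browser would treat the text as an ordinary address or a default search.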

Lifehacker has an article about some other ways you can use this Search Engine capability to be able to do things like enter a Google Calendar event from the address bar.

Oh, I give up

I’ve been reading Slashdot since the very late 1990s. But as of today I’ve unsubscribed from its RSS feed. It is probably long past time. The comments have been unreadable for 4 or 5 years now, but I kept up with it because the articles had some merit, or at least most of them did.

But today was the last straw. First there was an article that was going all hysterical about the changing of the runway numbers at Tampa airport, suggesting it had something to do with the recent bird deaths, and/or with the ridiculous and easily discredited “Pole Shift” rapid magnetic pole shifting hypothesis. Then, as if that wasn’t crazy enough, there was another story, this time asking what equipment would be useful in a ghost investigation. So much for “News for Nerds, Stuff That Matters”.

Sorry, I know a number of people who believe in ghosts, and people who watch those ridiculous “ghost investigation” shows, but it’s neither news nor technology.

Computer problems

My laptop’s AirPort (wifi) is kind of flaky. It reports a good connection and gets a proper IP address from DHCP, but then stops being able to talk to the rest of the network. I’m able to keep using it because while it’s at my desk I can plug it into the wired network.

But the laptop is still covered by AppleCare, at least for the next 60 days. So I should bring it in to get it fixed soon. But I can’t leave it at Apple because every day it’s away from me is a day I can’t work. So before I can take it in, I have to figure out how to do my work from my old Linux box, or if that doesn’t work, where to borrow a Mac to work on.

Let’s see, to do my work on a Linux computer, I’d need the following things that I don’t currently have:

  • speakers (I think I have some around, I’ve just never configured them)
  • microphone (I’ve never installed one on Linux, that could be tricky)
  • Skype
  • Dropbox
  • Chrome
  • Remote Desktop Client (if such a thing exists and works)
  • VirtualBox and a Windows environment (which might take care of the Remote Desktop Client)

That will probably be enough to get me going. But it’s obviously not as nice as having my own MacBook Pro, or even a loaner machine that I’ve cloned my Time Machine backup onto.