Thursday, 28 January 2010

Search privacy: GoogleSharing, StartPage review & overview






This post is about GoogleSharing, which helps improve your privacy when you're doing a search on Google, and StartPage, a privacy-preserving search engine. Very topical for Data Privacy Day, which is today!

Google currently (since August 2008) keeps your search data for 9 months. Microsoft have just reduced their data retention period for IP addresses to 6 months for their Bing search engine, while Yahoo only keeps data for 3 months.

As the AOL search data debacle has shown, it's not that hard to identify people from their search data, and of course to find out more about them based on what they're searching for. Plus, the US government might be able to demand your data from Google etc without a court order, due to the sweeping powers it has under the PATRIOT Act, given that these companies' servers are in the USA.

GoogleSharing for search privacy

The new GoogleSharing service hides your Google search data from Google, or more accurately obfuscates it, by essentially mixing up your Google search activities with those of other GoogleSharing users. It's free, but do consider donating to help keep it going if you use it.

That way, Google can't tell (and track) which IP address is whose, thus providing a degree of anonymization.

It acts as what's called a proxy server (an intermediary between your browser and the ultimate destination site). Furthermore, on each use of a Google service, GoogleSharing assigns you a different "identity" as far as Google is concerned, and it regularly "injects" false search requests for all the identities in order to obscure things even further. You don't get any cookies from Google.
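To make the identity mixing a little more concrete, here's a toy Python sketch of the general idea: a shared pool of identities, a randomly picked identity per request, and decoy queries added for every identity. This is only an illustration under my own assumptions, not GoogleSharing's actual code; the IdentityPool class, its method names and the sample queries are all made up.

```python
import random


class IdentityPool:
    """Toy model of a shared identity pool (hypothetical, not GoogleSharing's code)."""

    def __init__(self, size, decoy_queries, seed=None):
        self.rng = random.Random(seed)
        # Made-up identity labels; a real proxy would hold real session state.
        self.identities = [f"identity-{i}" for i in range(size)]
        self.decoy_queries = list(decoy_queries)
        # What the search engine would see: (identity, query) pairs.
        self.log = []

    def search(self, query):
        # An identity is picked at random for each request, so one user's
        # searches are spread across the pool rather than tied to one identity.
        identity = self.rng.choice(self.identities)
        self.log.append((identity, query))
        return identity

    def inject_decoys(self):
        # Add one made-up query per identity to blur each identity's history.
        for identity in self.identities:
            self.log.append((identity, self.rng.choice(self.decoy_queries)))


pool = IdentityPool(size=5, decoy_queries=["weather", "news", "recipes"], seed=1)
for real_query in ["privacy tools", "firefox add-ons", "data privacy day"]:
    pool.search(real_query)
pool.inject_decoys()

for identity, query in pool.log:
    print(identity, "->", query)
```

Looking at the printed log, the search engine's view is a jumble of real and decoy queries spread over several identities, which is the "mixing up" described above.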

And GoogleSharing uses https for the connection between your Firefox browser (with GoogleSharing) and the GoogleSharing server, which makes your search much more secure from third parties spying on it e.g. over wifi.

Scope - what services does GoogleSharing work on?

It works for all Google services that you don't have to log in to use, e.g. currently Google search, Google Images, Google Maps, Google News, Google Video, Google Products, Google Finance, and viewing Google Groups - but not Gmail or Google Calendar, etc.

Also Google Translate, according to Moxie Marlinspike of Thoughtcrime, the brains behind GoogleSharing.

How to get GoogleSharing?

It only works on the free Firefox browser.

  1. Download Firefox if you haven't got it already (at least for better security than Internet Explorer - that vulnerability's since been patched but more have emerged - and the ability to have better browser security generally).

  2. Then install the GoogleSharing extension (but it's still an experimental add-on so you have to tick the "Add to Firefox" box before you can install it, and if all goes haywire you've been warned!).

How to use GoogleSharing?

Once you've installed the add-on and restarted Firefox, that's it. You'll see "Google Sharing Enabled" in your Firefox status bar (grey bar at the bottom of the browser window):

As long as you use Firefox as your browser when using Google services, it just does the identity mixing up behind the scenes; you don't have to do anything else after you've installed it. As GoogleSharing say, it's completely transparent to the user, just install and forget about it. Searches may be a teensy bit slower, but not noticeably so in my experience anyway.

UPDATE: in case you wonder, it also works (once installed) if you search via the Google Toolbar instead of the Google webpage.

Note for non-techies: even if you've installed GoogleSharing and have Firefox open, if you then use a Google service via Internet Explorer, Safari, Opera or Chrome i.e. any browser other than Firefox, your IP address and usage will still be tracked.

Clicking on the "Google Sharing Enabled" button pictured above allows you to disable GoogleSharing should you want to (and clicking again re-enables it). Right-clicking similarly allows this, and also lets you set its Options:

- such as "Edit proxy", where you can choose the language and also stop it from anonymising selected Google services if you wish:

What if you're logged in to Google in one tab (e.g. for Gmail), and you do a Google search in another tab?

Your Gmail, Google Calendar etc still works but your data won't be anonymised.

However, your Google searches in a different tab will still be anonymised - Moxie Marlinspike kindly confirmed that to ACE (and will be updating the GoogleSharing FAQs to clarify that point).

Won't GoogleSharing capture our search data instead?

Your search requests and other requests to Google services get sent through GoogleSharing's server and back to you through them, that's true.

But they say they don't log anything, and they also say that if you are suspicious about them you can check out their code freely for yourself, and even run your own GoogleSharing proxy server or use the GoogleSharing server of someone you do trust, to do the anonymisation. (The Preferences page in the Options is where you can add another proxy server, if you prefer).

And they don't transmit login credentials or cookies.

Tips on other ways to protect your search privacy - StartPage etc

Clicking on Google search results

When you do a normal Google search and you click on any search result, Google knows exactly which links you've clicked on.

Though GoogleSharing will hide or mix up search results clicks too (UPDATE: Moxie has kindly confirmed this), see also my post on how to stop Google from knowing which search results you clicked on - again using a Firefox extension, which at least gets rid of the gunk in the URLs if you want to copy and paste a search result link to email to someone else etc.
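For the curious, the "gunk" is typically a redirect link that wraps the real destination so that the click can be recorded. Here's a minimal Python sketch of pulling the real address back out, assuming a link of the form /url?q=... or /url?url=...; the sample link and the extract_target name are made up for illustration, and this isn't how the Firefox extension itself works.

```python
from urllib.parse import urlparse, parse_qs


def extract_target(link):
    """Return the destination wrapped in a redirect-style link.

    If the link doesn't look like a redirect, it is returned unchanged.
    """
    parsed = urlparse(link)
    if parsed.path == "/url":
        params = parse_qs(parsed.query)
        # Assumed parameter names for the wrapped destination.
        for key in ("q", "url"):
            if key in params:
                return params[key][0]
    return link


# A made-up example of a click-tracking link.
tracked = "https://www.google.com/url?q=https%3A%2F%2Fexample.org%2Fpage&sa=U"
clean = extract_target(tracked)
print(clean)
```

Note that this only tidies up a link for copying and pasting; it does nothing to hide a click you've already made.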

Scroogle Scraper

You can alternatively search Google by visiting Scroogle, which again acts as an intermediary between you and Google, but doesn't add in fake identities (but note it's Scroogle.org, do NOT go to Scroogle.com, it's pron! Or maybe that warning's just sent some people there…).

You go to the Scroogle webpage rather than Google's to search, so you do have to search Google in a different way.

But a plus is that Scroogle works with all main browsers including Internet Explorer, not just Firefox, and there are plugins etc to help facilitate its easier use. Again, do consider donating if you use it.

Ixquick / StartPage

European-based meta search engine Ixquick, an ad-funded service, promises not to record your IP address (computer address) or your search terms at all, and have now got themselves an additional, more memorable, name: StartPage.com. They will search other search engines for you including Microsoft's Bing, but not Google.

Again you'll have more security from wifi interception of your searches if you use their https address to start searching from.

They have also just introduced (and see their YouTube video) a proxy service, so that when you click through on one of their search results, the website you visit won't know your IP address either. To repeat, it hides your IP address etc from the site you click through to, not just from the search engines it uses. GoogleSharing doesn't do that, it's not designed to; it provides some privacy protection for Google searches while not slowing you down too much.

Just click on the "Proxy" link in the StartPage / Ixquick search results to go to the site "invisibly" via Ixquick (but I think it should say "Hide me" or "Cloak my visit", not "Proxy", in order to avoid scaring off non-techies).

Not only that, but whenever you click a link on the site you visit via the "Proxy" link, StartPage should proxy that visit too, and keep "passing on" the webpages to your browser so that the site you visit only sees StartPage as the origin. You can tell when your visits are being "cloaked" because it'll still show StartPage's logo in a horizontal bar at the top of the page:

However, as you can see from the screenshot above, not all sites may work properly if you go to them via Ixquick, especially sites using the ubiquitous Javascript. Again, a proxied webpage visit involves more steps and will be slower than a direct "fully uncloaked" visit.

Miscellaneous - Google Web History; third party widgets

If you have the Google Toolbar, Google Web History (Web History Help links) stores on Google's servers the history of all the website visits you make in the same browser while you're signed in to your Google Account, i.e. to any Google service that needs a login (e.g. Gmail). So you could think about pausing or deleting your Web History, or at least limiting it to store searches only. You'll see the Web History link at the top right of the Google search page. The ultra-paranoid might suggest that when you "stop storing your web activity in Web History", you just don't see it any more in your Web History page, but it's still stored on Google's servers for them to analyse... I just don't know enough about what they do behind the scenes, though Google have today published "privacy principles".

Moving on, be aware that it's not just search engines but websites generally, even blogs, that can collect info about you when you visit their webpage, especially if they have widgets which allow third party sites like Google or MyBlogLog to get info - see e.g. my blog's Privacy Policy about this blog's use of Google Analytics and other widgets etc.

How do you protect your privacy when visiting those sites? That's another whole huge area, and beyond the scope of this blog post.

Most people already know about cookies and not saving or deleting them.

Visiting the site through a proxy, e.g. via clicking StartPage's search results "proxy" link, should also help.

I don't know enough yet about proxy servers to know if StartPage's proxy (or other proxy servers) will strip out identifying "browser fingerprint" information like the User Agent header your browser sends out, though; I'd have hoped so. That's right, even if they don't know your IP address, websites can still identify you from your individual browser configuration, and it seems many of them are doing so. UPDATE: Moxie confirms that GoogleSharing will not leak your browser's User Agent info to Google.
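As a rough idea of what "stripping out" fingerprint information could look like, here's a small Python sketch of a proxy tidying up request headers before passing them on. It's purely illustrative: I don't know what StartPage's or GoogleSharing's proxies actually do, and the scrub_headers function, the list of headers to drop and the generic User Agent value are all my own assumptions.

```python
# Placeholder value every user would share (an assumption, not a real product's setting).
GENERIC_USER_AGENT = "Mozilla/5.0"

# Headers that can identify or track a visitor (an illustrative, incomplete list).
IDENTIFYING_HEADERS = {"cookie", "referer", "x-forwarded-for", "via"}


def scrub_headers(headers):
    """Return a copy of the request headers with identifying details removed."""
    cleaned = {}
    for name, value in headers.items():
        lower = name.lower()
        if lower in IDENTIFYING_HEADERS:
            # Drop the header entirely.
            continue
        if lower == "user-agent":
            # Replace the browser's own User Agent with the shared generic one.
            value = GENERIC_USER_AGENT
        cleaned[name] = value
    return cleaned


# Made-up headers such as a browser might send.
original = {
    "User-Agent": "Mozilla/5.0 (Windows; U; Windows NT 5.1) Firefox/3.6",
    "Cookie": "id=12345",
    "Accept": "text/html",
}
scrubbed = scrub_headers(original)
print(scrubbed)
```

Headers are only part of a browser fingerprint, though; things like installed plugins and fonts are picked up in other ways, so this sort of scrubbing on its own wouldn't make you unidentifiable.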

What about other Google services like Google DNS? Goo.gl?

In a later post I'll deal with privacy risks with other Google services, other steps you can take to improve your privacy when using Google services, and some other issues with privacy and Google generally.

There ain't no such thing as a free lunch. A search engine like Google provides its Web services for free, to consumers anyway (though it's started to get paid directly for things like enterprise search tools and Google Apps).

It doesn't charge users but makes most of its money from ads, as is well known. So in return for giving us free search, Gmail etc, Google serves up "targeted" ads; the more relevant the ads are to us, the more likely we are to click on them and make money for Google. It's a trade off, and it's fair enough that they should get something out of it.

But it's natural that Google and other web businesses will want to get as much info out of us as possible, and keep the data for as long as possible. Conversely, many users want more control over what personal data of theirs is gathered and how it's used, and more information about what happens to their data afterwards (buzzword "transparency").

Or, perhaps, users sense that their data is valuable (which it is, of course) and want more than just free searches in return for giving it up, they want more say about the terms on which they're letting their personal information be collected and used (e.g. a share of the money?) - especially as once personal data is out there on the big bad interwebs, it's out there forever.

The two desires seem diametrically opposed. The biggest and toughest question is, where should the balance be struck, and how do we find the right balance? We're clearly not there yet.

2 comments:

Anonymous said...

Google's "Privacy Principles" and "Privacy Center" are worded in a manner that speaks only about personal information you voluntarily give to Google, and carefully skirts any mention of the information they archive for their own purposes from your use of their services. Google is also completely silent in their Privacy Center about their use of History or what happens if you opt out of the collection of History. Given the draconian nature of Google's pronouncements and actions regarding privacy, there is every reason to believe that whether you have your History showing as a convenience for yourself, or not, Google continues to collect it for their own purposes and subject to subpoena from the Government, the police & prosecutors, your creditors, and your wife's divorce lawyer. In a CNBC interview a few weeks ago, Google's CEO showed absolutely no concern about the privacy of its customers by stating: "If there is anything you wouldn't want known, then you shouldn't be searching for it." That sort of cavalier attitude does not inspire confidence in the good-will of Google.

Anonymous said...

Since Google generates income through the public's 'clicking', it would be really interesting to see how many people would still complain of privacy issues 'if' they were to get a 50/50 split on the profit generated in co-op with Google. Just a thought.