Thursday, February 03, 2011

Google, Bing, and web browsing data

I suppose I should comment, as everyone else on the planet has, on Google's claim that Bing is copying their results.

My reaction is mostly one of surprise. I am surprised that Google wants this issue discussed in the press. I am surprised that Google wants this aired in the court of public opinion.

Google is trying to draw a line on what use of behavior data is acceptable. Google clearly thinks they are on the right side of that line, and I do too, but I'm not sure the average searcher would agree. And that is why Google is playing a dangerous game here, one that could backfire on them badly.

Let's take a look at what Google Fellow Amit Singhal said:
This experiment confirms our suspicion that Bing is using some combination of:
  • Internet Explorer 8, which can send data to Microsoft via its Suggested Sites feature
  • the Bing Toolbar, which can send data via Microsoft’s Customer Experience Improvement Program
or possibly some other means to send data to Bing on what people search for on Google and the Google search results they click.
Of course, what Amit does not mention here is that the widely installed Google Toolbar and the fairly popular Google Chrome web browser send very similar data back to Google, data about every page someone visits and every click they make. Moreover, Google tracks almost every web search and every click after a web search made by web users around the world, since almost every web search is done on Google.

By raising this issue, Google is very publicly trying to draw a particular line on how toolbar and web browsing data should be used, and that may be a dangerous thing for Google to do. The average searcher, for example, may want that line drawn somewhere other than where Google expects -- they may want it drawn at not using any toolbar/Chrome data for any purpose, or even at not using any kind of behavior data at all -- and, if the line ends up somewhere other than where Google wants it, Google could be hurt badly. That is why I am surprised that Google is coming out so strongly here.

As for the particular issue of whether this is copying or not, I don't have much to say on that, but I think the most thought-provoking piece I have seen related to that question is John Langford's post, "User preferences for search engines". John argues that searchers own their browsing behavior and can reveal what they do across the web to whomever they want. Whether or not you agree with that, it is worth reading John's thoughts on it and considering what the alternative might be.

Update: In the comments, Peter Kasting, a Google engineer working on Chrome, denies that Chrome sends clickstream data (the URLs you visit) back to Google. A check of the Google Chrome privacy policy ([1] [2]) appears to confirm that. I apologize for getting that wrong and have corrected the post above.

Update: A few days later, search industry watcher Danny Sullivan writes, "In short, Google doesn’t occupy any higher ground than Microsoft, from what I can see, when it comes to using data gathered from browser add-ons to improve its own services, including its search engine." Whether you think Danny is right or not, his article demonstrates that Google was wrong in thinking they would easily win if they took this fight to the press.

7 comments:

pkasting said...

Greg, I work on Chrome, and you are dead wrong in your claim that Chrome sends clickstream data (or the list of pages the user visits, or similar) back to Google. We don't collect that kind of data even if users turn on the off-by-default "anonymous usage statistics" checkbox, and we never have.

Greg Linden said...

Peter, I got that wrong. I had heard a rumor that Chrome did keep that data, but the rumor appears to be wrong. I'll correct the post. Sorry about that, I should have made more of an attempt to verify the information before posting this.

pkasting said...

Thanks for the fast update, Greg!

Note: You may want to change an instance of "toolbar/Chrome" to "toolbar".

Greg Linden said...

Hi, Peter. I left that last toolbar/Chrome as is on purpose, since Chrome does send some limited data back, and that is part of a sentence talking about banning any use of any data. But, if you think it is misleading still, let me know and I'll change it. And, again, sorry for my mistake.

Michael Nielsen said...

The Chrome privacy policy is pretty vague: "When you type URLs or queries in the address bar, the letters you type are sent to your default search engine [Google, for almost all users] so the Suggest feature can automatically recommend terms or URLs you may be looking for."

Or, in other words, the URLs you type are sent to Google. It's sort of implied that this will only ever be used to make suggestions, but I don't find that very reassuring.

Very nice post --- seems dead on.

pkasting said...

@Michael Nielsen, we've written about this several times. A few references are http://blog.chromium.org/2008/10/google-chrome-chromium-and-google.html to describe more about typing in the address bar and http://googleblog.blogspot.com/2008/09/update-to-google-suggest.html for details of how Google logs that data.

There are also a number of other conditions which we impose in order to avoid seeing too much of your data, such as not triggering suggest for local files, URLs with queries, anything with a username, strings that you type quickly and hit enter, etc. If you'd like to know more, feel free to email me at pkasting@chromium.org.

Abhijeet said...

Your whole point, "what if you collect user search data to improve user experience, Google does the same", seems to miss the fact that if you collect users' Google search data to improve users' Bing experience, you need to cite Google in your results.