An excerpt on learning to rank from searcher behavior:
Our goal is very simple: We want to return to the user the answer that they need.
The results we show you are based not only on what we know of the Web, but also what other people have searched for .... Signals from people are the best signals.
An excerpt on personalization:
If you allow us to keep your Web history, we will improve your search ... in two ways. One is, we will tune the result for you slightly. We're not going to change the whole page -- we might change position 5 to position 3 here and there, but we'll use whatever we can from your previous searches to adapt the current search to you.
The second is, we allow you to search within your Web history ... You may remember something you did three months ago and you don't remember exactly how you did it.
And, an excerpt on how improving overall relevance requires accumulating many small twiddles that apply to specific types of searches and needs:
Last year we made over 450 improvements to the algorithm .... We have to find what weakness in the algorithm caused that result and find a general solution to that, evaluate whether a general solution really works and if it's better, and then launch a general solution.
I'll give you an example of something that came last week. We were evaluating a certain algorithm that adds diversity to the result. We did live experiments, which means we launched the algorithm to a very small percentage of users and then see how that compares to the result without the algorithm.
One of the queries that made a difference: The query was, New York Times address ... The first result right there on the snippet gives you The New York Times. It turns out that's not what the user was looking for. They were looking for an address given out by a New York Times reporter the day before. And because of this diversity and because of our emphasis on freshness and highlighting fresh results, that particular address appeared somewhere in the results, and that's what the user wanted -- that's what they went to and got the result.
That was something that surprised even us. You don’t think that when someone searches for New York Times address that they’re not looking for the address. Language is like that. Intention can be ambiguous.
There is a recurring theme in Udi's comments: learning from searchers' behavior. Google learns from what people search for. Google learns from a searcher's history. Google learns what works and what does not from what people click on. Google learns from what people do when they use Google.
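To make the live experiment described above more concrete, here is a minimal Python sketch of the general idea: expose a change to a very small, stable share of users and compare what they do against everyone else. This is only an illustration under stated assumptions, not Google's method. The function names, the hash-based bucketing, and the use of click-through rate as the comparison metric are all hypothetical choices made for the example; the excerpt does not say how users were selected or what was measured.

```python
import hashlib


def in_experiment(user_id: str, experiment: str, fraction: float) -> bool:
    """Deterministically place roughly `fraction` of users in the experiment.

    Hashing the experiment name with the user id (an assumed scheme) keeps
    each user in the same arm on every visit.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # value in [0, 1)
    return bucket < fraction


def compare_arms(log, experiment: str, fraction: float) -> dict:
    """Compute click-through rate per arm from (user_id, clicked) records.

    Click-through rate is an assumed stand-in for whatever comparison
    a real experiment would use. An arm with no traffic reports None.
    """
    shown = {"control": 0, "treatment": 0}
    clicks = {"control": 0, "treatment": 0}
    for user_id, clicked in log:
        arm = "treatment" if in_experiment(user_id, experiment, fraction) else "control"
        shown[arm] += 1
        clicks[arm] += int(clicked)
    return {arm: (clicks[arm] / shown[arm] if shown[arm] else None) for arm in shown}


if __name__ == "__main__":
    # Hypothetical log: every third user clicked a result.
    log = [(f"user{i}", i % 3 == 0) for i in range(10000)]
    print(compare_arms(log, "diversity-ranking", 0.01))
```

A comparison like this only shows that the two arms differ. As the New York Times address example suggests, someone still has to look at individual queries to understand why.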
For more on that, please see also my previous post, "Actively learning to rank".
Please see also my earlier post, "The perils of tweaking Google by hand", which discusses whether these thousands of twiddles to the search engine, and variations of them, should be tested constantly rather than evaluated only at the time they are created.