Friday, August 10, 2007

Effectiveness of personalized search

Zhicheng Dou, Ruihua Song, and Ji-Rong Wen from Microsoft Research had a great paper at WWW 2007, "A Large-scale Evaluation and Analysis of Personalized Search Strategies" (PDF).

There are three major conclusions in the paper: (1) Personalization only helps on some queries. (2) Both long-term and short-term history are important for personalization. (3) Profile-based techniques do not work as well as more fine-grained, click-based techniques.

On how personalization helps only on some queries, the key concept here is "click entropy", the amount of variation in the search results searchers click on.

For queries that are not ambiguous, the top result may already be ideal. If almost everyone clicks on that result, that query's click entropy is low. At least for these queries, there is little opportunity to improve the ranking using personalization (or, for that matter, using any other technique).

Thus, the authors conclude, "Personalized search has different effectiveness on different queries," and, "Click entropy can be used as a simple measurement on whether the query should be personalized."
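To make click entropy concrete, here is a minimal sketch in Python. It assumes click entropy is the Shannon entropy (in bits) of the distribution of results clicked for a query, which matches the description above. The function name and the toy click lists in the usage note are made up for illustration.

```python
import math
from collections import Counter


def click_entropy(clicked_urls):
    """Shannon entropy, in bits, of the clicked results for one query.

    clicked_urls: list of URLs, one entry per click on a result for the query.
    Returns 0.0 when every click went to the same result (or there are no clicks).
    """
    counts = Counter(clicked_urls)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # p * log2(1/p) summed over each distinct clicked URL, with p = n / total
    return sum((n / total) * math.log2(total / n) for n in counts.values())
```

A query where everyone clicks the same result, such as `click_entropy(["a.com"] * 10)`, gets an entropy of zero, while clicks spread evenly over four results give two bits. Under this sketch, only the second kind of query would be a candidate for personalization.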

On long and short-term history, the question here is whether personalization should focus on what you are doing right now (the last few searches) or your general interests (everything you have ever done).

The authors conclude that "the incorporation of long-term interest and short-term context can gain better performance than solely using either of them."
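One simple way to picture that combination is a weighted blend of two per-result scores, one computed from the searcher's full history and one from the current session. This is only a sketch of the idea, not the method from the paper; the function name, the `weight` parameter, and the score dictionaries are assumptions for illustration.

```python
def blend_scores(long_term, short_term, weight=0.5):
    """Linearly interpolate long-term and short-term personalization scores.

    long_term, short_term: dicts mapping URL -> score (hypothetical inputs).
    weight: share given to the long-term score; the rest goes to short-term.
    A URL missing from either dict is treated as having a score of 0.0 there.
    """
    urls = set(long_term) | set(short_term)
    return {
        url: weight * long_term.get(url, 0.0)
        + (1 - weight) * short_term.get(url, 0.0)
        for url in urls
    }
```

Setting `weight` to 1.0 or 0.0 recovers the pure long-term or pure short-term case, which makes it easy to compare either one against the blend.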

On profile-based versus click-based techniques, profile-based techniques record the high-level categories of interest for each searcher, while click-based techniques adjust the ranking using past and related clickthroughs.

The authors conclude that click-based techniques "work well", but profile-based "improve the search accuracy on some queries, but they also harm many queries" and are "not as stable as click-based".
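As a rough illustration of the click-based idea, the sketch below promotes results that the same searcher clicked before for the same query. It assumes a score equal to the smoothed fraction of that searcher's past clicks on the query that went to each page. The smoothing constant `beta`, the function names, and the data layout are my assumptions and may not match the paper's exact formulation.

```python
from collections import Counter


def personal_click_scores(user_clicks, query, beta=0.5):
    """Score each URL by the share of this user's past clicks for `query`.

    user_clicks: list of (query, url) pairs from one searcher's history.
    beta: smoothing constant (an assumed value) that damps scores when
          there are only a few past clicks.
    """
    counts = Counter(url for q, url in user_clicks if q == query)
    total = sum(counts.values())
    return {url: n / (total + beta) for url, n in counts.items()}


def rerank(results, scores):
    """Sort results by descending score; ties keep their original order."""
    return sorted(results, key=lambda url: -scores.get(url, 0.0))
```

Because the sort is stable and unseen results score zero, a searcher with no history for a query gets the original ranking back unchanged, which fits the finding that click-based techniques rarely harm queries.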

These conclusions are not far from what I advocate on this blog. I pick on Google's personalized search for primarily being a profile-based technique focused on long-term history ([1] [2] [3]) and argue for a fine-grained, click-based approach. This MSR paper judges click-based techniques more effective, but also suggests that combining click-based and profile-based algorithms and using both long and short-term history may yield the best results.

See also my Sept 2006 post, "Potential of web search personalization", where I talk about a KDD 2006 paper out of Yahoo Research that comes to some of the same conclusions as this MSR paper.


Anonymous said...

"I pick on Google's personalized search for primarily being a profile-based technique focused on long-term history ([1] [2] [3]) and argue for a fine-grained, click-based approach."

And to beat my usual drum, the ultimate short term, fine-grained, click-based approach is the 30-year-old notion of relevance feedback.

It is really surprising to me that it has taken us so long to come back to where we began, where we should have been all along.

I think one of the first things I ever wrote on your blog, Greg, was a pointer to a paper from 2003 where an old colleague of mine had done profile-based personalization and found that it didn't work.

I'm not as good at keeping track of my old posts as you are, though. I'm guessing it was somewhere around April of 2006 that I made that comment?

Anyway, I'm not surprised.

Anonymous said...

Thanks for the links to these papers. This paper, "Experimental Bounds on the Usefulness of Personalized and Topic-Sensitive PageRank" from Web Intelligence 2007, has some interesting results too.