Tuesday, July 11, 2006

Personalized search at SIGIR 2006

Update: My post is wrong, including the title. The first author of the paper, Eugene, contacted me to let me know I had misinterpreted the work.

This research work extends the use of document features in some earlier work on RankNet with new "user action" features.

However, every searcher sees the same search results in the system. Eugene said that "personalization is a very interesting direction for future research," but that this system "does not do personalization."

I apologize. I was too quick to link this paper with some of Susan Dumais' other work.

Original post:

Microsoft researchers Eugene Agichtein, Eric Brill, and Susan Dumais have a paper on personalized search, "Improving Web Search Ranking by Incorporating User Behavior Information" (PDF), at the upcoming SIGIR 2006 conference.

The paper reports some good successes with reordering search results based on each searcher's clickstreams:
We show that incorporating user behavior data can significantly improve ordering of top results in real web search setting.

We show that incorporating implicit feedback can augment other features, improving the accuracy of a competitive web search ranking algorithms by as much as 31% relative to the original performance.

The general approach is to re-rank the results obtained by a web search engine according to observed clickthrough and other user interactions for the query in previous search sessions. Each result is assigned a score according to expected relevance/user satisfaction based on previous interactions.

The basic idea is to keep a history of clicked results for each searcher, learn which document features each searcher seems to prefer, and then bias the search results toward those features.
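To make that basic idea concrete, here is a minimal sketch of what biasing results toward learned feature preferences might look like. This is my own toy reconstruction, not the paper's actual method; the function names, the averaging step, and the alpha blending weight are all illustrative assumptions.

```python
# Toy sketch: keep per-searcher click history, estimate which document
# features the searcher prefers, then bias each result's score toward
# those features. All names and choices here are illustrative.

def rerank_by_feature_preferences(results, user_clicks, alpha=0.5):
    """results: list of (doc_id, base_score, features), where features is
    a dict of feature name -> value.
    user_clicks: list of feature dicts from previously clicked documents."""
    # Estimate the searcher's preference for each feature as its average
    # value over previously clicked documents.
    prefs = {}
    for feats in user_clicks:
        for name, value in feats.items():
            prefs[name] = prefs.get(name, 0.0) + value / len(user_clicks)

    def score(item):
        doc_id, base, feats = item
        # Dot product between the document's features and the learned
        # preferences, blended with the original ranker's score.
        bias = sum(prefs.get(n, 0.0) * v for n, v in feats.items())
        return (1 - alpha) * base + alpha * bias

    return sorted(results, key=score, reverse=True)
```

Even this crude version shows where the sparse-data problem comes in: with only a handful of clicks per searcher, the preference estimates are noisy.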

In table 6.1 in the paper, there is one result I find particularly interesting. They got a lot of lift merely from increasing the rank of any document on which the searcher clicked in the past (referred to as "BM25F-RerankCT"). That is, without learning a user model at all, it helps quite a bit to simply favor what the searcher clicked on before.
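A toy version of that "favor anything this searcher clicked before" idea is almost trivial to write down, which is part of what makes the lift surprising. This is a hedged sketch of the general idea, not the paper's BM25F-RerankCT implementation:

```python
# Toy sketch of click-boost reranking: move previously clicked documents
# to the top, preserving the original relative order within each group
# (a stable partition). Purely illustrative.

def rerank_clicked_first(results, previously_clicked):
    """results: list of (doc_id, score) in ranked order;
    previously_clicked: set of doc_ids the searcher clicked on before."""
    clicked = [r for r in results if r[0] in previously_clicked]
    rest = [r for r in results if r[0] not in previously_clicked]
    return clicked + rest
```

For example, with results [("a", 3.0), ("b", 2.0), ("c", 1.0)] and a past click on "c", this yields c, a, b.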

Most of the paper focuses on methods for learning which document features are of most interest to a given searcher. It is a complicated approach that, I suspect, will struggle with sparse data and over-generalization.

Given the success of just increasing the rank of anything I clicked on before, I would be interested in seeing more of a social filtering approach. They already increase the rank of search results I clicked on. Now, also increase the rank of search results people like me clicked on. Help me find what I need using what others have found.
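One way to sketch that social filtering idea: boost a result in proportion to how many similar searchers clicked it, with similarity measured by click-history overlap. This is my own speculative sketch; the Jaccard similarity, the weight parameter, and all names are assumptions, not anything from the paper.

```python
# Speculative sketch of social click-boosting: results clicked by users
# with similar click histories get a bonus. Illustrative only.

def jaccard(a, b):
    """Overlap between two sets of clicked doc_ids."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

def social_boost(results, my_clicks, other_users_clicks, weight=0.5):
    """results: list of (doc_id, score); my_clicks: set of doc_ids;
    other_users_clicks: list of click sets, one per other searcher.
    Each result gets a bonus summed over similar users who clicked it."""
    def score(item):
        doc_id, base = item
        bonus = sum(jaccard(my_clicks, clicks)
                    for clicks in other_users_clicks if doc_id in clicks)
        return base + weight * bonus
    return sorted(results, key=score, reverse=True)
```

The appeal is the same as with the per-user click boost: no feature-level user model to learn, so much less of a sparse-data problem per individual searcher.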

By the way, if you like this MSR paper, you might also be interested in another paper by the same authors to appear at the same conference, "Learning User Interaction Models for Predicting Web Search Result Preferences" (PDF). That other paper spends more time on learning which document features a searcher seems to like, as well as exploring some ideas for reranking search results around a previously clicked result.

2 comments:

jeremy said...

I'll have to read the paper to get the full answer... but what is the difference between what they've done now, and the decades-old technique of relevance feedback? Are they somehow doing more intelligent feature selection?

jeremy said...

So it sounds like they're doing some sort of aggregation of (implicit) relevance feedback, eh?