Tuesday, January 15, 2008

LambdaRank, RankNet, and MSN Search

Microsoft Researchers Ping Li, Chris Burges, and Qiang Wu wrote a June 2007 technical report, "Learning to Rank using Classification and Gradient Boosting", that follows up on the RankNet paper. That older paper was particularly interesting because RankNet apparently made it live into the MSN Search engine.

The goal of this kind of work is to automatically learn what makes a good relevance rank for web search, certainly fun and useful stuff. In this case, the learning is from hand-labeled training data, but there have been other attempts to learn from clickstream data.

After reviewing the older RankNet paper, I said I was surprised to see a neural network used and that it was hard to tell how effective RankNet is because it is was not compared to other approaches. This 2007 tech report fixes that, evaluating the newer LambdaRank NNet against three other non-NNet approaches, but unfortunately reports (in Section 6.2) that the NNet had relatively poor performance.

One thing I am curious about in the paper is whether what they are optimizing for is really well correlated with user satisfaction. It would seem to me that their particular definition of discounted cumulative gain (as described in Section 2) would underweight the value of the first search result and overweight the value of later results relative to other possible definitions, especially for results below the fold on the page. It is worth pointing out that their measure differs from the metric used by Huffman and Hochster which much more heavily weighs the first few results (especially for navigational queries).

By the way, I have to admit, I found this 2007 tech report quite difficult to read. It may have just been me but, if you decide to attack it, you probably should be prepared to put in some effort.

If you are interested in this, you might also be interested in my June 2006 post, "Beyond PageRank, RankNet, and fRank", which looks at some similar work by different researchers at MSR.


Anonymous said...

I believe your interpretation of NDCG is not quite right. The "discount" in NDCG penalizes the score for relevant documents further down the list. In this paper, this discount is: c=1/log_2(1+i), where i is the rank of the document. So, the document at the top of the ranked list (i=1) would receive a discount of 1.0, at rank i=2 a discount <1.0, etc.

Greg Linden said...

Hi, Jon. Sorry, I was not clear in my post. I meant "underweight the value of the first search result and overweight the value of later results" only in comparison to other possible definitions of DCG.

I'll make a correction. Thanks for pointing out the issue.