Comments on Geeking with Greg: Ranking using Indiana University's user traffic
Greg Linden (http://www.blogger.com/profile/09216403000599463072)
Last updated 2024-01-15T13:17:33.771-08:00, 13 comments

2008-02-22T09:30:00.000-08:00
Are there any alternative methods for ranking, instead of traditional PageRank and its variations? Maybe some alternative designs would help reduce link spam by a whole lot.
Posted by Shirish (https://www.blogger.com/profile/12091474325051553432)

2008-02-22T03:42:00.000-08:00
Very interesting paper, in particular for the massive scale of evaluation.

I saw a related paper in the past, "Can link analysis tell us about web traffic?", 14th international conference on World Wide Web.

Also a related patent, "Sampling Internet user traffic to improve search results".
Posted by Anonymous

2008-02-18T21:06:00.000-08:00
Certain ISPs strip the referer from all their outgoing traffic, not sure why.
Posted by Anonymous

2008-02-18T08:43:00.000-08:00
Hi, Matt. On the long tail and PageRank not doing well there, this paper seems to indicate that PageRank performs much better on the long tail, at least if you measure performance by the correlation between PageRank and traffic to the pages.
Figure 9 and the discussion in Section 6.1 are what I am referring to.

But, on your broader point, I agree that PageRank is more of a brand at this point than the key algorithm for Google's relevance rank. Saul Hansell at the NYT reported (http://glinden.blogspot.com/2007/06/perils-of-tweaking-google-by-hand.html) that Google has "many thousands of interlocking equations" that they have manually tested and added to their relevance rank. If PageRank still exists somewhere under all that, its influence almost certainly is limited.
Posted by Greg Linden (https://www.blogger.com/profile/09216403000599463072)

2008-02-18T08:37:00.000-08:00
Here's a relevant link re: Kaltix

http://www.news.com/2100-1024_3-5061873.html
Posted by Anonymous

2008-02-18T08:34:00.000-08:00
matthewhurst: But didn't Google buy the Stanford startup company Kaltix in 2003? From the papers I've read, I think they were working on a personalized version of PageRank. That is, they were moving away from generalized web search into personalized web search. (This is what rob diana was alluding to, I believe.) So now they've had 5 years to make PageRank personal, right?
Posted by Anonymous

2008-02-17T21:03:00.000-08:00
Greg, It is often overlooked that PageRank only really claims to be good for very general search queries - certainly not for the long tail (I believe the original paper states this explicitly).
I tend to think of PageRank now more as a brand than an elegant algorithmic innovation.
Posted by Anonymous

2008-02-17T04:48:00.000-08:00
Thanks for the nice post; the comments all make good points too. Just a note in response to Mike Dierken and dp-maxime: IU (where the traffic data is collected) does not have a firewall or proxy server in place. Also, although, as you (and a questioner at the conference) point out, users can turn the referer off, we estimate this to be a small fraction of the roughly 100,000 users in our sample. Cheers!
Posted by Anonymous

2008-02-16T06:32:00.000-08:00
Very interesting analysis and summarization of the paper. The problem that I see with a lot of the PageRank vs. traffic ranking vs. any other metric comparisons is that they assume all users are the same. This really needs to be broken down by type of user to determine how each segment behaves. For example, a highly technical crowd tends to use Firefox, uses RSS or some other feed as a referral source, and types URLs directly. Something like that could be very useful.
Posted by Anonymous

2008-02-16T04:30:00.000-08:00
This was one of my favorite papers too. A very impressive experiment at a HUGE scale.

Although their results are very interesting, I'm not sure they're comparing apples to apples. PageRank is not intended to be a predictor of traffic or popularity. It's intended to measure authority, which is not the same thing. I'm not sure what this implies as far as the utility of PageRank in web ranking.

By the way, it was good meeting you at WSDM. Best of luck at LiveLabs.
Posted by Anonymous

2008-02-16T04:18:00.000-08:00
More likely, an empty Referrer means that it was cut off by a firewall or by a proxy server, or simply was turned off by the user (some browsers allow users to do so).
Posted by Anonymous

2008-02-16T01:25:00.000-08:00
Interesting post. And what about searching problems in microformats like Twitter? What are the main details?
Posted by Scabr (https://www.blogger.com/profile/10999467778061442007)

2008-02-15T23:09:00.000-08:00
"They noted that 54% of traffic had an empty referrer, which may suggest that browser bookmark usage [...]"

There are also firewalls and client proxies, such as Norton Internet Security, that strip the Referer request header, so that may contribute to the statistic.
Posted by Mike Dierken (https://www.blogger.com/profile/02406913273929110651)