Tuesday, July 27, 2004

MSN Newsbot review

Microsoft has launched their personalized news product in the US. How does it compare to Findory News?

Using MSN Newsbot provides enough data for some educated speculation about the system. MSN Newsbot keeps track of the articles you read and uses them immediately to change the small box of "Personalized News" headlines in the upper right of the front page. Aside from that small box, the rest of the page appears to be unpersonalized. Some articles I read did not appear to be recorded. For example, I searched for "newsbot" and clicked on three articles, only to find that none of them were recorded in my history or used for personalization. I also managed to lose my entire history once for no reason I could discover.

It's difficult to determine the underlying algorithms from inspecting the behavior of the site, but there appears to be strong evidence that it is mostly based on subject categories. Reading an article on business in Korea caused top headlines from Asia to appear. Surprisingly, even after deleting the article from my history, Asia top stories continued to be selected. Clicking on the "Why?" link gave the explanation that the personalized stories were picked because they are from the category "Asia-Pacific Latest".

Similarly, reading an article on Google's IPO caused the personalization to show me more top headlines from "Business:General" and "Business:Financial". Reading a science article on the effects of caffeine just produced more general science articles.

Using subject-based profiles is a well-known method of doing personalization, but it also has well-known problems. In particular, the personalization is not specific -- for example, showing just general business headlines -- and tends to pigeonhole people -- showing a reader only business stories and not picking up other cross-category interests. While it does have the advantage of being simple, experience in my past life shows that the predictive accuracy of this method is an order of magnitude lower than more fine-grained personalization techniques.
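
To make the idea concrete, here is a minimal sketch of subject-based personalization of the kind described above. It is not MSN Newsbot's actual implementation, which I can only guess at from the outside. The function names, the story titles, and the scoring rule (count how often each category was read, then show top stories from those categories) are all assumptions for illustration; only the category names come from what the site displayed.

```python
from collections import Counter

# Hypothetical top stories as (title, category) pairs. The category names echo
# the ones quoted in the post; the titles are made up for illustration.
TOP_STORIES = [
    ("Seoul markets rally", "Asia-Pacific Latest"),
    ("Tokyo exports climb", "Asia-Pacific Latest"),
    ("Search engine IPO priced", "Business:Financial"),
    ("Retail sales flat", "Business:General"),
    ("New comet spotted", "Science"),
]


def build_profile(read_categories):
    """Count how many articles the reader has viewed in each category."""
    return Counter(read_categories)


def recommend(profile, stories, limit=3):
    """Return top stories from categories the reader has already read.

    Stories are ordered by how often their category appears in the profile;
    ties keep their original order because Python's sort is stable.
    """
    scored = [
        (profile[category], title, category)
        for title, category in stories
        if profile[category] > 0
    ]
    scored.sort(key=lambda item: -item[0])
    return [(title, category) for _, title, category in scored[:limit]]


# Reading one article filed under "Asia-Pacific Latest" yields only more
# Asia-Pacific top stories, regardless of what the article was about.
profile = build_profile(["Asia-Pacific Latest"])
picks = recommend(profile, TOP_STORIES)
```

Note that the profile only ever sees the category label, so an article on business in Korea and an article on anything else filed under the same category are indistinguishable. That is the lack of specificity and the pigeonholing in a nutshell.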

If it is true that MSN Newsbot is merely using subject classifications for its personalization, Findory's personalization technology is considerably more advanced. Findory's algorithms combine statistical analysis of the article text and of users who viewed the articles with information about articles you previously viewed. Our personalized news is fine-grained. Our personalization is targeted closely to your interests while maintaining enough serendipity to enhance discovery. We help you read the news more efficiently and find articles you otherwise would miss. There is still nothing else like it out there.

3 comments:

w.whitmont said...

This comment has been removed by a blog administrator.

Rational Islamophobe said...

Hey Greg,

Findory looks damn cool!

I have a suggestion for you: Why don't you make one or more "skins" for Findory? Specifically, make a skin that mimics google to every extent (maybe even to the extent of search, although to make money I imagine you will need to use your own search, not google's). You don't have to do away with your main page; suck them in first with the news offering.

e.g. gskin.findory.com

There are people who would switch right away if they knew about it.

Oh, one more thing: who is the moron responsible at amazon for the crap "sort by rating" algorithm they have there??? I for one do not think one 5 star rating has any significance whatsoever, yet they will put that first over a 4.5 star work with 500 votes!

They need to copy imdb's rating algorithm (I have a post about this on my page). Here it is:

weighted rank (WR) = (v ÷ (v+m)) × R + (m ÷ (v+m)) × C

where:
R = average for the movie (mean) = (Rating)
v = number of votes for the movie = (votes)
m = minimum votes required to be listed in the Top 250 (currently 1250)
C = the mean vote across the whole report (currently 6.8)
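
The formula above can be sketched in a few lines of Python. The function mirrors the symbols R, v, m and C as defined; the values of m and C used in the comparison (10 and 3.5) are assumptions picked for a five-star scale, not figures from Amazon or imdb.

```python
def weighted_rank(R, v, m, C):
    """Weighted rank as given above: (v / (v + m)) * R + (m / (v + m)) * C.

    R: average rating of the item
    v: number of votes for the item
    m: minimum votes required to be listed (a tuning constant)
    C: mean vote across all items
    """
    if v + m == 0:
        raise ValueError("v + m must be greater than zero")
    return (v / (v + m)) * R + (m / (v + m)) * C


# Assumed values for a five-star scale, for illustration only.
M_ASSUMED = 10
C_ASSUMED = 3.5

one_vote = weighted_rank(R=5.0, v=1, m=M_ASSUMED, C=C_ASSUMED)
many_votes = weighted_rank(R=4.5, v=500, m=M_ASSUMED, C=C_ASSUMED)
```

With these assumed constants, the item with a single 5 star rating is pulled most of the way toward the overall mean, while the 4.5 star item with 500 votes stays close to its own average and sorts first.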

My blog has more on this.

Rational Islamophobe said...

http://www.dailypundit.com/newarchives/001390.php#001390