Thursday, March 31, 2005

A relevance rank for news and weblogs

If you use a feed reader like Bloglines, there must have been at least a few times you've looked at the overwhelming pile of unread articles with a sigh. So much to read.

All feed readers organize the articles in the same way. They group the articles by feed and sort the articles by date. So, you go through, click on each feed, skim the articles, and slog on through.

"Wouldn't it be nice," you've probably thought, "if these articles were sorted by relevance? Maybe the most important articles at the top and least important at the bottom? Then I could just read the articles from top to bottom, stopping when I get bored or run out of time."

That would be nice. But, what does it mean? What's the most relevant news?

Let's explore it. What if all the articles from the news and weblog feeds were sorted by how many people read them? The more people have read an article, the higher in your list of unread articles.

Hmm... That might help, but it'd be ordered by popularity, not relevance. Yahoo News has an example of ordering news by popularity. You can see that it tends toward the sensationalistic and tabloid. It pulls you toward the mainstream and away from the long tail. That's the wrong direction, folks. You want interesting and useful, not bland and mediocre.

Okay, if it's not most popular, what is the most relevant news?

Maybe the problem is that we're defining popularity too broadly. Does it matter to me if a teenage surfer chick thought an article with rumors of Britney Spears' pregnancy was really awesome? Not in the slightest. Does it matter if one of my computer geek friends really enjoyed an article on the upcoming MySQL 5 release? Yes, that does matter.

So, perhaps relevance is what people like me like. Okay, so I'll just list hundreds of people I know who are like me, get them all to use the same feed reader, and then... oh, shucks, that's never going to happen, is it?

Fortunately, it doesn't have to. We can find people like me, people I don't even know, automatically and anonymously using some clever algorithms. Put that computer to work, I say.

Great! Now we know how to sort news by relevance. We take all the news and sort by what people like me like. So, why isn't anyone doing this? Well, someone is doing it -- and doing it quite well, I might add -- but why isn't anyone else?
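To make "sort by what people like me like" a bit more concrete, here is a minimal sketch in Python. It is only one way to read the idea, not a description of how any real service does it. The assumptions are mine: each reader is represented by the set of articles they have read, "like me" is measured with Jaccard similarity between those sets, and an unread article scores the sum of the similarities of the readers who read it. The names `jaccard`, `rank_unread`, and the sample data are all hypothetical.

```python
from collections import defaultdict


def jaccard(a, b):
    """Overlap between two sets of read articles, from 0.0 to 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_unread(me, reads, candidates):
    """Order candidate articles by how much readers similar to `me` read them.

    reads: dict mapping a reader id to the set of article ids they have read.
    candidates: set of article ids that `me` has not read yet.
    """
    mine = reads.get(me, set())
    scores = defaultdict(float)
    for other, theirs in reads.items():
        if other == me:
            continue
        sim = jaccard(mine, theirs)
        if sim == 0.0:
            continue  # readers with nothing in common with me get no vote
        for article in theirs - mine:
            if article in candidates:
                scores[article] += sim
    # Highest score first; ties broken by article id so the order is stable.
    return sorted(candidates, key=lambda a: (-scores[a], a))


# Hypothetical data: two readers liked the celebrity story, one liked Postgres.
reads = {
    "me": {"mysql5", "linux"},
    "geek_friend": {"mysql5", "linux", "postgres"},
    "surfer_1": {"britney", "celebrity"},
    "surfer_2": {"britney"},
}
ranking = rank_unread("me", reads, {"postgres", "britney"})
```

In this toy data a plain popularity sort would put the celebrity story first, since two people read it. The similarity-weighted version puts the Postgres article first, because the only reader who overlaps with me is the one who read it.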

Well, it's hard. Really hard. Maybe I made it sound easy, but the devil is in the details. For example, the most interesting articles for a subgroup aren't actually the same as the most popular articles in that subgroup; they're a little different, and that's just one of dozens of spots where you can trip up and hork the quality of the relevance rank. These "clever algorithms" I mentioned can be really expensive; doing this at scale for millions of readers requires a lot of careful thought. News is perishable -- old news is no news -- so you'd better find a good solution to the cold start problem. And the list goes on. It's not easy.
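Two of those details can be sketched in a few lines, with heavy caveats. The post does not say how "interesting for a subgroup" differs from "popular", so the following is my assumption: compare how often the subgroup reads an article with how often everyone reads it (a lift ratio), and halve the score every so many hours because news goes stale. The function name `subgroup_score`, the smoothing `prior`, and the 24-hour half-life are all hypothetical choices for illustration.

```python
def subgroup_score(group_reads, group_size, all_reads, all_size,
                   age_hours, half_life_hours=24.0, prior=1.0):
    """Score an article for a subgroup: smoothed lift times a freshness decay."""
    # Smoothed read rates; the prior stops tiny counts from producing extremes.
    group_rate = (group_reads + prior) / (group_size + 2 * prior)
    overall_rate = (all_reads + prior) / (all_size + 2 * prior)
    # Lift above 1.0 means the subgroup reads this more than readers in general.
    lift = group_rate / overall_rate
    # Exponential decay: the score halves every `half_life_hours`.
    decay = 0.5 ** (age_hours / half_life_hours)
    return lift * decay


# Hypothetical counts: a niche article most of the subgroup read, and a
# blockbuster that nearly everyone read, subgroup included.
niche = subgroup_score(8, 10, 50, 10000, age_hours=0)
blockbuster = subgroup_score(9, 10, 9000, 10000, age_hours=0)
day_old_niche = subgroup_score(8, 10, 50, 10000, age_hours=24)
```

Under these assumptions the blockbuster is slightly more popular inside the subgroup, yet the niche article scores far higher, because the subgroup reads it much more than everyone else does. A day later the niche score has dropped by half. None of this addresses the cost of computing it for millions of readers.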

But it's got to be done. It takes too long and too much effort to use the current generation of feed readers. To break into the mainstream, next generation feed readers will have to sort articles by relevance.

6 comments:

Doug W said...

Ok, this hit close to home. I'm definitely tired of wading through the sheer volume of information. I immediately went over and signed up for some personalized feeds.. ya got me.

Marshall Kirkpatrick said...

I find the recommendations made by Furl.net regarding other users' feeds I should read to be the best way to fresh, tight news. I also organize my RSS aggregator by putting most feeds in bulk folders (read if I have time) and the feeds that are most important for me to know about new content in I put outside of folders separately, at times renaming them to begin with numbers in my preferred order. Eg. "1. Cutting Through" "2.Al Jazeera" etc.

That aggregator fiddling is pretty manual, and not based on personalization, but I think Furl's recommendations are really good.

Excited to check out your blog, btw. We've got some similar interests ourselves. Best of luck.

John said...

The lack of relevancy ranking in RSS feeds is even worse than you suggest.

A proposed RSS extension, attention.xml, will provide some degree of relevancy based on items viewed and/or items recently updated.
http://developers.technorati.com/wiki/attentionxml

Andy Harbick said...

I've got bloglines pointed at my personal feed of technology blogs. It's great. I often discover things that I would've seen eventually first on my findory feed.

Alex Bosworth said...

I'm not sure that other people's opinions of posts really have any bearing on whether I want to read them when it comes to RSS feeds.

Maybe I would like to be able to subscribe to the same feed in different ways, or through different filters, but many feeds I just want to read as they are.

Anonymous said...

Hi Greg.

You might find outbrain.com interesting. They have a Firefox extension that lets readers rate blog posts on the fly and then they have some clever algorithms to decide what's interesting, current and highly rated.

I can imagine that this, combined with clever tag or index based filtering could be the thing you're talking about.