Thursday, September 23, 2004

Humans vs. Robots == Yahoo vs. Google

JD Lasica writes about potential biases in Google News' automated selection of articles for their front page, apparently mostly due to small news sites trying to game the system.

But a fascinating sub-theme in the article is the differences between how Google News and Yahoo News select articles for their front pages. For Google News:
    Google News uses a mix of techniques to ensure that users are presented a diverse range of perspectives. The ranking and prominence of stories are based on several factors: How many publications are writing about a topic; how recent the articles are; the size of the story, with substantive pieces ranking higher than short items; and the frequency of the search term within the article. The computer algorithms, [Krishna Bharat, chief scientist for Google News] said, "are trying to understand how hot and how big the story is."

    Every 15 minutes a new edition of Google News is generated and the ranking changes. The formula rearranges the headline blurbs in each story cluster based on the freshness of each article and the importance of the source.
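To make the factors in that description concrete, here is a rough sketch of how a story cluster might be scored. This is not Google's formula, which is not public. The `Article` fields, the `cluster_score` function, the three-hour half-life, and the way the factors are combined are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from math import log


@dataclass
class Article:
    source: str          # publication name
    age_minutes: float   # time since publication
    word_count: int      # proxy for how substantive the piece is
    term_hits: int       # occurrences of the topic term in the article


def cluster_score(articles, half_life_minutes=180.0):
    """Score one story cluster. All weights here are hypothetical."""
    if not articles:
        return 0.0
    # How many distinct publications are writing about the topic.
    coverage = len({a.source for a in articles})
    # Freshness of the newest article, halving every half_life_minutes.
    freshness = max(0.5 ** (a.age_minutes / half_life_minutes) for a in articles)
    # Substantive pieces count for more than short items (log-damped).
    substance = max(log(1 + a.word_count) for a in articles)
    # Frequency of the term within the article (log-damped).
    frequency = max(log(1 + a.term_hits) for a in articles)
    return coverage * freshness * (substance + frequency)
```

Regenerating the page every 15 minutes would then amount to recomputing this score for every cluster and re-sorting.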
Yahoo News uses humans:
    A small editorial staff programs the Yahoo News front page as well as plucking out hidden gems that appear on other sites ... the factors include the source, the freshness of the story, and a method of determining relevance.

    "We use actual humans," [Jeff Birkeland, product manager for Yahoo News] added. "News is far too human of an endeavor to rely 100 percent on automation."
Yahoo's human-based approach is more typical. Most news sites seem wary of automation. CNet went a step further, mocking Google's approach with the tagline "The Web filtered by humans, not bots" on their Extra product.

But some things can only be done with automation. With over 4,500 sources, Google News has such a deep database of articles that it would be impossible for human editors to review more than a small subset. Using human editors means many of these articles would never even have the potential to be featured; the depth of knowledge is wasted.

And there's more. What's the most relevant news of the day? It varies from person to person, depending on interests, career, and location. Until you personalize the news, you're still wasting the depth of knowledge in your database. So, you personalize. Now, instead of one front page, you're building millions of front pages, each with a different view into your hundreds of thousands of news articles. It's a task that's simply impossible to do with human editors.
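A minimal sketch of what building one of those front pages could look like, assuming each article already carries topic tags and a base score, and each reader has a profile of interest weights. The function name `personalized_front_page`, the tuple layout, and the scoring rule are hypothetical, not a description of any real news site.

```python
def personalized_front_page(articles, interests, size=3):
    """Pick the top headlines for one reader.

    articles:  list of (headline, set_of_topic_tags, base_score) tuples
    interests: dict mapping topic tag -> weight for this reader (assumed profile)
    """
    def score(item):
        _, tags, base = item
        # Sum the reader's interest in each of the article's topics.
        affinity = sum(interests.get(tag, 0.0) for tag in tags)
        # Boost the base score by the reader's affinity.
        return base * (1.0 + affinity)

    ranked = sorted(articles, key=score, reverse=True)
    return [headline for headline, _, _ in ranked[:size]]


articles = [
    ("Election results", {"politics"}, 1.0),
    ("Chip launch", {"tech"}, 0.8),
    ("Local flood", {"seattle", "weather"}, 0.5),
]
print(personalized_front_page(articles, {"tech": 2.0}, size=1))
```

The same pool of articles yields a different page for every profile, which is cheap for a program to repeat millions of times and out of reach for an editorial staff.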

In the end, it will be robots. It is inevitable.

Update: Yahoo News contacted me to clarify Jeff Birkeland's comment, saying that some of Yahoo's front page is programmed by hand, some is automated, and story rankings are mostly automated. This doesn't impact the thrust of my argument that total automation of the entire front page will be necessary to expose the full depth of news available, especially as personalized news becomes mainstream.

1 comment:

Jeff Boulter said...

Yahoo! News is not only human-edited. It uses a combination of humans and robots. As a matter of fact, some parts of the site are human-edited during some parts of the day and automated at other times. Saying that Yahoo! News is wary of automation is completely inaccurate.

There are thousands of new articles on Yahoo! News every day, and obviously no one has time to go over all of them. 100% of the articles on the site are published without human intervention. Some of the articles featured are picked by Yahoo's editorial staff, but others are picked by editors at feed providers' desks.

Still other parts use 'bots' to filter, such as some RSS feeds and slideshows.

Until we have a computer that can pass the equivalent of a Turing test for news, some combination of humans and robots will be necessary.