Thursday, September 23, 2004

Humans vs. Robots == Yahoo vs. Google

JD Lasica writes about potential biases in Google News' automated selection of articles for its front page, apparently due mostly to small news sites trying to game the system.

But a fascinating sub-theme in the article is the difference between how Google News and Yahoo News select articles for their front pages. For Google News:
    Google News uses a mix of techniques to ensure that users are presented a diverse range of perspectives. The ranking and prominence of stories are based on several factors: How many publications are writing about a topic; how recent the articles are; the size of the story, with substantive pieces ranking higher than short items; and the frequency of the search term within the article. The computer algorithms, [Krishna Bharat, chief scientist for Google News] said, "are trying to understand how hot and how big the story is."

    Every 15 minutes a new edition of Google News is generated and the ranking changes. The formula rearranges the headline blurbs in each story cluster based on the freshness of each article and the importance of the source.
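
To make those factors concrete, here is a rough sketch of how a story cluster might be scored. This is not Google's formula, which the article does not disclose. The `Article` fields, the six-hour half-life, the 1,000-word cap, and the way the factors are multiplied together are all my own assumptions, and the search-term frequency factor is left out because it only applies to searches.

```python
import math
from dataclasses import dataclass


@dataclass
class Article:
    source: str
    age_hours: float
    word_count: int
    source_weight: float = 1.0  # hypothetical "importance of the source"


def cluster_score(articles, half_life_hours=6.0):
    """Score a story cluster: more publications, fresher and longer articles rank higher."""
    if not articles:
        return 0.0
    # How many distinct publications are writing about the topic.
    publications = len({a.source for a in articles})
    # Freshness of the newest article, decaying by half every half_life_hours.
    freshness = max(0.5 ** (a.age_hours / half_life_hours) for a in articles)
    # Average "substance": word count capped at an assumed 1,000 words.
    substance = sum(min(a.word_count, 1000) / 1000 for a in articles) / len(articles)
    return math.log1p(publications) * (0.5 + freshness) * (0.5 + substance)


def rank_clusters(clusters):
    """Return cluster names ordered from highest to lowest score."""
    return sorted(clusters, key=lambda name: cluster_score(clusters[name]), reverse=True)


def order_headlines(articles, half_life_hours=6.0):
    """Order the headlines inside one cluster by freshness times source weight."""
    return sorted(
        articles,
        key=lambda a: a.source_weight * 0.5 ** (a.age_hours / half_life_hours),
        reverse=True,
    )


clusters = {
    "big story": [Article("a", 1, 800), Article("b", 2, 900), Article("c", 1, 700)],
    "small item": [Article("d", 20, 100)],
}
front_page = rank_clusters(clusters)
```

Rerunning `rank_clusters` on a timer would give the "new edition every 15 minutes" behavior, since the freshness term changes as articles age.
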
Yahoo News uses humans:
    A small editorial staff programs the Yahoo News front page as well as plucking out hidden gems that appear on other sites ... the factors include the source, the freshness of the story, and a method of determining relevance.

    "We use actual humans," [Jeff Birkeland, product manager for Yahoo News] added. "News is far too human of an endeavor to rely 100 percent on automation."
Yahoo's human-based approach is more typical. Most news sites seem wary of automation. CNet went a step further, mocking Google's approach with the tagline "The Web filtered by humans, not bots" on its Extra product.

But some things can only be done with automation. With over 4,500 sources, Google News has such a deep database of articles that it would be impossible for human editors to review more than a small fraction of them. Using human editors means most of these articles would never even have the potential to be featured; the depth of knowledge is wasted.

And there's more. What's the most relevant news of the day? It varies from person to person, depending on interests, career, and location. Until you personalize the news, you're still wasting the depth of knowledge in your database. So, you personalize. Now, instead of one front page, you're building millions of front pages, each with a different view into your hundreds of thousands of news articles. It's a task that's simply impossible to do with human editors.
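
A minimal sketch of what building one of those front pages might look like, assuming each article already carries topic labels and a base ranking score. The function names, the topic labels, and the `base * (1 + interest)` boost are illustrative assumptions of mine, not anything Google or Yahoo has described.

```python
from collections import Counter


def build_profile(read_topics):
    """Turn a reader's history (a list of topic labels) into normalized interest weights."""
    counts = Counter(read_topics)
    total = sum(counts.values())
    return {topic: n / total for topic, n in counts.items()} if total else {}


def personal_front_page(articles, profile, size=3):
    """Pick the top headlines for one reader.

    articles: list of (headline, topics, base_score) tuples.
    profile:  topic -> weight mapping from build_profile.
    """
    def score(article):
        _, topics, base = article
        # Boost the shared base score by how much the reader cares about the topics.
        interest = sum(profile.get(t, 0.0) for t in topics)
        return base * (1.0 + interest)

    return [a[0] for a in sorted(articles, key=score, reverse=True)[:size]]


articles = [
    ("Election update", {"politics"}, 1.0),
    ("Chip launch", {"tech"}, 0.8),
    ("Cup final", {"sports"}, 0.9),
]
profile = build_profile(["tech", "tech", "sports"])
page = personal_front_page(articles, profile, size=2)
```

The point is that the per-reader step is a cheap function call. Running it for millions of readers is routine for a machine and out of reach for an editorial staff.
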

In the end, it will be robots. It is inevitable.

Update: Yahoo News contacted me to clarify Jeff Birkeland's comment, saying that some of Yahoo's front page is programmed by hand, some is automated, and story rankings are mostly automated. This doesn't impact the thrust of my argument that total automation of the entire front page will be necessary to expose the full depth of news available, especially as personalized news becomes mainstream.
