Comments on Geeking with Greg: The perils of tweaking Google by hand

The real difficulty in determining search quality ...

2007-06-04T16:59:00.000-07:00

The real difficulty in determining search quality is that it's not well defined and not easily evaluated from logs. You can notice egregious failures (user reformulates query too many times), but it's hard to detect "minor" failures or if the user has settled on a "good enough but not great" result.
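To make the "egregious failure" signal concrete, here is a minimal sketch of flagging sessions where the user reformulates too many times. It assumes a hypothetical log of (session_id, query) pairs in time order, and the threshold of 3 is an arbitrary choice for illustration, not anything a real search engine is known to use.

```python
from collections import defaultdict

# Hypothetical log format: (session_id, query) pairs in time order.
# The default threshold of 3 reformulations is an arbitrary assumption.
def flag_failed_sessions(log, max_reformulations=3):
    """Return ids of sessions with more than max_reformulations reformulations."""
    queries = defaultdict(list)
    for session_id, query in log:
        # Normalize case and whitespace so trivial retyping is not counted.
        normalized = " ".join(query.lower().split())
        # Record a query only when it differs from the previous one in the session.
        if not queries[session_id] or queries[session_id][-1] != normalized:
            queries[session_id].append(normalized)
    # A session with n consecutive-distinct queries contains n - 1 reformulations.
    return sorted(s for s, qs in queries.items() if len(qs) - 1 > max_reformulations)

log = [
    ("a", "jaguar"), ("a", "jaguar car"), ("a", "jaguar xk price"),
    ("a", "jaguar xk 2007 price"), ("a", "jaguar xk 2007 msrp"),
    ("b", "weather seattle"), ("b", "Weather  Seattle"),
]
print(flag_failed_sessions(log))
```

A counter like this only catches the obvious cases. It says nothing about a user who stopped after one query because the result was "good enough but not great".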

Even if you can learn from failure, you'd still not want to rely on it too much for usability reasons. The feedback you get is necessarily noisy and you need a sufficient number of failures to generalize them to a pattern. Ideally your judgement should notice early warning signs *before* failures have occurred to users. (However, having said that, I do agree with jeremy/Donna Harman that automatic failure analysis is a big opportunity because so little of it is done right now, and it's more scalable.)

As for manual tuning, I'm not surprised at all. In fact, I believe Google has an army of low-paid drones who use different versions of their search engine and report on them to the engineering "priesthood", which then decides which hacks to put into production.

You know how a lot of infomercials have a "doctor" testifying to the effectiveness of the product and some computer graphic that shows a "scientific simulation" of how the product works? Same thing here, and the NYT took the bait.

Greg, two reactions: In each query, a few of the re...

2007-06-04T09:22:00.000-07:00

Greg, two reactions:

"In each query, a few of the results would be different each time. Each time, the search engine is making a prediction on the impact (usually an anticipated slight negative impact) of making this change. Wrong predictions are surprises, opportunities to learn, and are grouped with other wrong predictions until the engine can generalize and attempt a broader tweak to the algorithm."

First, I am impressed by the timeliness of your suggestions. Donna Harman (one of the principal motivating forces behind the original TREC in the early 90s, which arguably enabled the biggest improvements in IR systems, pre-Google) gave a talk at RIAO last week. She addressed exactly this issue, and gave exactly this solution. She basically took everyone to task for not doing more of this failure analysis: noticing and extracting the queries that fail, and then generalizing solutions. It is not just Google that is failing to do more of this; the rest of academia is fairly poor in this respect, too.

"Frankly, I thought Google was beyond this. Rather than piling hack upon hack, I thought Google's relevance rank was a giant, self-optimizing system, constantly learning and testing to determine automatically what works best."

Second reaction: I am not as surprised as you are. In talking with folks from Google over the years, I have fairly consistently received the message that Google is more interested in engineering than science. In fact, at SIGIR 2004, the very same Amit above said that if you are interested in getting hired by Google, you should become a "hacker" (his word, in quotes). Of course, he meant this in typical geek fashion: hacker as a tinkerer and engineer, rather than hacker as cracker. But with the emphasis on "hacking" rather than methodical science, it is not so surprising that improvements to the engine are "hacks" and "tweaks" rather than principled, scientifically generalized solutions. This is not meant as a criticism, as I certainly am not always as rigorous as I should be in my own work. I am only making the observation that what they report is not so out of character as it seems, given their emphasis on engineering over science.

Just to agree with the previous comment, I wouldn'...

2007-06-03T14:28:00.000-07:00

Just to agree with the previous comment, I wouldn't draw quite the same inferences as you did, Greg.

I would say that while machine learning is important at Google, it doesn't completely remove the aspect of engineers looking to improve the system. Rather than tweaks, I'd view the efforts more as tuning the existing algorithms and looking for additional algorithms that can improve search as well. In the same way that you probably wouldn't be content to leave the personalization algorithm(s) at Findory unchanged forever, folks at Google are looking for ways to improve their algorithms, even as the algorithms run and handle many aspects of ranking/personalization automatically.

.... and it is not worth getting too over-awed eit...

2007-06-03T13:55:00.000-07:00

.... and it is not worth getting too over-awed either. There is a certain Wizard of Oz quality to all this.

After all it is in their interest to project an aura of magic and wizardry, if only to persuade (prospective) competitors that the game is hopeless. A lot of their "talent" hiring smacks of that, when you think about it.

Except that I would not take some few and simple q...

2007-06-03T13:17:00.000-07:00

Except that I would not take a few simple quotes and generalize from them to understand how Google works with this stuff. Obviously, they did not reveal the real "secret sauce" to the reporters. There is much they did not say. It's possible that they have something similar to what you are describing. Manual tweaks will be necessary at times.