tag:blogger.com,1999:blog-6569681.post117113833947343601..comments2024-03-24T10:38:16.997-07:00Comments on Geeking with Greg: Better understanding through big dataGreg Lindenhttp://www.blogger.com/profile/09216403000599463072noreply@blogger.comBlogger9125tag:blogger.com,1999:blog-6569681.post-88032747935579335952007-02-18T17:08:00.000-08:002007-02-18T17:08:00.000-08:00No, it is not inconsistent at all, to be both skep...No, it is not inconsistent at all, to be both skeptical and enthusiastic at the same time. Hype totally turns me off, too. I just didn't hear any enthusiasm at all in Danny's comments. But maybe that's just the way I read it.<BR/><BR/>In fact, I had another post in this thread a few days ago that did not go through, in which I confessed some hypocrisy in this whole matter. I called for more enthusiasm amidst all the skepticism for NLP, but in all my other posts on your blog I express skepticism without enthusiasm for personalization. I need to be more simultaneously enthusiastic about personalization, too :-) <BR/><BR/>I am just feeling like there is a little too much hype with personalization right now, too. I have seen it tried (in the form of things like user modeling, etc.) over and over, in academic papers, throughout the past 10-15 years to little or no avail. And I have yet to see anything convincing from Google, other than artificial examples about Miami dolphins versus oceanic dolphins, which is really not a personalization issue.<BR/><BR/>So I think we are each just reacting to hype sore spots, in one form or another.Anonymousnoreply@blogger.comtag:blogger.com,1999:blog-6569681.post-76697403555703135482007-02-14T18:42:00.000-08:002007-02-14T18:42:00.000-08:00Hey, Jeremy. I read Danny's comments a little dif...Hey, Jeremy. 
I read Danny's comments a little differently, I think.<BR/><BR/>I think he was reacting to the hype, saying that progress on NLP is likely to be slow and that PR claims of miraculous progress should be met with skepticism.<BR/><BR/>On Danny's concern that people will only enter a couple words, I think he is talking about the short term and the difficulty of changing behavior. That says more about Powerset's likelihood of immediately "changing the very nature of search" than about the long-term prospects for major innovation in search because of NLP advances.<BR/> <BR/>Personally, I see NLP as one of the most promising paths for making substantial progress in search. Yet, as Danny pointed out, the field is littered with the remains of those who made triumphant claims about NLP, and we may want to be careful of believing those who have not yet proven their promises.<BR/><BR/>I don't think it is inconsistent to be skeptical of the PR while enthusiastic about the prospects, is it?Greg Lindenhttps://www.blogger.com/profile/09216403000599463072noreply@blogger.comtag:blogger.com,1999:blog-6569681.post-35603200534171117402007-02-14T15:00:00.000-08:002007-02-14T15:00:00.000-08:00Mmm.. yes. I am totally sympathetic to hype-sensi...Mmm.. yes. I am totally sympathetic to hype-sensitivity. I do not like it, either, and am not trying to defend it.<BR/><BR/>But there are many others out there, not yourself but others, that are reacting strictly against the Powerset message, rather than against the hype. Danny's comments and critiques, for example, are against the very notion of NLP search, saying that users won't type more than 2-3 words. Where is NLP needed in the query "beach", he asks. <BR/><BR/>With no criticism of Danny, personally, it is that sort of thinking that I wish we could get away from. It just feels like, "why build an airplane? No person has ever flown before, so what would we need an airplane for?" 
Know what I mean?Anonymousnoreply@blogger.comtag:blogger.com,1999:blog-6569681.post-87067300997638849942007-02-14T10:04:00.000-08:002007-02-14T10:04:00.000-08:00Hi, Jeremy. I agree that we should actively encou...Hi, Jeremy. I agree that we should actively encourage and cheer search companies and research projects that challenge the status quo.<BR/><BR/>My problem is more with the <A HREF="http://news.google.com/news?q=powerset" REL="nofollow">absurd press hype</A> that Powerset has encouraged. The product is <A HREF="http://en.wikipedia.org/wiki/Vaporware" REL="nofollow">vaporware</A>, so pronouncements of "changing the very nature of search" or "unprecedented consumer search capabilities" seem questionable. The entire thing feels similar to me to when Riya overpromised and underdelivered on their facial recognition technology.<BR/><BR/>I share your excitement and enthusiasm over the promise of NLP in search. I agree that it would be fantastic if Powerset can deliver on the claims they have made.Greg Lindenhttps://www.blogger.com/profile/09216403000599463072noreply@blogger.comtag:blogger.com,1999:blog-6569681.post-3856270059169149692007-02-14T08:37:00.000-08:002007-02-14T08:37:00.000-08:00I do not see why the two approaches are so incompa...I do not see why the two approaches are so incompatible. Why not apply NLP to the dataset, first, and then compute and utilize statistics of the recognized grammatical forms? Think of NLP as just "smart" pre-processing. Smart pre-processing + big data has got to work better than big data alone. Smart pre-processing makes big data more robust.<BR/><BR/>Despite the current media hype around Powerset, I am very hesitant to dismiss them immediately and outright, as so much of the Google-fanbased blogosphere has done. Powerset may never "kill" Google, but they do have the potential to shake up some of our collective, previously held biases.<BR/><BR/>It may be time for a little history lesson. 
We all tend to think that Google invented hyperlink analysis for the purpose of improving ad hoc retrieval results. But remember that Eugene Garfield first used this concept in the 1970s. He treated citations in scientific papers as "hyperlinks". Papers that received a lot of incoming citation "links" ("EugeneRank" or "Garfield-Juice") were ranked higher than other papers with equal keyword matches, but fewer incoming citations.<BR/><BR/>However, this work was not widely adopted or furthered until 20 years later. The community as a whole did not see the value in this work until Google proved it on a mass scale. And I am sure that, in that 20-year time, there were skeptics like Danny Sullivan and Paul Kedrosky saying things like "for twenty years now we know about citation link analysis, and I have yet to see it be used in any mass, proven manner" and "citation link analysis won't work; people want information, not popularity contests". <BR/><BR/>So when I see Danny dismissing natural language search because people only type in queries such as "beach" and "heide klum", or because he has been listening to the hype for 10 years, I become skeptical of the skeptic. Link analysis was pooh-poohed (or at least ignored) for 20 years before it launched a $150 billion company. <BR/><BR/>Powerset still might not succeed. But it is a sad day when our first reaction is to dismiss the attempt, or to say "it has been tried before, therefore it will never work", or to say "people will never search for information in any other manner than typing in 1-3 keywords, so we should never try and offer anything different".<BR/><BR/>Some folks are taking a more tempered approach, and not saying that they are against Powerset, but instead adopting a "wait and see" attitude. I guess that is a little better, but personally I think our attitude should be more generous than "wait and see". 
I think we should be actively encouraging and cheering any and all search companies, Google included when it happens, that are trying to challenge the status quo. We are at such an early stage in search, even though it has been around for 40 years offline and 10 years online, that we should be going out of our way to actively encourage this sort of work.<BR/><BR/>The blogosphere should be abuzz with ideas and chatter about how we can take concepts and lessons learned from NLP and actually create a better search engine, instead of dismissing or "waiting and seeing" anyone that tries.Anonymousnoreply@blogger.comtag:blogger.com,1999:blog-6569681.post-12571840344713862542007-02-13T00:12:00.000-08:002007-02-13T00:12:00.000-08:00Hi, Pranav. Thanks, that is flattering, but I am ...Hi, Pranav. Thanks, that is flattering, but I am not sure I know enough about NLP to be confident about my ability to do that. Moreover, many of these projects -- Powerset and Cyc in particular -- are not generally accessible to the public, so they are hard to evaluate. 
Sorry that I cannot be more helpful on this.Greg Lindenhttps://www.blogger.com/profile/09216403000599463072noreply@blogger.comtag:blogger.com,1999:blog-6569681.post-1171329522716821852007-02-12T17:18:00.000-08:002007-02-12T17:18:00.000-08:00Greg,It'd be great if you could write your thought...Greg,<BR/><BR/>It'd be great if you could write your thoughts describing, comparing and contrasting some of the major natural language projects out there - Oren Etzioni's KnowItAll, Doug Lenat's Cyc, Powerset, and other notable ones that I might have missed.<BR/><BR/>I know about them, I understand at a shallow level what they do, but I don't know where philosophically all these people stand (for example, Peter Norvig favors statistical learning over linguistic/semantic understanding), and how these products stack up in terms of NLP evolution.<BR/><BR/>I am sure such a post would be interesting to a number of your readers.Anonymousnoreply@blogger.comtag:blogger.com,1999:blog-6569681.post-1171158151192271022007-02-10T17:42:00.000-08:002007-02-10T17:42:00.000-08:00Just that they are getting so much press with vapo...Just that they are getting so much press with vaporware. Reminds me of Riya.<BR/><BR/>Maybe their product really will be a revolution in NLP when it launches, but it is hard to believe.<BR/><BR/><A HREF="http://paul.kedrosky.com/archives/2007/02/09/color_me_a_powe.html" REL="nofollow">Paul Kedrosky</A> and <A HREF="http://searchengineland.com/070209-093707.php" REL="nofollow">Danny Sullivan</A> also express a lot of skepticism about PowerSet's claims if you are interested in taking a look at their thoughts.Greg Lindenhttps://www.blogger.com/profile/09216403000599463072noreply@blogger.comtag:blogger.com,1999:blog-6569681.post-1171148360665792132007-02-10T14:59:00.000-08:002007-02-10T14:59:00.000-08:00What have you got against PowerSet? That a little ...What have you got against PowerSet? That a little startup is challenging the status quo?Anonymousnoreply@blogger.com