Sunday, January 24, 2010

Hybrid, not artificial, intelligence

Google VP Alfred Spector gave a talk last week at University of Washington Computer Science on "Research at Google". Archived video is available.

What was unusual about Al's talk was his focus on cooperation between computers and humans to allow both to solve harder problems than they might be able to otherwise.

Starting at 8:30 in the talk, Al describes this as a "virtuous cycle" of improvement using people's interactions with an application, allowing optimizations and features like learning to rank, personalization, and recommendations that might not be possible otherwise.

Later, around 33:20, he elaborates, saying we need "hybrid, not artificial, intelligence." Al explains, "It sure seems a lot easier ... when computers aren't trying to replace people but to help us in what we do. Seems like an easier problem .... [to] extend the capabilities of people."

Al goes on to say the most progress on very challenging problems (e.g. image recognition, voice-to-text, personalized education) will come from combining several independent, massive data sets with a feedback loop from people interacting with the system. It is an "increasingly fluid partnership between people and computation" that will help both solve problems neither could solve on their own.

This being a Google Research talk, there was much else covered, including the usual list of research papers out of Google, solicitation of students and faculty, pumping of Google as the best place to access big data and do research on big data, and a list of research challenges. The most interesting of the research challenges were robust, high-performance, transparent data migration in response to load in massive clusters; ultra-low-power computing (e.g. powered only by ambient light); personalized education where computers learn and model the needs of their students; and getting outside researchers access to the big data they need to help build hybrid, not artificial, intelligence.


Bud Gibson said...

This is all very reminiscent of Luis Von Ahn's work, which was highlighted in a Google tech talk 2.5 years ago. He's got big funding now and was the force behind the Google Image Labeler.

I think the most interesting thing about Von Ahn is how he crafted this into a dissertation. I think the fact that Von Ahn was able to state computational bounds on the problem of human computer integration and demonstrate convergence for some inference problems accounts for the big interest at Google.

jeremy said...

"with a feedback loop from people interacting with the system."

Sure. Haven't we known this for decades? Take relevance feedback for ad hoc retrieval, for example. That's almost 40 years old.

Or is the focus at Google research more on implicit feedback, rather than explicit feedback? Seems like most of what they do is the former. I think the biggest gains will come about from the latter.

Greg Linden said...

Hi, Jeremy. Absolutely, as with a lot of things Googly, the idea itself is not novel. The novelty is in the scale.

Paper after paper out of Google is taking an existing approach and applying it to data sets of extraordinary magnitude. The lesson is in how well that works, over and over again.

jeremy said...

Yes but.. again my question: how is the scale being applied?

Batch processing of a massive data set of aggregated human actions is hardly extending my capabilities when working on a problem. It might give slightly better recommendations, or slightly better precision@3. But what about precision@75? What about precision after 7 rounds of interaction, when I've gone on a pathway of exploration that is exponentially different from any pathway that anyone has ever taken before?

But where's the hybridicity? Where's the interaction? Where's the true personalization?

I want personalization that doesn't just run a statistical classifier over my actions, and figure out what batch of like-minded people to throw me in with. That kind of personalization is what I would call "wide but not deep".

I want personalization that actually reacts to what I am doing, and changes as I go, in the moment, online. Deep. Not wide.

It'll still take scale to do this for me. But it's a different kind of scale.

Nick said...

it sounds like Kasparov's "Advanced chess", where chessplayers are assisted by computers (or computers are assisted by chessplayers).

jeremy said...

@Nick Yes, exactly. With the chess example, the scale is deep rather than wide. The user is interactively choosing a path through the chess game that is (very likely) different than any other path that any other human or machine has taken before. So the machine, rather than aggregating large-scale data from millions of other chess users to make the decision, is going deep on that one user's interactions, and using mass compute power (scale) to augment just that individual's current game play.

The player then interactively instructs and corrects the backend "AI", and together they form an evolving, online hybrid.

This is not what Google/Amazon/etc. does. It's the opposite. These companies aggregate large data sets of user interactions/logs to come up with single point interactions that are more precise. People who bought this also bought that. People who clicked this also clicked that. People who used this query also used that query, or clicked that result. The more large scale data you have, the more effective you can make those point-wise interactions. Again, wide not deep.
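To make "wide, not deep" concrete, here is a minimal sketch of that kind of batch aggregation. The baskets and the function name `also_bought` are made up for illustration; this is not a claim about what Google or Amazon actually run.

```python
from collections import Counter
from itertools import combinations


def also_bought(baskets):
    """For every item, count how often each other item shares a basket with it."""
    co_counts = {}
    for basket in baskets:
        # Deduplicate within a basket so each pair counts once per user.
        for a, b in combinations(sorted(set(basket)), 2):
            co_counts.setdefault(a, Counter())[b] += 1
            co_counts.setdefault(b, Counter())[a] += 1
    return co_counts


# Hypothetical purchase logs, one basket per user.
baskets = [
    ["tent", "stove", "lantern"],
    ["tent", "stove"],
    ["tent", "lantern"],
    ["stove", "kettle"],
    ["tent", "stove", "kettle"],
]

counts = also_bought(baskets)
# "People who bought a tent also bought..."
print(counts["tent"].most_common(1))  # [('stove', 3)]
```

Everyone who looks at "tent" gets the same suggestion. More logs sharpen the counts, but nothing here reacts to what one particular user is doing right now.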

But what if I've got an information need where I issue 4 queries. And on each query, I pick 3 documents from the top 10 that best exemplify the information I am seeking. (This is not an unrealistic number if I am shopping or planning a trip.)

And suppose furthermore that my needs/picks are going to be different from your needs, either because of my experience or geography or previously purchased items or cultural background or whatever. So I end up picking a different 3-sized subset of the top 10 for each of the 4 queries.

So for each query, there are (10 choose 3) = 120 different possibilities. And since there are 4 queries in a row, we have 120^4, or about 207 million, different paths through the data, for just that sequence of 4 queries. Add a fifth query and we're up to 120^5, or about 24.9 billion. And that's just for one particular information need.
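For anyone who wants to check the arithmetic, a minimal sketch in Python. The function name `count_paths` and its parameters are just illustrative, and it assumes every query offers the same number of results and picks.

```python
from math import comb


def count_paths(top_k, picks, num_queries):
    """Number of distinct pick sequences when a user chooses `picks`
    of the `top_k` results on each of `num_queries` queries in a row."""
    per_query = comb(top_k, picks)  # unordered subsets for one query
    return per_query ** num_queries


print(comb(10, 3))               # 120
print(f"{count_paths(10, 3, 4):,}")  # 207,360,000
print(f"{count_paths(10, 3, 5):,}")  # 24,883,200,000
```

Each extra query multiplies the count by another 120, so the number of paths outruns any plausible amount of log data very quickly.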

Is there ever going to be enough training data for this? I can't imagine that there will.

What this says to me is that we need methods for better user interaction, better online feedback and dialogue, to truly customize the user's experience. Instead of trying to give the user the best possible answer by batch training on historical log data, we should instead be building systems that allow the user to give transparent, real-time feedback, to guide and to steer the algorithm the same way these chess aficionados are guiding chess algorithms.

jeremy said...

Decided to blog about it, fwiw: