Monday, July 17, 2006

Starting Findory: In the beginning

For the first post, I wanted to talk a bit about how I was motivated to start Findory.

The fundamental idea behind Findory is to apply personalization to information. Help people deal with information overload. Help cut through the noise and filter up the good stuff. Help people find what they need.

Personalization can do this using implicit information about your wants and needs, learning from your behavior. Personalization complements search. Search requires people to explicitly provide a query. Personalization surfaces useful information without any explicit query.

Personalization can also be used as a way to improve search. For example, if people are repeatedly refining a query (e.g. [greg linden findory], then [greg linden amazon]), they are not finding what they need. Paying attention to what they have done, especially what they have done recently, and showing different search results to different people can help surface information that otherwise might be buried deep. That is also personalization.
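To make the idea concrete, here is a minimal sketch of re-ranking search results using a person's recent queries. It is only an illustration under stated assumptions, not how Findory or any search engine actually does it: the `rerank` function, the `boost` weight, the sample results, and their base scores are all hypothetical.

```python
from collections import Counter


def rerank(results, recent_queries, boost=0.5):
    """Re-order (title, base_score) pairs using terms from recent queries.

    Hypothetical scoring: base score plus a boost for every recent query
    term that also appears in the result title.
    """
    # Count the terms the person typed recently; repeated terms count more.
    interest = Counter(
        term for query in recent_queries for term in query.lower().split()
    )

    def score(item):
        title, base = item
        overlap = sum(interest[t] for t in set(title.lower().split()))
        return base + boost * overlap

    return sorted(results, key=score, reverse=True)


# Made-up results with made-up base relevance scores.
results = [
    ("Greg Linden photography portfolio", 1.0),
    ("Greg Linden on Findory and personalization", 0.9),
    ("Linden tree care guide", 0.8),
]
history = ["greg linden findory", "greg linden amazon"]

personalized = rerank(results, history)
unpersonalized = rerank(results, [])
```

With no history, the results keep their base order. With the two refinements in the history, the Findory result moves to the top, so two people issuing the same query can see different rankings.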

In addition, personalization can be used to improve advertising. Almost all advertising is useless, and annoying. Advertising does not have to be that way. Advertising can be useful information about products and services you might actually want. By paying closer attention to your interests and needs, advertising can be targeted instead of sprayed. It can be relevant and helpful, not irrelevant and wasteful. That is also personalization.

Back when I became really interested in this and wanted to work on it, I was just getting out of Stanford in 2003. Creating a company was not my first plan. I talked to Google, Yahoo, MSN, and Amazon first. The people I talked to expressed interest, but not urgency, and did not want to move as aggressively on personalization as I did.

I believed in the value of applying personalization to information. I wanted to see it happen. While I thought it was inevitable that everyone would be doing personalization of information over the next 5-10 years, I wanted to see it sooner. I started Findory to make it happen.


Gene Weng said...

Hi Greg,

Why are most people focusing on "implicit" instead of "explicit"? Why not a web site that allows users to express "I want to buy a 6-Megapixel Digital Camera at the price around $250."?

Greg Linden said...

Hi, Gene. That's a good point. That's basically a search. You lay down your constraints (e.g. "6 megapixel camera near $250") and get your results.

The problem is that people often are unwilling or unable to specify a search. Do you know which of the 5M+ items in the Amazon catalog might be of interest to you? Can you specify a search for them? Similarly, can you specify which articles in the millions of articles published daily might be interesting to you?

Perhaps, with effort, you can catch a subset of them yourself. But what of those that are not willing to put in that effort? Should we not seek to help them find what they need?

That's where the focus on implicit helps. If we pay attention to what you are doing, we can learn what you want and need. By focusing in on your past behavior, we can help you discover new things in the future.

Anonymous said...

Greg writes: "Personalization surfaces useful information without any explicit query."

First of all, let me say that I agree with the ultimate goal of what you're after. Different people should indeed be able to see different results, as influenced by their prior interactions and/or histories.

However, I continue to take a strong stand on the terminology that you use in pursuit of this goal, because (with all respect) I think it dilutes the understanding of what it is you're trying to do.

When you say above that personalization surfaces useful information without any explicit query, I have to strongly disagree. Personalization is actually orthogonal to the nature of the query.

I think this distinction between explicit and implicit queries is actually a continuum, say along the x-axis. Far, far to the right you have structured queries, ala SQL. Traveling left along the continuum you have unstructured queries, ala ad hoc retrieval (Google). Further still, at the far left, you have implicit queries (or queryless queries, or whatever you want to call them) in which information is retrieved without having to formally issue a query.

But queryless queries by themselves are not necessarily personalization. I've tried hard up to this point to avoid tooting my own employer's horn, but at FXPAL we started developing a system for contextualized information recommendation, i.e. queryless queries. The system (Pal bar) takes a look at whatever web page and/or document you are currently looking at, and automatically retrieves relevant, related information and displays it in a side bar. (We also started developing this in 2003.) In this manner you get information you otherwise would not have known about.

But even though there is no explicit query in this system, that does not mean it is a "personalized" system. This is because any other user with Pal bar installed, who then looks at the same web page, will get the same other pages recommended to them. It is a universal similarity/relevance measure that we currently use, rather than a "personalized" one.

So my point is that personalization is orthogonal to whether the query is implicit or explicit. If this implicit-explicit continuum lies along the x-axis, personalization lies along the y-axis.

You can have explicit queries that are totally not personalized (i.e. Google's current approach) or you can have explicit queries that are totally personalized (research that my colleague Fernando Diaz did in 2003 -- I think I linked to his paper here a few months ago).

By the same token, you can have implicit queries that are totally not personalized (Pal bar), or you can have implicit queries that are totally personalized (Findory).
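A rough sketch of what I mean by orthogonal, under loud assumptions: this is not Pal bar's, Findory's, or Google's code, and the `retrieve` and `implicit_query` functions, the toy documents, and the profile weights are all made up for illustration. Where the query terms come from (typed, or derived from the page being viewed) is one choice; whether a user profile adjusts the ranking is a separate one.

```python
def retrieve(query_terms, docs, profile=None):
    """Rank docs by term overlap with the query.

    If a profile (hypothetical term -> weight dict) is given, add its
    weights for terms in the doc, so the ranking varies per user.
    """
    query_terms = set(query_terms)

    def score(doc):
        words = set(doc.lower().split())
        base = len(words & query_terms)
        personal = sum(profile.get(w, 0.0) for w in words) if profile else 0.0
        return base + personal

    return sorted(docs, key=score, reverse=True)


def implicit_query(current_page):
    """Derive query terms from the page being viewed; nothing is typed."""
    return set(current_page.lower().split())


docs = ["camera reviews and prices", "camera repair tips", "van gogh exhibitions"]
explicit = {"camera"}                                    # typed by the user
implicit = implicit_query("digital camera buying guide")  # taken from context
profile = {"repair": 2.0}                                # one user's interests

# The four quadrants: (explicit | implicit) x (universal | personalized).
explicit_universal = retrieve(explicit, docs)
explicit_personal = retrieve(explicit, docs, profile)
implicit_universal = retrieve(implicit, docs)
implicit_personal = retrieve(implicit, docs, profile)
```

The same `profile` argument works whether the query was typed or inferred, which is the sense in which the personalization step can be separated out and reused.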

Note that the same argument applies to contextual advertising, too. Contextual advertising is orthogonal to personalization. You can have ads that are relevant but impersonal (Google's current approach) or you can have ads that are relevant but more personal (your idea).

Finally, the same argument applies to the offering of "tools" to a searcher. I know you don't like tools, ala Ask's query expansion suggestions and/or Vivisimo's clusters. But tools themselves are orthogonal to personalization. You can have Vivisimo clusters that are universal and unpersonalized, or you could have Vivisimo clusters that are different in their clustering, based on the user's prior interaction with the system. I.e. personalized clustering.

I know I just keep arguing with you over terminology, but I think there is a highly important distinction to make...explicit vs. implicit is orthogonal to personalization.

I think it is important to make this distinction because by so doing we are much clearer about the technologies we develop, and that will increase our ability to reuse and readapt these technologies in the future. If personalization really is orthogonal to implicit/explicit, that means we can separate out personalization techniques, and apply them more easily to other realms (advertising, tools, etc.) But if personalization is the same thing as implicit queries, that means we will never even think to apply it to personalized clusters or personalized query expansion terms.

So if you think that personalization is the exact same thing as implicit querying, and that personalization and explicit queries are at opposite ends of their own continuum, I really would be interested in hearing why.

Greg Linden said...

Thanks, Jeremy, you are right that I am being sloppy in my terminology. I appreciate you keeping me honest.

Gene Weng said...

I searched "6-Megapixel Digital Camera at the price around $250" and here is what I found:

Clearly we are not there yet.

I just learned about this company this afternoon because it received $6.2 million in funding. See

Anonymous said...

If you wanted to do personalization, why leave Amazon? What was it about setting out on your own that was more attractive?

Anonymous said...

Well, maybe Jeremy and Greg are both right, just seeing one thing from a different angle.

Actually, I think searching is searching: it is based on a request. Personalization should be, IMO, based on personality and desire.

For instance, suppose someone is sensitive to pattern and color, passionate about painting, adores Van Gogh, and loves Pulp Fiction. If we found that 60% of other users with the same attributes are artists or designers, we would recommend that s/he change careers, even if s/he is currently a programmer or policeman. If s/he can't take the risk, even though the original job does not fit her/him very well, OK, at least we can try to sell the user a brush-pencil.

By the way, Greg, have you tried to analyze human sentiment instead of behavior?