Saturday, July 07, 2007

People often repeat web searches

Teevan et al. have a paper, "Information Re-Retrieval: Repeat Queries in Yahoo's Logs" (PDF), at the upcoming SIGIR 2007 conference.

The paper points out that people often repeat queries to re-find information they found in the past. It also shows that these repeat queries can be predicted, and that they can be made easier if we maintain a search history for each user and surface it at appropriate times.

Some excerpts:
People often repeat Web searches, both to find new information on topics they have previously explored and to re-find information they have seen in the past ... Our study demonstrates that as many as 40% of all queries are re-finding queries.

Re-finding appears to be an important behavior for search engines to explicitly support, and we explore how this can be done. We demonstrate that changes to search engine results can hinder re-finding, and provide a way to automatically detect repeat searches and predict repeat clicks.
One proposal in the paper is to move items up in search results that a searcher has clicked on in the past:
The data is suggestive of a positive improvement in time-to-click for positive changes in rank as well as some benefit to no change (likely due to learning). When previously clicked results move down in rank, time-to-click increases.

A hypothesis consistent with previous work on eye-tracking in search is that users pay more attention to early-ranked items. Thus, if a previously clicked on result moves up, it is more easily re-found via a visual scan.
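
As a rough illustration of that proposal, here is a minimal Python sketch of moving previously clicked results up. It is my own simplification, not the method from the paper: the function name `promote_clicked` and its arguments are hypothetical, and it assumes results are plain URL strings and that we already have the set of URLs this searcher clicked before.

```python
def promote_clicked(results, clicked_urls):
    """Return results with previously clicked URLs moved to the top.

    Assumption: `results` is a ranked list of URL strings and
    `clicked_urls` is a set of URLs this user clicked in the past.
    sorted() is stable, so the original ranking is preserved within
    the clicked group and within the not-yet-clicked group.
    """
    # The key is False (sorts first) for clicked URLs, True otherwise.
    return sorted(results, key=lambda url: url not in clicked_urls)


if __name__ == "__main__":
    ranked = ["a.example", "b.example", "c.example", "d.example"]
    print(promote_clicked(ranked, {"c.example"}))
```

A real system would probably blend a click boost into the ranking score rather than hard-promoting, since the excerpt above also notes that unchanged ranks help re-finding.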
The authors also make other suggestions on how to surface search history and how to deal with the conflict between finding new information and re-finding old:
Traditionally, search engines have focused on returning search results without consideration of the user’s past query history, but the results of the log study suggest it might be a good idea for them to do otherwise.

Although finding and re-finding tasks may require different strategies, tools will need to seamlessly support both activities ... Because people repeat queries so frequently, search engines should assist their users by providing a means of keeping a record of individual users' search histories.

[There] may benefit from having a different amount of screen real estate devoted to displaying ... search history ... Search histories could be customized based on many factors including the time of day. Users with a large number of navigational queries may also benefit from the direct linking to the Webpage (possibly labeled with the frequent query term).

While a user may simultaneously have a finding and re-finding intent when searching, satisfying both needs may be in conflict. Finding new information means being returned the best new information, while re-finding means being returned the previously viewed information.

We found that when previously viewed search results changed to include new information, the searcher’s ability to re-find was hampered. It is important to consider how the two search modalities can be reconciled so a user can interact with new, and previously seen, information.
There are also some interesting breakdowns of the types of search queries they saw in Tables 1-3 on pages 3-4 of the paper.

On a related note, one thing I like about this work is that it shows the value of starting to walk down the path toward search personalization.

A first and necessary step toward personalization is to start maintaining search and viewing history for each user. As Teevan et al. point out, search and viewing history has a lot of value when surfaced to users.

When companies ask me about personalization, I often recommend they start with baby steps -- maintaining and surfacing history, showing related content -- rather than trying to implement full personalization immediately. These are relatively simple to implement and need to be done well anyway before anyone can do full personalization.
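
To give a sense of how small that first step can be, here is a minimal Python sketch of a per-user search history that can flag a repeat query and return the results clicked last time. Everything in it is an assumption for illustration: the `SearchHistory` class, its method names, the in-memory storage, and especially the rule that two queries are "the same" if they match after lowercasing and collapsing whitespace. The paper's own definition of re-finding may well differ.

```python
from collections import defaultdict


class SearchHistory:
    """In-memory search history, keyed by user and normalized query."""

    def __init__(self):
        # user_id -> normalized query -> clicked URLs, oldest first
        self._clicks = defaultdict(lambda: defaultdict(list))

    @staticmethod
    def normalize(query):
        # Assumption: case and extra whitespace do not change intent.
        return " ".join(query.lower().split())

    def record(self, user_id, query, clicked_url=None):
        """Log a query, plus the clicked URL if there was one."""
        clicks = self._clicks[user_id][self.normalize(query)]
        if clicked_url is not None:
            clicks.append(clicked_url)

    def is_repeat(self, user_id, query):
        """True if this user has issued this (normalized) query before."""
        return self.normalize(query) in self._clicks.get(user_id, {})

    def past_clicks(self, user_id, query):
        """URLs this user clicked for this query in the past."""
        per_user = self._clicks.get(user_id, {})
        return list(per_user.get(self.normalize(query), []))


if __name__ == "__main__":
    history = SearchHistory()
    history.record("u1", "SIGIR  2007", "http://example.com/sigir")
    if history.is_repeat("u1", "sigir 2007"):
        print(history.past_clicks("u1", "sigir 2007"))
```

Surfacing the history then amounts to calling `is_repeat` when a query arrives and showing `past_clicks` somewhere prominent.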

Much value can be gained from the first steps on the road to personalization. Pick the low hanging fruit before trying to tackle the harder problems.

See also my previous posts ([1] [2] [3]) on some of Jaime Teevan's work on personalized search.

4 comments:

Matt McKnight said...

To me, it's a sign that people don't know how to use del.icio.us properly.

Anonymous said...

Matt:
Many people are lazy and use Google as a "GOTO" tool. It needs fewer clicks to "go to" a page through google.com than through del.icio.us (when you've many bookmarks).
Google.com supports this behavior as it knows which search result URLs I followed earlier and presents those very prominently in the results.

Anonymous said...

Greg writes: On a related note, one thing I like about this work is that it shows the value of starting to walk down the path toward search personalization.

I can agree with this. But to be the usual (good-natured) thorn-in-the-side, I think it also shows the value of starting to walk down the path toward providing search "tools". As the authors write: While a user may simultaneously have a finding and re-finding intent when searching, satisfying both needs may be in conflict.

The way around this, perhaps, is to provide a little tool, a little button, for the searcher to more clearly express his or her intent. "Show me my previously found items (ranked by relevance or by how often I have clicked them in the past)" versus "Show me only new items". With one click, the user can easily switch back and forth between the two modes after issuing a query. Such a tool is also a baby step, and not one that is too much work or effort for the lazy user.

My ongoing point here is that there is a tradeoff between search engine automation (or, the search engine doing everything for the user, i.e. personalization) and user control (the user being able to give explicit feedback in order to guide the search). The former approach is good for combatting user laziness, but too much of it and the user will feel like they've lost control. And, of course, give the user too much control, and they'll simply be overwhelmed (imagine sitting a novice computer user in front of Photoshop CS3. Too much control!) So it seems like a balance is needed, and that this Teevan paper offers arguments in favor of baby steps in both directions.

Anonymous said...

I agree with you, Jeremy. At some point, popularity of a site & whether you've found it or not needs to take a back seat to what search engines were originally designed to deliver: results containing relevant sites to a search query. It worries me to think that these new user input tools might reduce the quality of searches and limit the number of sites that receive impressions in the SERP. Hopefully Google and Co. will implement tools that will allow us to keep our "search independence."