Tuesday, August 02, 2005

Flickr, image search, and PhotoRank

Photo search can be a challenge. Text documents contain a lot of easily analyzable information about the content of the document. Not so with photos.

From what I've seen, there are three approaches to dealing with this problem. One is to use the text around the images, such as with an HTML document that has photos embedded in it. This is the technique used by the image searches provided by Google, Yahoo, and the other search giants.

Another is to have users provide labels for all the pictures. Flickr (which was recently acquired by Yahoo) is the best known and most successful of these; users at Flickr "tag" photos with short keywords. But there are others, including the cute little ESP game by Luis von Ahn and other researchers at CMU.
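
To make the tagging approach concrete, here is a minimal sketch of searching by tag with an inverted index. It is an illustration only, not how Flickr works internally; the `TagIndex` class, its methods, and the photo ids are all made up for this example.

```python
from collections import defaultdict


class TagIndex:
    """Toy inverted index mapping each tag to the photo ids that carry it."""

    def __init__(self):
        self._photos_by_tag = defaultdict(set)

    def tag(self, photo_id, *tags):
        """Attach one or more tags to a photo (tags are case-insensitive)."""
        for t in tags:
            self._photos_by_tag[t.strip().lower()].add(photo_id)

    def search(self, *tags):
        """Return the ids of photos that carry every one of the given tags."""
        if not tags:
            return set()
        # .get() avoids creating empty entries for tags nobody has used.
        sets = [self._photos_by_tag.get(t.strip().lower(), set()) for t in tags]
        return set.intersection(*sets)


# Hypothetical photos and tags, for illustration only.
index = TagIndex()
index.tag("p1", "sunset", "beach")
index.tag("p2", "Sunset", "city")
index.tag("p3", "beach")
```

With this data, `index.search("sunset", "beach")` matches only the photo that has both tags. Note that this finds matching photos but says nothing about the order in which to show them.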

A third approach is to analyze the images themselves to identify characteristics of the images and transfer descriptions and labels from similar images. To my knowledge, most of this work is still in the research stage.
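
As a rough sketch of what "transfer labels from similar images" could mean, the snippet below suggests labels for an unlabeled image by voting among its nearest labeled neighbors. The three-number feature vectors stand in for something like a coarse color histogram; the features, the distance measure, and the `transfer_labels` function are all assumptions for illustration, and real systems use far richer image features.

```python
import math
from collections import Counter


def transfer_labels(query, labeled, k=3, top=2):
    """Suggest labels for `query` by voting among its k nearest labeled images.

    query   -- feature vector (sequence of floats), e.g. a coarse color histogram
    labeled -- list of (feature_vector, labels) pairs
    """
    # Sort labeled images by Euclidean distance to the query and keep the k closest.
    nearest = sorted(labeled, key=lambda item: math.dist(query, item[0]))[:k]
    # Each neighbor casts one vote for each of its labels.
    votes = Counter(label for _, labels in nearest for label in labels)
    return [label for label, _ in votes.most_common(top)]


# Made-up feature vectors and labels, for illustration only.
labeled_images = [
    ([0.9, 0.1, 0.0], ["sunset", "sky"]),
    ([0.8, 0.2, 0.0], ["sunset", "beach"]),
    ([0.1, 0.1, 0.8], ["ocean"]),
    ([0.0, 0.9, 0.1], ["forest"]),
]
suggested = transfer_labels([0.85, 0.15, 0.0], labeled_images, k=2, top=1)
```

Here the two nearest neighbors both carry the "sunset" label, so that is the label suggested for the query image.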

Once you have data about the images, you can search, but there's still the question of how to order the search results: what do you use for relevance rank? For images embedded in web documents, you might be able to use the PageRank of the containing document, but photos on Flickr and other photo services have no web documents associated with them.

Stewart Butterfield at Flickr just announced a new feature he calls "interestingness" and John Battelle calls "PhotoRank". It sounds like it cleverly uses the data from the Flickr community to do relevance rank:
"Interestingness is a ranking algorithm based on user behavior around the photos taking into account some obvious things like how many users add the photo to their favorites and some subtle things like the relationship between the person who uploaded the photo and the people who are commenting (plus a whole bunch of secret sauce)."
Mmm... Secret sauce. Seriously, I'd love to hear the details behind this. Doing relevance rank for photos like this is a hard problem. It sounds like Flickr has a great idea for how to help people search for interesting photos.
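
Since the real formula is secret, the following is only a toy guess at the shape such a score might take, using the two signals mentioned in the quote: favorites, and who is commenting relative to the uploader. The weights, the choice to discount comments from the uploader's contacts, and every name here (`Photo`, `toy_score`, `contacts`) are my assumptions, not anything Flickr has described.

```python
from dataclasses import dataclass, field


@dataclass
class Photo:
    photo_id: str
    owner: str
    favorited_by: set = field(default_factory=set)
    commenters: list = field(default_factory=list)


def toy_score(photo, contacts, fav_weight=2.0, stranger_weight=1.0,
              contact_weight=0.25):
    """Hypothetical score: favorites plus comments, discounting the owner's contacts.

    contacts -- dict mapping a user to the set of users they list as contacts
    All weights are arbitrary assumptions chosen for illustration.
    """
    owner_contacts = contacts.get(photo.owner, set())
    score = fav_weight * len(photo.favorited_by)
    for user in set(photo.commenters):  # count each commenter once
        if user == photo.owner:
            continue  # the uploader's own comments contribute nothing
        score += contact_weight if user in owner_contacts else stranger_weight
    return score


# Made-up users and photos, for illustration only.
contacts = {"alice": {"bob"}}
photos = [
    Photo("p1", "alice", favorited_by={"carol"}, commenters=["bob", "bob", "alice"]),
    Photo("p2", "alice", favorited_by={"carol"}, commenters=["dave", "erin"]),
]
ranked = sorted(photos, key=lambda p: toy_score(p, contacts), reverse=True)
```

Both photos have one favorite, but the second draws comments from two people outside the uploader's contacts, so under these assumed weights it ranks first.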

Update: Brian Dennis posts some thoughts on "interestingness" and links to several discussion threads on the feature.


Alex said...

Hi Greg,

Thanks for this post about visual search and indexing. It's always good to discuss this topic, which is quite rare!

I agree with most of it, especially with the importance of social tagging as a way to index the content of images.

Still, I have to disagree with one of your positions: image analysis technology is no longer confined to the labs.

I work with LTU technologies (www.ltutech.com). I started the company with 2 friends and fellow researchers in 1999. We do image search, classification and recognition. We have a number of clients that have been successfully using our software for more than 2 years now. They're very specialized people (law enforcement, defense, IP protection, brand protection,...) but the images we process for them are standard personal photographs, as well as any image downloaded from the web. And it does work and does help some people.

Maybe we'll address the consumer market too one day!

Feel free to check our website for more info.

And don't think I'm just advertising! I would love to cite other companies that do the same, but none of them (vima tech, pixlogic,...) have clients or a significant track record...

But this market is still nascent and should really boom in 3 to 5 years. Wait and see!

Venkatesh said...

Came across your post while searching for image ranking schemes. The other day you posted about the "happy searcher" white paper, and it made me think about an alternate scheme for image ranking. Don't you think maturing image content recognition and image recognition schemes can vastly improve image search? Some of my thoughts are here: http://amanthan.blogspot.com/2006/02/image-ranking-alogrithm.html. I would appreciate your comments.