Tuesday, June 08, 2010

Travel itineraries from Flickr photo trails

Every once in a while, you hit a paper that seems like a startup waiting to happen. A paper that will be presented next week at HT 2010 by Yahoo Research is one of these.

The paper, "Automatic Construction of Travel Itineraries using Social Breadcrumbs" (PDF), cleverly uses the data often embedded in Flickr photos (e.g. timestamp, tags, sometimes GPS) to produce trails of where people have been in their travels. The authors then combine all those past trails to generate high-quality itineraries for future tourists that tell them what to see, where to go, how long to expect to spend at each sight, and how long to allow for travel between the sights.

Some excerpts from the paper:
Shared photos can be seen as billions of geo-temporal breadcrumbs that can promisingly serve as a latent source reflecting the trips of millions of users ... [We] automatically construct travel itineraries at large scale from those breadcrumbs.

By analyzing these breadcrumbs associated with a person's photo stream, one can deduce the cities visited by a person, which Points of Interest (POI) that the person took photos at, how long that person spent at each POI, and what the transit time was between POIs visited in succession.

By aggregating such timed paths of many users, one can construct itineraries that reflect the "wisdom" of touring crowds. Each such itinerary is composed of a sequence of POIs, with recommended visit times and approximate transit times between them.

[In surveys] users perceive our automatically generated itineraries to be as good as (or even slightly better than) itineraries provided by professional tour companies.
This reminds me quite a bit of the work on using GPS trails from mobile devices like phones (e.g. [1] or [2]) or search histories on maps (e.g. [3]). But the use of Flickr photos as the data source is clever, especially for this application, where the photos are also useful in the final output and the gaps in the data stream are not important.
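
To make the breadcrumb idea concrete, here is a minimal Python sketch of how timed trails might be built from photos and then aggregated across users. It is not the paper's method or code. The `Photo` record, the `build_trails` and `aggregate` functions, and the sample data are all hypothetical, and the sketch assumes each photo has already been mapped to a POI (from GPS or tags), takes the stay at a POI to be the span between the first and last photo there, and uses a plain median to combine users.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime
from statistics import median


@dataclass
class Photo:
    user: str
    poi: str          # assumption: POI already resolved from GPS or tags
    taken: datetime   # photo timestamp


def build_trails(photos):
    """Group photos by user and collapse them into timed POI visits."""
    by_user = defaultdict(list)
    for p in photos:
        by_user[p.user].append(p)

    trails = {}
    for user, shots in by_user.items():
        shots.sort(key=lambda p: p.taken)
        visits = []  # each visit is [poi, first_photo_time, last_photo_time]
        for p in shots:
            if visits and visits[-1][0] == p.poi:
                visits[-1][2] = p.taken  # still at the same POI: extend the stay
            else:
                visits.append([p.poi, p.taken, p.taken])
        trails[user] = visits
    return trails


def aggregate(trails):
    """Median stay per POI and median transit per consecutive POI pair (minutes)."""
    stays = defaultdict(list)
    transits = defaultdict(list)
    for visits in trails.values():
        for poi, start, end in visits:
            stays[poi].append((end - start).total_seconds() / 60)
        # Transit time: last photo at one POI to first photo at the next.
        for (a, _, a_end), (b, b_start, _) in zip(visits, visits[1:]):
            transits[(a, b)].append((b_start - a_end).total_seconds() / 60)
    return (
        {poi: median(v) for poi, v in stays.items()},
        {pair: median(v) for pair, v in transits.items()},
    )


# Made-up sample data for two users on different days.
photos = [
    Photo("ann", "Louvre", datetime(2010, 6, 1, 10, 0)),
    Photo("ann", "Louvre", datetime(2010, 6, 1, 12, 0)),
    Photo("ann", "Eiffel Tower", datetime(2010, 6, 1, 13, 0)),
    Photo("bob", "Louvre", datetime(2010, 6, 2, 9, 0)),
    Photo("bob", "Louvre", datetime(2010, 6, 2, 10, 0)),
    Photo("bob", "Eiffel Tower", datetime(2010, 6, 2, 10, 30)),
]
stay, transit = aggregate(build_trails(photos))
print(stay["Louvre"], transit[("Louvre", "Eiffel Tower")])
```

A real system would also need to split a photo stream into separate trips and to handle POIs where a visitor took only one photo, which this sketch records as a zero-minute stay.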

Fun idea, nicely implemented, and very convincing results. Definitely worth a read. Don't miss the thoughts at the end on expansions to the idea, such as changing how the trails are filtered and aggregated based on individual preferences to generate personalized itineraries.


Marin Dimitrov said...

There was a similar paper from Microsoft at WWW'2010: Equip Tourists with Knowledge Mined from Travelogues.

Greg Linden said...

Hi, Marin, The big difference is the use of billions of Flickr photos, which is the interesting idea in this paper. The paper you cited used blogs as the data source.

The thing to note in this HT2010 paper is the very clever idea of how Flickr photos can produce timestamped trails of movements, like coarse-grained GPS trails.

Steps said...

A related poster was also published at WWW'2010: Antourage: Mining Distance-Constrained Trips from Flickr.

Greg Linden said...

That WWW2010 poster does use Flickr photos, filtering for ones that have GPS tags on them, but only considers each photo in isolation; that poster does not generate timestamped trails of movements, duration of stays, and transition times.

What I think is the very clever idea in the HT2010 paper is that you can build trails of where people have been, linking each photo together to form a path, by combining the timestamp and location (either explicit or derived from tags) data. It's the breadcrumbs, the trails, the history that is new, clever, and apparently quite powerful.

MattHurst said...

I was very impressed to learn the student was able to actively collaborate with the Flickr team at Yahoo on this research. Do you think this might result in a real product that they could bring to the photo sharing service?