The MyLifeBits project at Microsoft Research is an attempt to make most aspects of a person's work and life experiences searchable. MyLifeBits captures every e-mail, every web page visited, every document read, every phone call, every bit of music, every photo. The team even started taking video of the researcher's daily life and making that searchable.
As you might expect and as discussed in the paper, the project is inspired by Vannevar Bush's Memex.
The paper was a fun read, but I was shocked when I discovered they initially expected users to spend time organizing and cataloguing all this information. Unsurprisingly, they found that unworkable. Some excerpts:
With large quantities of information, users are not just unwilling to classify, but are in fact unable to do it ....
Even with convenient classifications and labels ready to apply, we are still asking the user to become a filing clerk -- manually annotating every document, email, photo, or conversation.
We have worked on improving the tools, and to a degree they work, but to provide higher coverage of the collection more must be done automatically ....
Even capture itself must be more automatic on this scale so that the user isn't forced to interrupt their normal life in order to become their own biographer.
In the future work appendix to the paper, the authors describe how automatic speech to text for audio and video, and face and object recognition for images and video, would reduce the work required to classify and catalogue all the accumulated data.
I have to say, there is no way I would organize or tag this kind of data manually. The gigabytes of photos I have on my computer look like the "big shoebox" mentioned in the paper, and, if changing that requires any effort from me, they will never look like anything else.
In general, while many would get value from something that searched over all the data generated in their daily life, I suspect few would be willing to do substantial work to get those benefits.
In fact, I find that Google Desktop Search is approaching what I need. It already finds information about who I have e-mailed, meetings I had, documents I read, and web pages I visited.
It would be marginally more useful if it searched every phone conversation (after dealing with legal issues and improving speech to text), every photo (after improving face and object recognition), and every TV show and movie (legal issues, speech to text). But only marginally.
Having searchable video of my daily life (360-degree video, 24 hours a day) might also be useful, but the privacy, legal, and technical issues there are extreme.
So, this has me wondering: how close is Google Desktop Search to the low-hanging fruit, the most useful parts, of Memex?
After reading the MyLifeBits paper, I am wondering if the endpoint envisioned in that paper is really all that desirable. Perhaps we are closer than we might think to the parts of Memex we really need?
See also the Microsoft Research project, "Stuff I've Seen".
See also my Oct 2004 post, "Google Memex".