Friday, November 04, 2005

Amazon Mechanical Turk?

This has to be the strangest thing I've seen in a while.

Amazon is apparently behind the site mturk.com which calls itself "Amazon Mechanical Turk: Artificial Artificial Intelligence".

According to part of their FAQ:
Amazon Mechanical Turk provides a web services API for computers to integrate "artificial, artificial intelligence" directly into their processing by making requests of humans.

A network of humans fuels this artificial, artificial intelligence by coming to the web site, searching for and completing tasks, and receiving payment for their work.

For software developers, the Amazon Mechanical Turk web service solves the problem of building applications that until now have not worked well because they lack human intelligence. Humans are much more effective than computers at solving some types of problems, like finding specific objects in pictures, evaluating beauty, or translating text.

For businesses and entrepreneurs who want tasks completed, the Amazon Mechanical Turk web service solves the problem of getting work done in a cost-effective manner by people who have the skill to do the work.

For people who want to earn money in their spare time, the Amazon Mechanical Turk web site solves the problem of finding work that they can do wherever and whenever they want.

For those that doubt that Amazon would do something this... umm... innovative, a quick view of the page source shows that many of the images and links are served from Amazon.com. This really does appear to be Amazon.

I really don't know what to say. I have a hard time seeing how this idea can succeed.

Google Answers works because the fees are high, answers quite complex, and experts well vetted. The core idea behind Amazon's Mechanical Turk seems to be to take the success of Google Answers and try to scale it up by a few orders of magnitude.

But there are problems with that. If I scale up by doing cheaper answers, I won't be able to filter experts as carefully, and the quality of the answers will be low. Many of the answers will be utter crap, just made up, quick bluffs in an attempt to earn money from little or no work. How will they deal with this?

It seems to me that Amazon has just changed the problem from finding the answer in the available data to digging out the correct answer from all the crappy answers provided. Filtering crap out of user-generated content at large scale is a difficult problem too.

More comments and discussion at Metafilter, TechDirt, Google Blogoscoped, Rob Hof, Jason Fried, Greg Yardley, and Slashdot.

Update: Don't miss the ongoing discussion in the comments to this post.

Update: I have a short quote in a Seattle PI article by Kristen Bolt on Amazon Mechanical Turk.

12 comments:

Anonymous said...

I think that it takes a while for all of us to understand the full impact of this technology. They've provided a web service interface to human labor. It's not necessarily humans asking humans other questions. It can be computers asking humans questions.

Imagine you are writing software that recognizes faces. It works pretty well but sometimes it gets confused. When that happens it asks for help. Instead of popping up a window and waiting for a local operator to help out it just sends out a web service request and waits a few minutes for a reply.
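A minimal sketch of that pattern, assuming the recognizer returns a label with a confidence score. The names `classify_with_fallback`, `toy_model`, `toy_human`, and the 0.8 threshold are hypothetical stand-ins, not part of any Amazon API; a real `ask_human` would post a task and wait for the reply.

```python
from typing import Callable, Tuple

def classify_with_fallback(
    image_id: str,
    model: Callable[[str], Tuple[str, float]],
    ask_human: Callable[[str], str],
    min_confidence: float = 0.8,  # assumed threshold, tune per application
) -> Tuple[str, str]:
    """Return (label, source), where source is 'model' or 'human'."""
    label, confidence = model(image_id)
    if confidence >= min_confidence:
        return label, "model"
    # The model is unsure: hand the item to a person and wait for the answer.
    return ask_human(image_id), "human"

# Toy stand-ins for a real recognizer and a real human-task service.
def toy_model(image_id: str) -> Tuple[str, float]:
    scores = {"img-1": ("alice", 0.97), "img-2": ("bob", 0.41)}
    return scores[image_id]

def toy_human(image_id: str) -> str:
    return "carol"

print(classify_with_fallback("img-1", toy_model, toy_human))
print(classify_with_fallback("img-2", toy_model, toy_human))
```

Only the low-confidence item is sent to a person, so the human cost scales with how often the software gets confused rather than with the total volume.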

I think software like that is just the first step towards AI. You use anonymous humans to do stuff that the computer isn't very good at.

Anonymous said...

That last comment wasn't very clear (I guess I'm still asleep). What I meant to say was that one of the problems with AI right now is that it can't ask questions like a child or interact with the real world. But with this software it can.

Greg Linden said...

Thanks, Guido. Great explanation. It is compelling for some applications.

However, you've subtly made the assumption that the anonymous humans are always honest and correct.

They won't be. Given that money is involved, I think you're going to find these pesky humans are pretty clever at doing things that might be in their self-interest, but not in the interest of people trying to get correct answers.

Anonymous said...

To say nothing of the ethics of computers outsourcing tasks to humans. Did we learn nothing from Terminator 2?

Anonymous said...

I think you have this all wrong. It's brilliant. I can have people, for less than minimum wage, perform mind-numbing tasks such as the one listed to make automotive titles more human friendly.

It takes the brilliance of the SETI project's distributed computing and applies it to a low wage resource pool. Fantastic idea.

Anonymous said...

As for people getting it right, the SETI project also solved this problem: send the same work packet out multiple times.

Run some trends on this and eliminate the bad results, keeping only the good.
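One simple way to read "send it out multiple times and keep the good" is a majority vote over the returned answers. This is only a sketch; `majority_answer` and the `min_share` cutoff are hypothetical names chosen for illustration.

```python
from collections import Counter

def majority_answer(answers, min_share=0.5):
    """Return the most common answer if it holds more than min_share
    of the votes; return None when no answer is that dominant."""
    if not answers:
        return None
    answer, count = Counter(answers).most_common(1)[0]
    return answer if count / len(answers) > min_share else None

print(majority_answer(["cat", "cat", "dog"]))   # two of three agree
print(majority_answer(["cat", "dog", "bird"]))  # no agreement
```

A `None` result means the copies disagreed too much, and the work packet would have to be sent out again or dropped.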

chad said...

One question: why is the web service part so interesting? It's actually pretty hard to think of a useful programmatic system that could use a web services API call that takes a few hours to return the data. Users tend to like faster performance than that.

Maybe for offline processing tasks... but it's unclear to me why those types of things would be a proper web service. It seems like systems such as ScriptLance that provide a skills marketplace are more reasonable to fulfill this kind of demand.

Maybe this could be a cheap BPO marketplace but overall I'm a little unclear why this will be really useful beyond that.

Costas said...

I think it's an interesting concept. Imagine for example that Findory (or Memigo ;-) have trouble classifying an article automatically. If the article exceeds some popularity threshold (that makes it worthwhile to do something about it) it may make sense to ask a human to classify it manually.

As for incompetent or malicious responses, treating them isn't much different than treating malicious ratings on Amazon, Netflix, or wherever: have multiple humans do the same task many times, and gradually build a trust metric for each human ...drone.
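A rough sketch of what such a trust metric could look like: each worker's vote is weighted by how often that worker has agreed with past winners. The `TrustTracker` class, its starting score of 0.5, and the worker names are all assumptions made for illustration.

```python
from collections import defaultdict

class TrustTracker:
    """Trust = share of past tasks where the worker matched the winning answer."""

    def __init__(self, prior_agree=1, prior_total=2):
        # Assumed prior: every new worker starts at 1/2 = 0.5 trust.
        self.agree = defaultdict(lambda: prior_agree)
        self.total = defaultdict(lambda: prior_total)

    def trust(self, worker):
        return self.agree[worker] / self.total[worker]

    def resolve(self, answers):
        """answers maps worker -> answer. Picks the trust-weighted winner,
        then updates every participating worker's score."""
        if not answers:
            raise ValueError("need at least one answer")
        weight = defaultdict(float)
        for worker, answer in answers.items():
            weight[answer] += self.trust(worker)
        winner = max(weight, key=weight.get)
        for worker, answer in answers.items():
            self.total[worker] += 1
            if answer == winner:
                self.agree[worker] += 1
        return winner

tracker = TrustTracker()
for _ in range(3):
    tracker.resolve({"ann": "sports", "bob": "sports", "eve": "spam"})
print(tracker.trust("ann"), tracker.trust("eve"))
```

After a few rounds the worker who keeps disagreeing carries little weight. Note the scheme rewards agreement rather than correctness, so a group of workers colluding on the same wrong answer would still gain trust.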

Greg Linden said...

Hi, Costas and Elroy. Good points on ways to deal with incorrect answers.

It's true that there are ways to increase quality, but they are expensive and imperfect.

If I duplicate effort or pay people to check over other people's work, I now take a single piece of work and send it out to N people instead of just one, multiplying the costs.
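To make that trade-off concrete, here is a back-of-the-envelope calculation. It assumes a yes/no task, workers who are each right 80% of the time independently of one another, and a price of $0.03 per answer; all three numbers are made up for illustration, and independence in particular is optimistic if workers cheat in similar ways.

```python
from math import comb

def majority_accuracy(p, n):
    """Probability that more than half of n independent workers are right,
    when each is right with probability p (use odd n to avoid ties)."""
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n // 2 + 1, n + 1)
    )

price_per_answer = 0.03  # assumed price in dollars
for n in (1, 3, 5):
    cost = n * price_per_answer
    print(f"{n} workers: cost ${cost:.2f}, accuracy {majority_accuracy(0.8, n):.3f}")
```

Under these assumptions, going from one worker to five raises accuracy from 0.8 to about 0.94 while multiplying the cost by five.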

If I have an eBay-like reputation system, I increase the costs for the buyer, who now has to spend time checking reputation, trying to verify the quality of the answer (which may be difficult for the buyer), and leaving feedback. This works fine on Google Answers and eBay because they don't sell a lot of items under $5. It might not work as well on cheaper items, since the cost of the time spent on verification could approach or exceed the cost of the answer.

Finally, there's the issue of what quality you achieve after all this. 90% correct answers perhaps with all the noise, ambiguity, and cheating? Is that good enough? Considering speed, accuracy, and cost, is that competitive with using an automated solution?

I'm not saying Amazon Mechanical Turk could never work for any problem; Google Answers does this in some form for some problems. I'm also not saying there aren't ways to improve the quality of the answers provided.

I do think it is complicated and expensive to scale this up and keep the quality high. I also think, for many problems, it might be nearly as hard to dig the correct answer out of all the answers provided as it would have been to try to solve the original problem.

Anonymous said...

While it is true that reputation-based selection of workers is expensive, a quick read of the FAQ indicates that a requester can make a reputation score a requirement for working on his/her tasks.

It also looks like requesters can create additional qualifications that have to be met in order to do some work, although I couldn't find any work that had any requirements.

Costas said...

Well, I think the model here is (or should be!) less like Google Answers or eBay and more like an actual market: i.e. responders should get paid according to their trust metric, not just the number of HITs being done.

This can be done after a big enough pool of humans is built up by offering some secondary reward scheme that differentiates the price enough without alienating new drones. But initially, yes, Greg's right, the price would be higher than optimal as the pool would need to be initialized: at this point it's just doing outsourcing work for Amazon itself, so really it's near-free for Amazon. But with enough high-trust humans in the pool, repetition (and thus end-price) will come down.

Now if only Amazon could build a better interface to the mTurks... hmm, maybe a plug to the back of the skull would work better ;-)

Anonymous said...

Seems like another case of Amazon thinking that because a feature is really useful to Amazon, it's equally of interest or use to others. A thread on Slashdot pointed out that unless the prices go up, it's a big waste of time for anyone but the chronically unemployed in America to do these tasks ($3/hour, max, woohoo!).

Personally, I think Internet usage is nearing a saturation point, much like other mediums (people only watch a certain amount of TV, go to the movies a certain number of times a week, etc.). People simply aren't going to spend 24 hours on the Internet. They use it for work, they use it for email, shopping, etc. But they're not looking for ways to spend more hours to make $3/hour doing things that benefit others. And now that we're at that point, companies will have to compete very hard for eyeballs as competition among sites and activities increases.

I think Amazon (like Microsoft) thinks they can win that war of eyeballs simply by doing "new stuff" or "more stuff". That by sheer dint of their size, people will stay loyal to them and try new things that they produce.

I wonder if this is an example of Amazon "jumping the shark"?