Monday, January 09, 2006

Summing collective ignorance

A month ago, when I was talking about Yahoo Answers, I wrote:
A popularity contest isn't the best way of getting to the truth.

People don't know what they don't know. Majority vote doesn't work if people don't have the information they need to have an informed opinion.
Today, I saw Nathan Torkington's post on O'Reilly Radar, "Digging the Madness of the Crowds":
Steve Mallett, O'Reilly Network editor and blogger, was very publicly accused, via a Digg story, of stealing Digg's CSS pages. The story was voted up rapidly and made the homepage, acquiring thousands of diggs (thumbs-up) from the Digg community along the way. There was only one problem: Steve didn't steal Digg's CSS pages.
Take a majority vote from people who don't know the answer, and you're not going to get the right answer. Summing collective ignorance isn't going to create wisdom.

See also my previous post, "Digg, spam, and most popular lists".

On a side note, the O'Reilly Radar story mentions a site called Pligg. Like de.lirio.us for del.icio.us, Pligg is a free open source clone of Digg.

Update: There is a good discussion going on in the comments to this post.

Update: Yahoo Answers PM Yumio Saneyoshi and Yahoo My Web Community Manager Matt Stevens dropped by and left comments on this post. Well worth reading their thoughts.

9 comments:

Costas said...

Although I agree with you, Greg, you may find this Economist article interesting. Basically, given little information, people will come to the correct decision more often than you may think...

Tejaswi said...

Isn't the same problem plaguing Wikipedia? How do we know anything on Wikipedia is authentic?

Their claim is that popularity means authenticity. But if blatantly wrong information is published, and it is part of a rumor or a Chinese whisper, or is subtle enough not to be corrected, it will stay forever.

This is a social science problem that needs to be tackled on multiple fronts.

Anonymous said...

Greg,

This isn't really true. In fact, ignorant people can make good collective decisions because their errors cancel out. The problem isn't that the wisdom of crowds doesn't work, it's that Digg doesn't reflect the wisdom of crowds as described by James Surowiecki. More here:

Digg and the So-Called Wisdom of Mobs

Greg Linden said...

Thanks for that link, Pete. Wisdom of the mob. Heh, heh.

I'm essentially trying to make the same point as you, though you may have made it more clearly than I. Majority vote may extract the wisdom of the crowd, or it may only extract the groupthink of the mob. Hiding the votes before people have voted might help a little, but I think there have to be better, more reliable ways to extract the wisdom from the crowd.

I think what you want to do is attempt to identify experts in the crowd, people with the necessary information to make the decision, and weight their opinions more heavily. Slashdot discussions, Amazon customer reviews, many sites have early attempts at this, but obviously a lot more work needs to be done.
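To make the weighting idea concrete, here is a minimal sketch in Python. It is not how Slashdot, Amazon, or Digg work. The `tally` function and the weights are hypothetical, and the sketch assumes some way already exists to estimate how informed each voter is, which is the hard part.

```python
from collections import defaultdict

def tally(votes, use_weights=True):
    """Return the winning answer from (answer, weight) pairs.

    With use_weights=False every vote counts as 1 (plain majority vote).
    The weights are a hypothetical stand-in for how informed a voter is.
    """
    totals = defaultdict(float)
    for answer, weight in votes:
        totals[answer] += weight if use_weights else 1.0
    return max(totals, key=totals.get)

# Ten uninformed voters say "yes"; three informed voters say "no".
votes = [("yes", 1.0)] * 10 + [("no", 5.0)] * 3

majority_result = tally(votes, use_weights=False)
weighted_result = tally(votes, use_weights=True)
```

In this toy case the plain majority picks "yes" (10 votes to 3), while the weighted tally picks "no" (15 to 10). The outcome depends entirely on how the weights are assigned.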

Thanks, Costas, I did see that Economist article. It was good. If you liked that and want to dig in deeper, you might be interested in David Heckerman's excellent tutorial on Bayesian networks. Fun stuff.

Anonymous said...

Greg,

You've missed the point. The *whole point* of the wisdom of crowds is that the collective wisdom of a group can be *as good or better* than an expert or team of experts. For instance, instead of hiring consultants ("experts" by another name), companies should set up internal prediction markets so their employees can vote on issues relevant to the company. Picking out "experts" is the last thing that Digg should do - this may actually lead to a less accurate/truthful result. Instead, they need to remove the groupthink from the system.

One last point: is it actually a good thing for Digg if the stories are accurate and truthful? If the community cannot see how others are voting (necessary to prevent groupthink), then the community aspect becomes weaker. So while these measures might limit groupthink, it seems to me that it's actually a shared sense of identity that holds the Digg community together. To put it another way, Digg is more likely to survive precisely because of the inherent groupthink.

More at wikipedia.

Greg Linden said...

Pete, I think we're agreeing here.

I shouldn't have used the word "expert". That sounds like I'm referring to one person in the crowd. I meant making some effort to isolate the people in the crowd with an informed opinion.

Internal prediction markets work for exactly this reason. If you go to the Iowa Electronic Market and plop down cash, you likely have a good reason to be asserting that X is true. There is a cost to making a bet, so only people with informed opinions make bets.
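A rough sketch of that filter, in Python, under simplifying assumptions: a contract pays a fixed amount if the event happens, the trader only considers buying, and `fee` is a hypothetical stand-in for any cost of taking part. This is an illustration, not a description of how the Iowa Electronic Market is run.

```python
def expected_profit(belief, price, payout=1.0):
    """Expected profit from buying one contract at `price`.

    `belief` is the trader's own probability that the event happens.
    """
    return belief * payout - price

def will_bet(belief, price, fee=0.0):
    """A trader bets only if the expected profit covers the cost of betting."""
    return expected_profit(belief, price, payout=1.0) - fee > 0

market_price = 0.60
informed = will_bet(belief=0.80, price=market_price, fee=0.05)
uninformed = will_bet(belief=0.60, price=market_price, fee=0.05)
```

Someone whose belief just matches the market price has nothing to gain and stays out, while someone with a reason to think the price is wrong has an incentive to bet.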

Great point on how groupthink might help Digg succeed. There certainly are news outlets that have benefited from creating a strong community with groupthink.

Anonymous said...

Greg,

Understood. So perhaps "editor" is a better word than "expert". Better still: "trusted people". I'm not opposed to subscribing to "trusted people", rather than topics. In fact, if you subscribe to a member of del.icio.us you generally get better results than subscribing to a tag - what's more, it's spam free.

But there's a tendency (risk?) that we're just trying to reintroduce the old-media hierarchies into this new "democratic" system. If there are to be editors, it needs to be the case that anyone can be an editor, and users can subscribe to whichever editor they choose (which might be what you're suggesting). If there are a limited number of editors, then you re-introduce the problems that new-media is trying to solve (ie. news is dictated from above by a select number of people). In fact, the word "editor" is so strongly associated with hierarchies that I'd prefer to avoid it altogether.

Like I say, I'm not suggesting that Digg should necessarily change - it may indeed be the ideal system if the aim is to foster a close community. But if you wanted to create a new system that was less prone to groupthink AND avoid the introduction of traditional hierarchies, you might consider some of these ideas.

PS. As you probably know, monetized prediction markets are more accurate, but setting one up means tackling all kinds of legal issues - it's something I've been considering for a while. I spoke a bit about this in my post on Smarkets.

Greg Linden said...

Hi, Yumio. Good to hear from you!

It's true that the cost of posting an answer is fairly low and the benefit of posting false answers is negligible, but there is also little benefit from posting correct answers, especially if figuring out the correct answer requires any work.

It gets back to the filter you want. Prediction markets work because they impose costs and benefits that filter for people with information.

The current filter on Yahoo Answers will likely filter out experts (since their time is valuable) and well written, correct answers (since those take more time to do). Yahoo Answers likely will favor bored people who quickly throw down an answer to a question without much thought or effort.

More on that in my previous post, "Yahoo Answers and the wisdom of the crowd".

zby said...

My understanding of the Wisdom of Crowds (I should add that I have not read the book; this is based on what I've read about it on the internet) is that it works for things where the errors cancel each other out (like predicting the number of stones in a bowl, where the underestimates cancel the overestimates). This is well grounded in probability theory. But does it work in other situations?
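The stones-in-a-bowl case can be sketched with a small simulation. This is an illustration under stated assumptions, not a result from the book: guesses are modeled as the true value plus independent Gaussian noise, and `shared_bias` is a hypothetical parameter standing in for groupthink, an error everyone makes in the same direction.

```python
import random
import statistics

def simulate_guesses(true_value, n, noise_sd, shared_bias=0.0, seed=0):
    """Generate n guesses: truth + shared bias + independent noise."""
    rng = random.Random(seed)
    return [true_value + shared_bias + rng.gauss(0, noise_sd) for _ in range(n)]

def crowd_error(guesses, true_value):
    """Error of the crowd's average guess."""
    return abs(statistics.mean(guesses) - true_value)

def typical_individual_error(guesses, true_value):
    """Median error of a single guesser."""
    return statistics.median([abs(g - true_value) for g in guesses])

TRUE_COUNT = 1000

# Independent errors: over- and underestimates largely cancel in the average.
independent = simulate_guesses(TRUE_COUNT, n=1000, noise_sd=200)

# Shared bias: everyone is off in the same direction, so averaging cannot fix it.
biased = simulate_guesses(TRUE_COUNT, n=1000, noise_sd=200, shared_bias=300)

independent_crowd_error = crowd_error(independent, TRUE_COUNT)
independent_typical_error = typical_individual_error(independent, TRUE_COUNT)
biased_crowd_error = crowd_error(biased, TRUE_COUNT)
```

With independent errors the average lands much closer to the truth than a typical individual does. With a shared bias the average stays off by roughly the size of the bias, however many people vote.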