Friday, September 08, 2006

Digg struggles with spam

There have been some interesting posts lately on how Digg is struggling with manipulation and spam.

First, Philipp Lenssen writes about "Digg vs. Groupthink" and the problems of "favoring a submission based on what ... friends favored."

Second, Alex Bosworth posts about "The Prisoner's Dilemma in Digg Story Promotion".

Third, Nick Carr focuses on an "Undiggnified" conflict in the Digg community over new rules intended to reduce manipulation and spam.

These problems with Digg were predictable. Getting to the top of Digg now guarantees a flood of traffic to the featured link. With that kind of reward on the table, people will fight to win placement by any means necessary.

It was not always this way. When Digg was just used by a small group of early adopters, there was little incentive to mess with the system. The gains from bad behavior were low, so everyone played nice.

Now that Digg is starting to attract a large mainstream audience, Digg will be fighting a long and probably losing battle against attempts to manipulate the system for personal gain.

See also my Jan 2006 post, "Digg, spam, and most popular lists".

Update: Pete Abilla also has a good post, "Digg as a game".

Update: Matt Marshall reports that a website called Spike the Vote "lets its members conspire to submit certain URLs of stories -- thereby lifting the odds those stories will get [Digg] front-page coverage."

Update: Niall Kennedy reports that there were "lots of stories about how sites can tap into the Digg's huge audience" at the PubCon Conference. Niall goes on to say that:

Some marketers create a story aimed at the Digg audience ... and with the appropriate submitters and human or bot-powered voting rise towards the top. A few search engine marketing consultants are promoting their account status and influence on Digg to clients.

There is a lot of activity in the social networking and user generated content space from marketers and spammers.

Update: Niall Kennedy again writes about spam on Digg with a specific example of how it is done:

Last weekend I noticed a Digg submission about weight loss tips had climbed the site's front page, earning a covetous position in the top 5 technology stories of the moment.

The webmaster of had inserted some Digg bait, seeded a few social bookmarking services, and waited for links and page views to roll in, creating a new node in a spam farm fueled by high-paying affiliate programs and identity collection for resale.

This is going to be a serious problem for Digg. Because everyone sees the same top stories on Digg -- because it is a simple most popular list -- the incentive to spam it is very high. Digg will be fighting a long and probably losing battle against spammers.

For more on that, see my previous posts, "Combating web spam with personalization" and "Web spam, AIRWeb, and SIGIR".


Anonymous said...

I wonder if Digg's problems are deeper than just users trying to game the system. It seems to me that in any sufficiently large system, you will have natural power law / Zipfian properties that start to emerge. In other words, there is, naturally, going to be a fat head of extremely active, extremely high profile, and thus extremely influential diggers, and a long tail of people who only vote once a month, or who don't vote at all and just read the page.

Isn't it entirely possible that, just because of the way statistics work, a post hoc Digg "editorial board" was bound to emerge anyway? This has nothing to do with people gaming the system for personal profit or whatever. There is no maliciousness in this. But it is bound to happen.
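
As a rough illustration of that fat head, here is a minimal sketch. It assumes voting activity follows an idealized Zipf distribution with exponent 1 across 10,000 users; both numbers are made up for illustration and are not measurements of Digg, and the function names are hypothetical.

```python
# Sketch only: an idealized Zipf model of voting activity.
# The user count and exponent are illustrative assumptions, not Digg data.

def zipf_vote_shares(num_users: int, exponent: float = 1.0) -> list[float]:
    """Return each user's share of all votes, most active user first.

    The user at rank r is assumed to vote in proportion to 1 / r**exponent.
    """
    weights = [1.0 / (rank ** exponent) for rank in range(1, num_users + 1)]
    total = sum(weights)
    return [weight / total for weight in weights]


def head_share(shares: list[float], fraction: float) -> float:
    """Return the combined vote share of the most active `fraction` of users."""
    head_size = max(1, int(len(shares) * fraction))
    return sum(shares[:head_size])


shares = zipf_vote_shares(10_000)
top_1_percent = head_share(shares, 0.01)
print(f"Top 1% of users cast {top_1_percent:.0%} of all votes")
```

Under these assumptions, the most active 1% of users cast a bit over half of all votes, with nobody gaming anything.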

So I just have to ask myself: Given that such systemic structure is inevitable, given the fact that a post hoc editorial board will arise no matter what, what are the advantages of Digg over, say, Slashdot? With Slashdot, you still have thousands of users submitting stories. It's just that a "pre" hoc editorial board chooses which ones make the first page.

But the net effect, whether the editorial board is post hoc or pre hoc, is the same. Any site, even Digg, is going to have a small-ruling-class-influenced editorial bias. If you like that bias, you'll keep reading. If not, then not.

But all this rhetoric about the glorious democratizing nature of Web 2.0... I think that really needs to be examined scientifically. I think Digg's problems are entirely a natural phenomenon.

John K said...

This pattern happens over and over on the internet as Clay Shirky's most important essay predicts:

I think Findory could help break it by allowing user-defined preference groups of up to, say, 20 people.

That way I'd get articles from a group like the original digg size - people who send me stuff anyways.

Greg Linden said...

Great point, Jeremy. Wikipedia seems to suffer from this as well.

Anonymous said...

I heard an interesting IT Conversations podcast recently, about the Dunbar number. This might have some bearing on the size of social groups that a system like Digg can potentially handle.

I think this is in line with what John K (above) is talking about.

Anonymous said...

@John: that was a fantastic essay that you linked to, and certainly very relevant to what we're seeing with Digg and other sites right now even though it was written more than three years ago. Thanks for posting it here!

Anonymous said...

Frankly, from an ex-digger's point of view, Digg is solely for people who:
- love macs
- love linux
- hate MS

I used to be quite a digger, but once you take the time to realise how "oriented" it is, you reach the point of seeing that it's not a fair info provider :-)

And do not try to get a story onto the front page: that just does not work, unless you have (many) (powerful) friends on Digg!