Wednesday, December 06, 2006

Spam is ruining Digg

Elinor Mills at CNet writes about "The big Digg rig", saying:
Dubious Internet marketers are planting stories, paying people to promote items, and otherwise trying to manipulate rankings on Digg and other so-called social-media sites like Reddit and Delicious.

Some marketers offer "content generation services," where they sell stories to Web sites for the sole purpose of getting them submitted to Digg and other sites.

Companies charge as much as $15,000 to get content up on Digg, said [ACS CTO] Neil Patel ... If a story becomes popular on Digg and generates links back to a marketer's Web site, that site may rise in search engine results and will not have to spend money on search advertising, he said.

Another way to get Web links to a suspicious site is to get inside help from users at a social-media site. For instance, spammers have tried to infiltrate Digg to build up reputations and promote stories for marketers, experts say.

Other scammers are trying other ways to buy votes. A site dubbed "User/Submitter," purports to pay people 50 cents for digging three stories and charges $20 for each story submitted to the site, plus $1 for every vote it gets. The Spike the Vote Web site boasts that it is a "bulletproof way to cheat Digg" and offers a point system for Digg users to submit and dig stories. And Friendly Vote bills itself as an "online resource for Web masters" to improve their marketing on sites like Digg and Delicious.
See also Niall Kennedy's recent post, "The spam farms of the social web".

See also my Sept 2006 post, "Digg struggles with spam", where I said:
These problems with Digg were predictable. Getting to the top of Digg now guarantees a flood of traffic to the featured link. With that kind of reward on the table, people will fight to win placement by any means necessary.

It was not always this way. When Digg was just used by a small group of early adopters, there was little incentive to mess with the system. The gains from bad behavior were low, so everyone played nice.

Now that Digg is starting to attract a large mainstream audience, Digg will be fighting a long and probably losing battle against attempts to manipulate the system for personal gain.
See also my July 2006 post, "Combating web spam with personalization".

See also my March 2006 post, "Growth, crap, and spam", where I said:
There seems to be a repeating pattern with Web 2.0 sites. They start with great buzz and joy from an enthusiastic group of early adopters, then fill with crud and crap as they attract a wider, less idealistic, more mainstream audience.
See also my Jan 2006 post, "Digg, spam, and most popular lists".

[CNet article found via Matt McAlister]


Anonymous said...

Dude, you need to go a little easy on the "See also my previous posts" routine. You sound a little like Donna Bogatin with her content-free self-referential posts on ZDNet. Though, unlike her, you do have meaningful things to say most of the time.

Anonymous said...

The "see also my previous posts" is one of my favorite parts of this blog. What it says to me is that Greg isn't just knee-jerk reacting to the blogosphere. It says that he has got a larger context in which he is discussing many of these issues. And, if we are interested, we can follow that context.

I wish the regular news were so considerate. I wish, when you tuned in to CBS or CNN or even CNET, they tried to provide a bit more context and perspective instead of just dumping stories at the viewers.

Greg Linden said...

Sorry about that, Anonymous. An alternative is to copy-and-paste (or slightly rephrase) things I wrote in previous posts, but I thought it was both more honest and more useful to do excerpts.

Jeremy's comment seems to indicate that at least some prefer the references to previous posts over pulling content from previous posts without the reference, but I'll think about whether there is a way to do it that doesn't look so repetitive. Thanks, Anonymous.

Anonymous said...

That's cool, Greg. Sorry if my Dude comment came across as rude or ad hominem. I'm glad that Findory and this blog are around. Didn't want you to adopt Bogatin style, 'at's all.

Elias said...

I think you are way too negative. There will always be some form of spam, blogspam, ... but overall it's much too useful to surrender because of some spam, just like email.

It would be interesting to know what kind of companies are successfully spamming Digg. Do you have some examples? (At least on the front pages for the different categories I've never seen any.)

You might have linked to this in one of your early posts... anyway, here is a comment arguing why spam doesn't really work on Digg.

Anonymous said...

Hey Elias - Servus! - how're things in Japan with Masataka? :-)

I guess my question would be, either to you or to Greg, what does "spam" mean in this context, anyway? If the goal (as Greg says in his next post) is to help people find the information they need, and the stories that make it to the top of Digg are stories that people need, are they really spam?

In other words, if the story is good, does it matter how it got to the top? I am not arguing one way or the other; I'm just asking the question from the evaluation perspective. If the goal or the objective function is satisfying consumer demand, what is the role of algorithmic purity?

Actually, from a true wisdom-of-crowds standpoint, a truly algorithmically pure algorithm would not let any user see any other user's diggs until the voting period was done. Independence among actors is one of the key, crucial aspects of wisdom-of-crowds. And so it is already clear that Digg isn't completely algorithmically pure; it already does not correct for non-independence biases. So how much purity does it take to be pure?

About a year ago, at an open-forum panel discussion @ Yahoo, I asked Kevin Rose a related question: "How do you know if you are good?" He stammered for a while, then tried to explain to me the notion behind Cranfield-style ad hoc retrieval evaluation (i.e. standard search evaluation). My response was, "Yes, yes, I know that's how you do it for search. You take two systems, and the system with more relevant docs at the top of the list is the better one. But how do you do it for Digg? When the modus operandi is people voting for stories, and displaying those stories with the most votes, how do you know if that story is good or relevant or satisfying of user needs? How do you know if Digg is good?"

He looked at me like I was a bit crazy, like I was silly to even ask a question like that. It was obvious, he felt, that Digg is good, because it shows stories that get lots of diggs. But to me that sounded like circular reasoning. It sounded like tyranny of the majority, with no consideration for the larger number of people in the splintered, long tail minorities.

So it seems to me that before we can figure out whether spam is killing Digg, we have to be able to distinguish between (1) when spam actually gives people what they want, and (2) when spamless Digg doesn't. In other words, is Digg good?

Greg Linden said...

That's a good point, Jeremy. It is hard to know how "good" Digg is without comparing it side-by-side for clicks or user satisfaction with some other content.

Kevin might argue that, as long as people are using the site, it is useful. That probably is true, but it doesn't help determine how much more useful it could be with changes.

My post is more trying to argue that spam is getting worse and worse on Digg because of the profit motive for spammers. I am predicting (speculating?) that this will eventually overwhelm Digg, make it useless, and cause most people to abandon the site.

Anonymous said...

well, i stopped visiting digg half a year ago, exactly because of the spam. and it was such a nice site...