Tuesday, October 21, 2008

Attacking tagging systems

A paper at WebKDD 2008, "Exploring the Impact of Profile Injection Attacks in Social Tagging Systems" (PDF), by Maryam Ramezani, JJ Sandvig, Runa Bhaumik, and Bamshad Mobasher claims that social tagging systems like del.icio.us are easy to attack and manipulate.

The paper looks at two types of attacks on social tagging systems, one that attempts to make a document show up on tag searches where it otherwise would not show up, another that attempts to promote a document by associating it with other documents.

Some excerpts:
The goal of our research is to answer questions such as ... How many malicious users can a tagging system tolerate before results significantly degrade? How much effort and knowledge is needed by an attacker?

We describe two attack types in detail and study their impact on the system .... The goal of an overload attack, as the name implies, is to overload a tag context with a target resource so that the system correlates the tag and the resource highly ... thereby increasing traffic to the target resource ... The goal of a piggyback attack is for a target resource to ride the success of another resource ... such that they appear similar.

Our results show that tagging systems are quite vulnerable to attack ... A goal-oriented attack which targets a specific user group can easily be injected into the system ... Low frequency URLs are vulnerable to piggyback attack as well as popular and focused overload attacks. High frequency URLs ... are [still] vulnerable to overload attacks.
The paper goes on to describe a few methods of attacking social tagging systems that require creating remarkably few fake accounts, as few as 0.03% of the total accounts in the system.

Frankly, I have been surprised not to see more attacks on tagging systems. It may be the case that most of these sites lack a large, mainstream audience, so the profit motive is still not sufficiently high to motivate persistent attacks.

Please see also my earlier post, "Attacking recommender systems", that discusses another paper by some of the same authors.

1 comment:

Justin Mason said...

I understand Joshua Schachter deliberately blocked robots, including the Googlebot, from del.icio.us so that it wouldn't be a target for spammers. Doubtless there _are_ spammers there anyway, but I would imagine that has helped reduce its profile for them.