Sunday, January 07, 2024

Book excerpt: Use only trustworthy behavior data

(This is an excerpt from drafts of my book, "Algorithms and Misinformation: Why Wisdom of the Crowds Failed the Internet and How to Fix It")

Adversaries manipulate wisdom of crowds algorithms by controlling a crowd of accounts.

Their controlled accounts can then coordinate to shill whatever they like, shout down opposing views, and create an overwhelming flood of propaganda that makes it hard for real people to find real information in the sea of noise.

The Aspen Institute Commission, in a report titled Commission on Information Disorder, suggests the problem is often confined to a surprisingly small number of accounts, amplified by coordinated activity from other controlled accounts.

They describe how it works: “Research reveals that a small number of people and/or organizations are responsible for a vast proportion of misinformation (aka ‘superspreaders’) ... deploying bots to promote their content ... Some of the most virulent propagators of falsehood are those with the highest profile [who are often] held to a lower standard of accountability than others ... Many of these merchants of doubt care less about whether they lie, than whether they successfully persuade, either with twisted facts or outright lies.”

The authors of this report offer a solution. They suggest that these manipulative accounts should not be amplified by algorithms, making the spreading of misinformation much more costly and much more difficult to do efficiently.

Specifically, they argue social media companies and government regulators should “hold superspreaders of mis- and disinformation to account with clear, transparent, and consistently applied policies that enable quicker, more decisive actions and penalties, commensurate with their impacts — regardless of location, or political views, or role in society.”

Because just a few accounts, supported by substantial networks of controlled shill accounts, are the problem, they add that social media should focus “on highly visible accounts that repeatedly spread harmful misinformation that can lead to significant harms.”

Problems with adversaries manipulating, shilling, and spamming have a long history. One way to figure out how to solve the problem is to look at how others mitigated these issues in the past.

Particularly helpful are the solutions for web spam. As described in the research paper "Web Spam Detection with Anti-Trust Rank", web spam is “artificially making a webpage appear in the top results to various queries on a search engine.” The web spam problem is essentially the same problem faced by social media rankers and recommenders. Spammers manipulate the data that ranking and recommender algorithms use to determine what content to surface and amplify.

The researchers described how bad actors create web spam: “A very common example ... [is] creating link farms, where webpages mutually reinforce each other ... [This] link spamming also includes ... putting links from accessible pages to the spam page, such as posting web links on publicly accessible blogs.”

These are essentially the same techniques used by adversaries on social media; adversaries use controlled accounts and bots to post, reshare, and like content, reinforcing how popular it appears.

To fix misinformation on social media, learn from what has worked elsewhere. TrustRank is a popular and widely used technique in web search engines to reduce the efficiency, effectiveness, and prevalence of web spam. It “effectively removes most of the spam” without negatively impacting non-spam content.

How does it work? “By exploiting the intuition that good pages -- i.e. those of high quality -- are very unlikely to point to spam pages or pages of low quality.”

The idea behind TrustRank is to start from the trustworthy and treat the actions of those trustworthy people as also likely to be trustworthy. Trusted accounts link to, like, share, and post information that is trustworthy. Everything they endorse is now mostly trustworthy too, and the process repeats. In this way, trust gradually propagates out from a seed of known reliable accounts to others.

As the "Combating Web Spam with TrustRank" researchers put it, “We first select a small set of seed pages to be evaluated by an expert. Once we manually identify the reputable seed pages, we use the link structure of the web to discover other pages that are likely to be good ... The algorithm identifies other pages that are likely to be good based on their connectivity with the good seed pages.”

TrustRank works for web spam in web search engines. “We can effectively filter out spam from a significant fraction of the web, based on a good seed set of less than 200 sites.” Later work suggested that adding Anti-Trust Rank has some benefits as well. Anti-Trust Rank takes a set of known untrustworthy people who have a history of spamming, shilling, and attempting to manipulate the ranking algorithms, then assumes that everything they have touched is also likely to be untrustworthy.
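To make the propagation idea concrete, here is a minimal sketch in Python. It is not the implementation from the TrustRank paper, just a seed-biased PageRank-style iteration written under some assumptions: the function name `trust_rank`, the `damping` and `iterations` parameters, and the toy graph are all invented for illustration.

```python
def trust_rank(links, seeds, damping=0.85, iterations=20):
    """Propagate trust outward from manually vetted seed nodes.

    links: dict mapping each node to the list of nodes it links to
           (on social media, this could be likes or shares instead).
    seeds: set of nodes an expert has judged trustworthy.
    """
    nodes = set(links) | {dst for targets in links.values() for dst in targets}
    # All initial trust sits on the seeds, split evenly among them.
    seed_score = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    trust = dict(seed_score)
    for _ in range(iterations):
        # Each round, seeds get a fresh share of trust...
        nxt = {n: (1.0 - damping) * seed_score[n] for n in nodes}
        # ...and every node passes most of its trust along its outgoing links.
        for src, targets in links.items():
            if not targets:
                continue
            share = damping * trust[src] / len(targets)
            for dst in targets:
                nxt[dst] += share
        trust = nxt
    return trust


# Hypothetical toy graph: one vetted seed, two ordinary pages, and a link farm.
links = {
    "seed": ["alice", "bob"],
    "alice": ["bob"],
    "bob": [],
    "spam1": ["spam2", "spamtarget"],
    "spam2": ["spam1", "spamtarget"],
    "spamtarget": [],
}
trust = trust_rank(links, seeds={"seed"})
# The link farm only links to itself, so no trust ever reaches it.
print(trust["spam1"], trust["spamtarget"])  # 0.0 0.0
```

In this sketch the pages in the link farm reinforce each other as much as they like, but because no trusted page points to them, their score stays at zero. An Anti-Trust Rank variant could plausibly run the same kind of iteration starting from known bad seeds instead.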

In social media, much of the problem is not that bad content exists at all, but that bad content is amplified by algorithms. Specifically, rankers and recommenders on social media look at likes, shares, and posts, then think that shilled content is popular, so the algorithms share the shilled content with others.

The way this works, both for web search and for social media, is that wisdom of the crowd algorithms including rankers and recommenders count votes. A link, like, click, purchase, rating, or share is a vote that a piece of content is useful, interesting, or good. What is popular or trending is what gets the most votes.

Counting votes in this way can easily be manipulated by people who create or use many controlled accounts. Bad actors vote many times, effectively stuffing the ballot box, to get what they want on top.
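A toy example shows how little it takes. The sketch below is only an illustration: `count_votes`, the account names, and the items are all made up, and real rankers are far more elaborate than a simple tally.

```python
from collections import Counter


def count_votes(votes):
    """Tally votes, allowing one vote per account per item.

    votes: iterable of (account, item) pairs.
    """
    return Counter(item for _, item in set(votes))


# Hypothetical organic activity from four real people.
organic = [
    ("ann", "local news"),
    ("ben", "local news"),
    ("cat", "local news"),
    ("dev", "recipe"),
]
# The same activity plus fifty controlled accounts all voting for one item.
stuffed = organic + [(f"sock{i}", "propaganda") for i in range(50)]

print(count_votes(organic).most_common(1))  # [('local news', 3)]
print(count_votes(stuffed).most_common(1))  # [('propaganda', 50)]
```

Limiting each account to one vote does nothing here, because the adversary simply brings more accounts.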

If wisdom of crowds only uses trustworthy data from trustworthy accounts, shilling, spamming, and manipulation become much more difficult.

Only accounts known to be trustworthy should matter for what is considered popular. Known untrustworthy accounts with a history of being involved in propaganda and shilling should have their content hidden or ignored. And unknown accounts, such as brand new accounts or accounts that have no connection to trustworthy accounts, also should be ignored as potentially harmful and not worth the risk of including.
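As a rough sketch of what that rule could look like in code, the tally below counts a vote only when the account is on a list of trusted accounts. The name `trusted_popularity`, the accounts, and the items are hypothetical, and how the trusted set gets built is assumed to happen elsewhere.

```python
from collections import Counter


def trusted_popularity(votes, trusted):
    """Tally votes only from accounts known to be trustworthy.

    Known untrustworthy accounts and unknown accounts are treated
    the same way: their votes are simply ignored.
    """
    return Counter(item for account, item in set(votes) if account in trusted)


# Hypothetical data: three trusted accounts, one brand new account,
# and fifty controlled accounts pushing the same item.
trusted = {"ann", "ben", "cat"}
votes = [
    ("ann", "local news"),
    ("ben", "local news"),
    ("cat", "recipe"),
    ("newbie", "recipe"),
] + [(f"sock{i}", "propaganda") for i in range(50)]

popular = trusted_popularity(votes, trusted)
print(popular.most_common(1))  # [('local news', 2)]
print(popular["propaganda"])   # 0
```

The new account's vote is lost too, which is the deliberate trade-off: some legitimate data is thrown away so that the fifty controlled accounts count for nothing.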

Wisdom of the trustworthy dramatically raises the costs for adversaries. No longer can a few dozen accounts, acting together, successfully shill content.

Now, only trustworthy accounts amplify. And because trust is hard to gain and easily lost, disinformation campaigns, propaganda, shilling, and spamming often become cost prohibitive for adversaries.

As Harvard fellow and security expert Bruce Schneier wrote in a piece for Foreign Policy titled “8 Ways to Stay Ahead of Influence Operations,” the problem is recognizing these fake accounts that are all acting together in a coordinated way to manipulate the algorithms and not using their data to inform ranker and recommender algorithms.

Schneier wrote, “Social media companies need to detect and delete accounts belonging to propagandists as well as bots and groups run by those propagandists. Troll farms exhibit particular behaviors that the platforms need to be able to recognize.”

Shills and trolls are shilling and trolling. That is not normal human behavior.

Real humans don’t all act together, at the same time, to like and share some new content. Real humans cannot act many times per second or vote on content they have never seen. Real humans cannot all like and share content from a pundit as soon as it appears and then all do it again exactly in the same way for the next piece of content from that pundit.

When bad actors use controlled fake accounts to stuff the ballot box, the behavior is blatantly not normal.

There are a lot of accounts in social media today that are being used to manipulate the wisdom of the crowd algorithms. Their clicks, likes, and shares are bogus and should not be used by the algorithms.

Researchers in Finland studying the phenomenon back in 2021 wrote that “5-10% of Twitter accounts are bots and responsible for the generation of 20-25% of all tweets.” The researchers describe these compromised accounts as “cyborgs” and write that they “have characteristics of both human-generated and bot-generated accounts.”

These controlled accounts are unusually active, producing a far larger percentage of all tweets than the percentage of accounts they represent. This was also a low estimate of the total number of manipulated accounts in social media, as it did not include compromised accounts, accounts that are paid to shill, or accounts paid to disclose their password so they can sometimes be used by someone else to shill.

Because bad actors using accounts to spam and shill must act quickly and in concert, and often do so repeatedly with the same accounts, their behavior is not normal. Their unusually active and unusually timed actions can be detected.
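One simple way to look for that kind of timing is sketched below. This is not a method from any of the research cited here, only an illustration under stated assumptions: the function name `coordinated_pairs`, the `window_seconds` and `min_items` thresholds, and the event data are all hypothetical, and a production system would need something far more scalable than comparing every pair of actions on an item.

```python
from collections import defaultdict
from itertools import combinations


def coordinated_pairs(events, window_seconds=5, min_items=3):
    """Find pairs of accounts that repeatedly act at almost the same time.

    events: iterable of (account, item, timestamp_in_seconds).
    Returns the set of (account, account) pairs that acted on at least
    min_items distinct items within window_seconds of each other.
    """
    by_item = defaultdict(list)
    for account, item, ts in events:
        by_item[item].append((ts, account))

    together = defaultdict(set)  # account pair -> items they acted on together
    for item, actions in by_item.items():
        actions.sort()
        for (t1, a1), (t2, a2) in combinations(actions, 2):
            if a1 != a2 and t2 - t1 <= window_seconds:
                together[tuple(sorted((a1, a2)))].add(item)

    return {pair for pair, items in together.items() if len(items) >= min_items}


# Hypothetical events: b1 and b2 like every new post within seconds of
# each other, while h1 and h2 act independently.
events = [
    ("b1", "p1", 0), ("b2", "p1", 1), ("h1", "p1", 2),
    ("b1", "p2", 100), ("b2", "p2", 101), ("h2", "p2", 900),
    ("b1", "p3", 200), ("b2", "p3", 202), ("h1", "p3", 5000),
]
print(coordinated_pairs(events))  # {('b1', 'b2')}
```

A real person will sometimes land in the same few seconds as a bot by chance, as h1 does once here, which is why the sketch requires the pattern to repeat across several items before flagging a pair.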

One detection tool presented by researchers at the Association for the Advancement of Artificial Intelligence (AAAI) conference was a “classifier ... capturing the local and global variations of observed characteristics along the propagation path ... The proposed model detected fake news within 5 min of its spread with 92 percent accuracy for Weibo and 85 percent accuracy for Twitter.”

Professor Kate Starbird, who runs a research group studying disinformation at University of Washington, wrote how social media companies have taken exactly the wrong approach, exempting prominent accounts associated with misinformation, disinformation, and propaganda rather than subjecting them and their shills to skepticism and scrutiny. Starbird wrote, “Research shows that a small number of accounts have outsized impact on the spread of harmful misinfo (e.g. around vaccines and false/misleading claims of voter fraud). Instead of whitelisting these prominent accounts, they should be held to higher levels of scrutiny and accountability.”

Researchers have explained the problem: platforms are willing to amplify anything that isn’t provably bad rather than amplifying only what is known to be trustworthy. In a piece titled Computational Propaganda, Stanford Internet Observatory researcher Renee DiResta wrote, “Our commitment to free speech has rendered us hesitant to take down disinformation and propaganda until it is conclusively and concretely identified as such beyond a reasonable doubt. That hesitation gives ... propagandists an opportunity.”

The hesitation is problematic, as it makes it easy to manipulate wisdom of crowds algorithms. “Incentive structures, design decisions, and technology have delivered a manipulatable system that is being gamed by propagandists,” DiResta said. “Social algorithms are designed to amplify what people are talking about, and popularity is ... easy to feign.”

Rather than starting from the assumption that every account is real, the algorithms should start with the assumption that every account is fake.

Only provably trustworthy accounts should be used by wisdom of the crowd algorithms such as trending, rankers, and recommenders. When considering what is popular, not only should fake accounts coordinating to shill be ignored, but also there should be considerable skepticism toward new accounts that have not been proven to be independent of the others.

With wisdom of crowds algorithms, rather than think of which accounts should be banned and not used, consider the minimum number of trustworthy accounts needed to not lower the perceived quality of the recommendations. There is no reason to use all the data when the biggest problem is shilled and untrustworthy data.

Companies are playing whack-a-mole with bad actors who just create new accounts or find new shills every time they’re whacked because it’s so profitable -- like free advertising -- to create fake crowds that manipulate the algorithms.

Propagandists and scammers are loving it and winning. It’s easy and lucrative for them.

Rather than classify accounts as spam, classify accounts as trustworthy. Only use trustworthy data as input to the algorithms, ignoring anything unknown or borderline as well as known spammers and shills.

Happily toss data, anything suspicious at all. Do not worry about the many false positives that accidentally mark new or borderline accounts as shills when deciding what to feed into the recommender algorithms. None of that matters if it does not reduce the perceived quality of the recommendations.

As with web spam and e-mail spam, the goal isn’t eliminating manipulation, coordination, disinformation, scams, and propaganda.

The goal is raising the costs on adversaries, ideally to the point where most of it is no longer cost-effective. If bad actors no longer find it easy and effective to try to manipulate recommender systems on social media, most will stop.
