Tuesday, November 28, 2023

Book excerpt: The problem is bad incentives

(This is an excerpt from drafts of my book, "Algorithms and Misinformation: Why Wisdom of the Crowds Failed the Internet and How to Fix It")

Incentives matter. “As long as your goal is creating more engagement,” said former Facebook data scientist Frances Haugen in a 60 Minutes interview, “you’re going to continue prioritizing polarizing, hateful content.”

Teams inside the tech companies determine how the algorithms are optimized and what the algorithms amplify. People on those teams optimize the algorithms for whatever goals they are given. The metrics and incentives those teams have determine how wisdom of the crowd algorithms are optimized over time.

What the company decides is important, and what it rewards, determines how the algorithms are tuned. Metrics determine what wins A/B tests. Metrics decide what changes get launched to customers. Metrics determine who gets promoted inside these companies. When a company creates bad incentives by picking bad metrics, the algorithms will produce bad results.

What Facebook’s leadership prioritizes and rewards determines what people see on Facebook. “Facebook’s algorithm isn’t a runaway train,” Haugen said. “The company may not directly control what any given user posts, but by choosing which types of posts will be seen, it sculpts the information landscape according to its business priorities.” What the executives prioritize in what they measure and reward determines what types of posts people see on Facebook. You get what you measure.

“Mark has never set out to make a hateful platform. But he has allowed choices to be made where the side effects of those choices are that hateful, polarizing content gets more distribution and more reach,” Haugen said. Disinformation, misinformation, and scams on social media are “the consequences of how Facebook is picking out that content today.” The algorithms are “optimizing for content that gets engagement, or reaction.”

Who gets that quarterly bonus? It’s hard to have a long-term focus when the company offers large quarterly bonuses for hitting short-term engagement targets. In No Rules Rules, Netflix co-founder and CEO Reed Hastings wrote, “We learned that bonuses are bad for business.” He went on to say that executives are terrible at setting the right metrics for the bonuses and, even if they do, “the risk is that employees will focus on a target instead of spot what’s best for the company.”

Hastings said that “big salaries, not merit bonuses, are good for innovation” and that Netflix does not use “pay-per-performance bonuses.” Though “many imagine you lose your competitive edge if you don’t offer a bonus,” he said, “We have found the contrary: we gain a competitive edge in attracting the best because we just pay all that money in salary.”

With considerable effort, Google, Netflix, and Spotify have shown through long-running experiments that optimizing for short-term metrics such as engagement or revenue hurts the company in the long run. For example, in a paper titled “Focus on the Long-term: It’s Better for Users and Business”, Google showed that optimizing for weekly ad revenue would put far more ads in the product than would maximize Google’s long-term ad revenue. Short-term metrics miss the most important goals for a company: growth, retention, and long-term profitability.

Short-term metrics and incentives overoptimize for immediate gains and ignore long-term costs. While companies and executives should have enough reasons to avoid bad incentives and metrics that hurt the company in the long-term, it is also true that regulators and governments could step in to encourage the right behaviors. As Foreign Policy wrote when talking about democracies protecting themselves from adversarial state actors, regulators could encourage social media companies to think beyond the next quarterly earnings report.

Regulators have struggled to understand how to help. Could they directly regulate algorithms? Attempts to do so have immediately hit the difficulty of crafting useful regulations for machine learning algorithms. But the problem is not the algorithm. The problem is people.

Companies want to make money. Many scammers and other bad actors also want to make money. The money is in the advertising.

Fortunately, the online ad marketplace already has a history of being regulated in many countries. Regulators in many countries already maintain bans on certain types of ads, restrictions on some ads, and financial reporting requirements for advertising. Go after the money and you change the incentives.

Among those suggesting increasing regulation on social media advertising is the Aspen Institute Commission on Information Disorder. In their report, they suggest countries “require social media companies to regularly disclose ... information about every digital ad and paid post that runs on their platforms [and then] create a legal requirement for all social media platforms to regularly publish the content, source accounts, reach and impression data for posts that they organically deliver to large audiences.”

This would provide transparency to investors, the press, government regulators, and the public, allowing problems to be seen far earlier and giving companies a much stronger incentive to prevent problems before they are disclosed.

The Commission on Information Disorder goes further, suggesting that, in the United States, the extension of Section 230 protections to advertising and algorithms that promote content is overly broad. They argue any content that is featured, either by paid placement advertising or by recommendation algorithms, should be more heavily scrutinized: “First, withdraw platform immunity for content that is promoted through paid advertising and post promotion. Second, remove immunity as it relates to the implementation of product features, recommendation engines, and design.”

Their report was authored by some of the world experts on misinformation and disinformation. They say that “tech platforms should have the same liability for ad content as television networks or newspapers, which would require them to take appropriate steps to ensure that they meet the established standards for paid advertising in other industries.” They also say that “the output of recommendation algorithms” should not be considered user speech, which would enforce a “higher standard of care” when the company’s algorithms get shilled and amplify content “beyond organic reach.”

These changes would provide strong incentives for companies to prevent misinformation and propaganda in their products. The limitations on advertising would reduce the effectiveness of using advertising in disinformation campaigns. It also would reduce the effectiveness of spammers who opportunistically pile on disinformation campaigns, cutting into their efficiency and profitability. Raising the costs and reducing the efficiency of shilling will reduce the amount of misinformation on the platform.

Subject internet companies to the same regulations on advertising that television networks and newspapers have. Regulators are already familiar with following the money, and even faster enforcement and larger penalties for existing laws would help. Changing where the revenue comes from may encourage better incentives and metrics within tech companies.

“Metrics can exert a kind of tyranny,” former Amazon VP Neil Roseman said in our interview. Often teams “don’t know how to measure a good customer experience.” And different teams may have “metrics that work against each other at times” because simpler and short-term metrics often “narrow executive focus to measurable input/outputs of single systems.” A big problem is that “retention (and long-term value) are long-term goals which, while acknowledged, are just harder for people to respond to than short-term.”

Good incentives and metrics focus on the long-term. Short-term incentives and metrics can create a harmful feedback loop as algorithms are optimized over time. Good incentives and metrics focus on what is important to the business: long-term retention and growth.

Monday, November 27, 2023

Tim O'Reilly on algorithmic tuning for exploitation

Tim O'Reilly, Mariana Mazzucato, and Ilan Strauss have three working papers focusing on Amazon's ability to extract unusual profits from its customers. The core idea in all three is that Amazon has become the default place to shop online for many. So, when Amazon changes its site in ways that increase Amazon's profits but hurt consumers, it takes work for people to figure that out and shop elsewhere.

The papers criticize the common assumption that people will quickly switch to shopping elsewhere if the Amazon customer experience deteriorates. Realistically, people are busy. People have imperfect information and limited time, and it takes effort to find another place to shop. At least up to some limit, people may tolerate a familiar but substantially deteriorated experience for some time.

For search, it takes effort for people to notice that they are being shown lots of ads, that less reliable third party sellers are promoted over less profitable but more relevant options, and that the most useful options aren't always first. And then it takes yet more effort to switch to using other online stores. So Amazon is able to extract extraordinary profits in ways less dominant online retailers can't get away with.

But I do have questions about how far Amazon can push this. How long can Amazon get away with excessive advertising and lower quality? Do consumers tire of it over time and move on? Or do they put up with it forever as long as the pain is below some threshold?

Take an absurd extreme. Imagine that Amazon thought it could maximize its revenue and profits by showing only ads and only the most profitable ads for any search regardless of the relevance of those ads to the search. Clearly, that extreme would not work. The search would be completely useless and consumers would go elsewhere very rapidly.

Now back off from that extreme, adding back more relevant ads and more organic results. At what point do consumers stay at Amazon? And do they just stay at Amazon or do they slowly trickle away?

I agree time and cognitive effort, as well as Amazon Prime renewing annually, raise switching costs. But when will consumers have had enough? Do consumers only continue using Amazon with all the ads until they realize the quality has changed? When does brand and reputation damage accumulate to the point that consumers start trusting Amazon less, shopping at Amazon less, and expending the effort of trying alternatives?

I think one model of customer attrition is that every time customers notice a bad experience, they have some probability of using Amazon less in the future. The more bad experiences they have, the faster the damage to long-term revenue. Under this model, even the level of ads Amazon has now is causing slow damage to Amazon. Amazon execs may not notice because the damage accrues over long periods of time and is hard to attribute directly back to poor quality search results, but the damage is there. This is the model I've seen used by some others, such as Google Research in their "Focus on the Long-term" paper.

Another model might be that consumers are captured by dominant companies such as Amazon and will not pay the costs to switch until they hit some threshold. That is, most customers will refuse to try alternatives until it is completely obvious that it is worth the effort. This assumes that Amazon can exploit customers for a very long time, and that customers will not stop using Amazon no matter what Amazon does. There is some extreme where that breaks, but only at the threshold, not before.
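To make the difference concrete, here is a toy sketch of the two models in Python. The parameters (the chance a customer leaves after noticing a bad experience, and the tolerance threshold) are made up for illustration, not estimates of Amazon's actual numbers.

```python
# Two toy models of customer attrition. All parameters are hypothetical.

def gradual_attrition(bad_experiences, p_leave=0.002):
    """Model 1: every noticed bad experience carries a small chance the customer leaves.
    Returns the fraction of customers still shopping after that many bad experiences."""
    return (1 - p_leave) ** bad_experiences

def threshold_attrition(bad_experiences, threshold=500):
    """Model 2: customers tolerate everything until a cumulative threshold, then leave."""
    return 1.0 if bad_experiences < threshold else 0.0

for n in (50, 200, 400, 600):
    print(f"{n:>3} bad experiences: "
          f"gradual model retains {gradual_attrition(n):.0%}, "
          f"threshold model retains {threshold_attrition(n):.0%}")
```

Under the first model, the costs are continuous, so even today's level of ads is already slowly bleeding customers away. Under the second, the costs are invisible right up until the threshold is crossed.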

The difference between these two models matters a lot. If Amazon is experiencing substantial but slow costs from what they are doing right now, there's much more hope for them changing their behavior on their own than if Amazon is experiencing no costs from their bad behavior unless regulators impose costs externally. The solutions you get in the two scenarios are likely to be different.

I enjoyed the papers and found them thought-provoking. Give the papers a read, especially if you are interested in the recent discussions of enshittification started by Cory Doctorow. As Cory points out, this is a much broader problem than just Amazon. And we need practical solutions that companies, consumers, and policy makers can actually implement.

Sunday, November 26, 2023

Book excerpt: People determine what the algorithms do

(This is an excerpt from drafts of my book, "Algorithms and Misinformation: Why Wisdom of the Crowds Failed the Internet and How to Fix It")

The problem is people. These algorithms are built, tuned, and optimized by people. The incentives people have determine what these algorithms do.

If what wins A/B tests is what gets the most clicks, people will optimize the algorithms to get more clicks. If a company hands out bonuses and promotions when the algorithms get more clicks, people will tune the algorithms to get more clicks.

It doesn’t matter that what gets clicks and engagement may not be good for customers or the company in the long-term. Lies, scams, and disinformation can be very engaging. Fake crowds generate a lot of clicks. None of them are real or true, and none of them help customers or the business, but look at all those click, click, clicks.

Identifying the right problem is the first step toward finding the right solutions. The problem is not algorithms. The problem is how people optimize the algorithms. Lies, scams, and disinformation thrive if people optimize for the short-term. Problems like misinformation are a symptom of a system that invites these problems.

Instead, invest in the long-term. Invest in removing fake crowds and in a good customer experience that keeps people around. Like any investment, this means lower profits in the short-term for higher profits in the long-term. Companies maximize long-term profitability by making sure teams are optimizing for customer satisfaction and retention.

It’s not the algorithm, it’s people. People are in control. People tune the algorithm in ways that cause harm, usually unintentionally and sometimes because they have incentives to ignore the harm. The algorithm does what people tell it to.

To fix the harm the algorithms cause, look to the people who build the algorithms. Fixing the harm from wisdom of the crowd algorithms requires fixing why people allow those algorithms to cause harm.

Friday, November 17, 2023

Book excerpt: How companies build algorithms using experimentation

(This is an excerpt from drafts of my book, "Algorithms and Misinformation: Why Wisdom of the Crowds Failed the Internet and How to Fix It")

Wisdom of the crowd algorithms shape what people see on the internet. Constant online experimentation shapes what wisdom of the crowd algorithms do.

Wisdom of crowds is the idea that summarizing the opinions of many independent people is often useful. Many machine learning algorithms use wisdom of the crowds, including rankers, trending algorithms, and recommenders on social media.

It's important to realize that recommendation algorithms are not magic. They don't come up with good recommendations out of thin air. Instead, they just summarize what people found.

If summarizing what people found is all the algorithms do, why do they create harm? Why would algorithms amplify social media posts about scammy vitamin supplements? Why would algorithms show videos from white supremacists?

It is not how the algorithms are built, but how they are optimized. Companies change, twiddle, and optimize algorithms over long periods of time using online experiments called A/B tests. In A/B tests, some customers see version A of the website and some customers see version B.

Teams compare the two versions. Whichever version performs better, by whatever metrics the company chooses, is the version that later launches for all customers. This process repeats and repeats, slowly increasing the metrics.
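To make that concrete, here is a minimal sketch of the decision step. The numbers and metric names are invented purely for illustration; the point is that the metric the company chooses is what picks the winner.

```python
# Hypothetical A/B test results for two versions of a ranking algorithm.
variants = {
    "A": {"visits": 1_000_000, "clicks": 102_000, "returned_next_month": 412_000},
    "B": {"visits": 1_000_000, "clicks": 118_000, "returned_next_month": 396_000},
}

def winner(metric):
    """Launch whichever variant has the higher per-visit rate on the chosen metric."""
    return max(variants, key=lambda v: variants[v][metric] / variants[v]["visits"])

print(winner("clicks"))                # B launches if the goal is engagement
print(winner("returned_next_month"))   # A launches if the goal is retention
```

Same experiment, same data, different launch decision, depending entirely on which metric the company decided to reward.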

Internet companies run tens of thousands of these online experiments every year. The algorithms are constantly tested, changing, and improving, getting closer and closer to the target. But what if you have the wrong target? If the goal is wrong, what the algorithms do will be wrong.

Let’s say you are at Facebook working on the news feed algorithm. The news feed algorithm is what picks what posts people see when they come to Facebook. And let’s say you are told to optimize the news feed for what gets the most clicks, likes, and reshares. What do you do? You will start trying changes to the algorithm and A/B testing them. Does this change get more clicks? What about this one? Through trial-and-error, you will find whatever makes the news feed get more engagement.

It is this trial-and-error process of A/B testing that drives what the algorithms do. Whatever the goal is, whatever the target, teams of software engineers will work hard to twiddle the algorithms to hit those goals. If your goal is the wrong goal, your algorithms will slowly creep toward doing the wrong thing.

So what gets the most clicks? It turns out scams, hate, and lies get a lot of clicks. Misinformation tends to provoke a strong emotional reaction. When people get angry, they click. Click, click, click.

And if your optimization process is craving clicks, it will show more of whatever gets clicks. Optimizing algorithms for clicks is what causes algorithms to amplify misinformation on the internet.

To find practical solutions, it's important to understand how powerful tech companies build their algorithms. It's not what you would expect.

Algorithms aren't invented so much as evolved. These algorithms are optimized over long periods of time, changing slowly to maximize metrics. That means the algorithms can unintentionally start causing harm.

It's easy for social media to fill with astroturf

Most people underestimate how easy it is for social media to become dominated by astroturf. It's easy. All you need is a few people creating and controlling multiple accounts. Here's an example.

Let's say you have 100M real people using your social media site. Most post or comment infrequently, on average once every 10 days. That looks like real social media activity from real people: most people lurk, and a few people post a lot.

Now let's say 1% of people shill their own posts using about 10 accounts they control on average. These accounts also post and comment more frequently, once a day. Most of these use a few burner accounts to like, share, and comment on their own posts. Some use paid services and unleash hundreds of bots to shill for them.

In this simple example, about 50% of comments and posts you see on the social media site will be artificially amplified by fake crowds. Astroturfed posts and comments will be everywhere. This is because most people don't post often, and the shills are much more active.

Play with the numbers. You'll find that if most people don't post or comment -- and most real people don't -- it's easy for people who post a lot from multiple accounts they control to dominate conversations and feign popularity. It's like a megaphone for social media.
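Here is a tiny calculator for this example so you can play with the numbers yourself. It only restates the scenario above; everything about it is illustrative.

```python
# Back-of-the-envelope astroturf math for the example above.
real_people = 100_000_000
real_posts_per_day = real_people * (1 / 10)    # most real people post about once every 10 days

shills = real_people * 0.01                    # 1% of people shill their own posts
shill_accounts = shills * 10                   # each controls about 10 accounts on average
shill_posts_per_day = shill_accounts * 1       # shill accounts post about once a day

total_posts_per_day = real_posts_per_day + shill_posts_per_day
print(f"Astroturfed share of posts and comments: {shill_posts_per_day / total_posts_per_day:.0%}")
```

Halve the shills' activity and the astroturfed share still comes out to about a third.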

Also important is how hard it is for the business to fix astroturf once it has (often unintentionally) gone down this path. This example social media site has 100M people using it, but claims about 110M users. Real engagement is much smaller than it appears, with far fewer highly engaged accounts than the business pitches to advertisers. Once the problem has been allowed to grow, it's tempting for a company in this situation not to fix it.

Wednesday, November 15, 2023

Book excerpt: How some companies get it right

(This is an excerpt from drafts of my book, "Algorithms and Misinformation: Why Wisdom of the Crowds Failed the Internet and How to Fix It")

How do some companies fix their algorithms? In the last decade, wisdom of the crowds broke, corrupted by bad actors. But some companies found fixes that let them keep using wisdom of the crowds.

Why was Wikipedia resilient to spammers and shills when Facebook and Twitter were not? Diving into how Wikipedia works, this book shows that Wikipedia is not a freewheeling anarchy of wild edits by anyone, but a place where the most reliable and trusted editors have most of the power. A small percentage of dedicated Wikipedia editors have much more control over Wikipedia than the others; their vigilance is the key to keeping out scammers and propagandists.

It's well known that when Larry Page and Sergey Brin first created Google, they invented the PageRank algorithm. Widely considered a breakthrough at the time, PageRank used links between web pages as if they were votes for what was interesting and popular. PageRank says a web page is useful if it has a lot of other useful web pages pointing to it.

Less widely known is that PageRank quickly succumbed to spam. Spammers created millions of web pages with millions of links all pointing to each other, deceiving the PageRank algorithm. Because of spam and manipulation, Google quickly replaced PageRank with the much more resilient TrustRank.

TrustRank only considers links from reliable and trustworthy web pages and mostly ignores links from unknown or untrusted sources. It works by propagating trust along links between web pages from known trusted pages to other pages. TrustRank made manipulating Google's search ranking algorithm much less effective and much more expensive for scammers.

TrustRank also works for social media. Start by identifying thousands of accounts that are known to be reliable, meaning that they are real people posting useful information, and thousands of accounts that are unreliable, meaning that they are known to be spammers and scammers. Then look at the accounts that those accounts follow, like, reshare, or engage with in any way. Those nearby accounts then get a bit of the goodness or badness, spreading through the engagement network. Repeat this over and over, allowing reliability and unreliability to spread across all the accounts, and you know how reliable most accounts are even if they are anonymous.
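Here is a rough sketch of that propagation on a made-up handful of accounts. The engagement graph, the seed labels, and the damping factor are all hypothetical; production systems work on billions of accounts with far more signals.

```python
# Toy trust propagation over an engagement graph. Accounts, seeds, and the
# damping factor are all hypothetical.
engages_with = {  # account -> accounts it follows, likes, or reshares
    "newsroom": ["reporter", "unknown1"],
    "reporter": ["newsroom", "unknown2"],
    "unknown1": ["reporter"],
    "unknown2": ["bot1"],
    "spam_farm": ["bot1", "bot2"],
    "bot1": ["spam_farm", "unknown2"],
    "bot2": ["spam_farm"],
}
seeds = {"newsroom": 1.0, "reporter": 1.0,    # known-reliable accounts
         "spam_farm": -1.0, "bot1": -1.0}     # known spammers and scammers
damping = 0.5

trust = {account: seeds.get(account, 0.0) for account in engages_with}
for _ in range(20):  # repeatedly spread a bit of goodness or badness to neighbors
    spread = {account: 0.0 for account in engages_with}
    for account, neighbors in engages_with.items():
        for neighbor in neighbors:
            spread[neighbor] += damping * trust[account] / len(neighbors)
    spread.update(seeds)  # seed labels stay fixed; trust flows outward from them
    trust = spread

for account, score in sorted(trust.items(), key=lambda kv: -kv[1]):
    print(f"{account:>10}: {score:+.2f}")
```

Accounts engaged with mostly by trusted accounts end up with positive scores, accounts engaged with mostly by known bad actors end up negative, and unknown accounts caught in between stay near zero.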

If you boost reliable accounts and mostly ignore unknown and unreliable accounts, fake accounts become less influential, and it becomes much less cost-effective for bad actors to create influential fake accounts.

Companies that fixed their wisdom of the crowd algorithms also do not use engagement to optimize their algorithms. Optimizing for engagement will cause wisdom of the crowd algorithms to promote scams, spam, and misinformation. Lies get clicks.

It’s a lot of work to not optimize for engagement. Companies like Netflix, Google, YouTube, and Spotify put in considerable effort to run long experiments, often measuring people over months or even years. They then develop short-term proxy metrics they can use to estimate long-term satisfaction and retention over shorter periods of time. One example is satisfied clicks: clicks where people are not immediately repelled and spend time with the content they clicked on, which excludes clicks on scams or other low quality content. These companies put in all this effort to develop good metrics because they know that optimizing algorithms for engagement eventually will hurt the company.
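As an illustration of what such a proxy could look like (the dwell-time threshold and the click log here are invented, not any company's actual definition):

```python
# A toy "satisfied clicks" metric: count a click only if the person stayed
# with the content for a while instead of immediately bouncing.
clicks = [
    {"item": "in-depth article", "dwell_seconds": 210},
    {"item": "miracle supplement ad", "dwell_seconds": 4},
    {"item": "how-to video", "dwell_seconds": 95},
    {"item": "clickbait headline", "dwell_seconds": 7},
]

SATISFIED_DWELL_SECONDS = 30  # hypothetical threshold
satisfied = [c for c in clicks if c["dwell_seconds"] >= SATISFIED_DWELL_SECONDS]

print(f"raw clicks: {len(clicks)}, satisfied clicks: {len(satisfied)}")
```

Raw clicks reward the scam and the clickbait; satisfied clicks do not.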

Algorithms can be fixed if the executives leading the companies decide to fix them. Some companies have successfully prevented bad actors from manipulating wisdom of the crowds. The surprise: companies make much more money over the long run if they don't optimize algorithms for clicks.

Thursday, November 09, 2023

Book excerpt: Table of Contents

(This is the Table of Contents from my book, "Algorithms and Misinformation: Why Wisdom of the Crowds Failed the Internet and How to Fix It")

Introduction: How good algorithms became a fountain of scams, shills, and disinformation — and what to do about it

Part I: The golden age of wisdom of crowds algorithms
Chapter 1: The rise of helpful algorithms
Chapter 2: How companies build algorithms using experimentation

Part II: The problem is not the algorithms
Chapter 3: Bad metrics: What gets measured gets done
Chapter 4: Bad incentives: What gets rewarded gets replicated
Chapter 5: Bad actors: The irresistible lure of an unlocked house

Part III: How to stop algorithms from amplifying misinformation
Chapter 6: How some companies get it right
Chapter 7: How to solve the problems with the algorithms
Chapter 8: Getting platforms to embrace long-term incentives and metrics
Chapter 9: Building a win-win-win for companies, users, and society

Conclusion: From hope to despair and back to hope

(That was the Table of Contents from a draft of my book. If you might be interested in this book, I'd love to know.)