Friday, November 17, 2023

Book excerpt: How companies build algorithms using experimentation

(This is an excerpt from drafts of my book, "Algorithms and Misinformation: Why Wisdom of the Crowds Failed the Internet and How to Fix It")

Wisdom of the crowd algorithms shape what people see on the internet. Constant online experimentation shapes what wisdom of the crowd algorithms do.

Wisdom of the crowds is the idea that summarizing the opinions of many independent people is often useful. Many machine learning algorithms rely on it, including the rankers, trending algorithms, and recommenders on social media.

It's important to realize that recommendation algorithms are not magic. They don't come up with good recommendations out of thin air. Instead, they just summarize what people found.

If summarizing what people found is all the algorithms do, why do they create harm? Why would algorithms amplify social media posts about scammy vitamin supplements? Why would algorithms show videos from white supremacists?

It is not how the algorithms are built, but how they are optimized. Companies change, twiddle, and optimize algorithms over long periods of time using online experiments called A/B tests. In A/B tests, some customers see version A of the website and some customers see version B.

Teams compare the two versions. Whichever version performs better, by whatever metrics the company chooses, is the version that later launches for all customers. This process repeats and repeats, slowly increasing the metrics.
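As a sketch, bucketing users into the two groups and comparing a metric might look like this (the function names, the 50/50 split, and the mean comparison are illustrative assumptions, not any particular company's system):

```python
import hashlib

def assign_variant(user_id: str) -> str:
    """Deterministically bucket each user into group A or B by
    hashing their id -- the same user always sees the same version."""
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    return "A" if bucket < 50 else "B"

def pick_winner(metrics_a: list[float], metrics_b: list[float]) -> str:
    """Launch whichever version performs better on the chosen metric.
    (Real experiments also check statistical significance.)"""
    mean_a = sum(metrics_a) / len(metrics_a)
    mean_b = sum(metrics_b) / len(metrics_b)
    return "A" if mean_a >= mean_b else "B"
```

Hashing the user id rather than assigning groups randomly on each visit keeps the experience consistent: a user who saw version B yesterday sees version B today.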

Internet companies run tens of thousands of these online experiments every year. The algorithms are constantly tested, changed, and improved, getting closer and closer to the target. But what if you have the wrong target? If the goal is wrong, what the algorithms do will be wrong.

Let’s say you are at Facebook working on the news feed algorithm. The news feed algorithm picks which posts people see when they come to Facebook. And let’s say you are told to optimize the news feed for what gets the most clicks, likes, and reshares. What do you do? You will start trying changes to the algorithm and A/B testing them. Does this change get more clicks? What about this one? Through trial-and-error, you will find whatever makes the news feed get more engagement.

It is this trial-and-error process of A/B testing that drives what the algorithms do. Whatever the goal is, whatever the target, teams of software engineers will work hard to twiddle the algorithms to hit those goals. If your goal is the wrong goal, your algorithms will slowly creep toward doing the wrong thing.
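This trial-and-error loop can be sketched as a toy hill-climbing process. Everything here is an illustrative assumption — the ranking weights, the step size, and especially the made-up `clicks` metric that rewards provocative content more than informative content — not Facebook's actual system:

```python
import random

def ab_test(current, variant, metric):
    """Keep whichever version of the algorithm scores higher on the
    chosen metric (a stand-in for a real online experiment)."""
    return variant if metric(variant) > metric(current) else current

def propose(weights, rng, step=0.01):
    """Shift a little ranking weight from one signal to another;
    attention is fixed, so boosting one signal demotes the other."""
    gain, lose = rng.sample(list(weights), 2)
    variant = dict(weights)
    variant[gain] += step
    variant[lose] -= step
    return variant

def optimize(weights, metric, rounds=40, seed=0):
    """Repeated trial-and-error: propose a tweak, A/B test it, keep
    the winner. The algorithm creeps toward whatever the metric
    rewards -- right target or wrong."""
    rng = random.Random(seed)
    for _ in range(rounds):
        weights = ab_test(weights, propose(weights, rng), metric)
    return weights

# Toy metric: suppose provocative posts earn more clicks per unit
# of exposure than informative ones (an illustrative assumption).
clicks = lambda w: 3.0 * w["provocative"] + 1.0 * w["informative"]

tuned = optimize({"provocative": 0.5, "informative": 0.5}, clicks)
```

After a few dozen simulated experiments, the tuned weights favor provocative content — not because any engineer chose that outcome, but because only the tweaks that increase the metric survive.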

So what gets the most clicks? It turns out scams, hate, and lies get a lot of clicks. Misinformation tends to provoke a strong emotional reaction. When people get angry, they click. Click, click, click.

And if your optimization process is craving clicks, it will show more of whatever gets clicks. Optimizing algorithms for clicks is what causes algorithms to amplify misinformation on the internet.

To find practical solutions, it's important to understand how powerful tech companies build their algorithms. It's not what you would expect.

Algorithms aren't invented so much as evolved. These algorithms are optimized over long periods of time, changing slowly to maximize metrics. That means the algorithms can unintentionally start causing harm.
