Saturday, June 24, 2017

Two decades of Amazon.com recommendations

IEEE Internet Computing just celebrated its 20th anniversary.

On its 20th anniversary, the editorial board created its first ever “The Test of Time” award. I'm honored to say they gave it to our 2003 article, "Amazon.com Recommendations: Item-to-Item Collaborative Filtering", which continues to be accessed, cited, and used in industry and research many years after its original publication.

In addition, for the 20th anniversary issue of IEEE Internet Computing, we wrote a new article, “Two Decades of Recommender Systems at Amazon.com”. Some excerpts:
For two decades now, Amazon.com has been building a store for every customer. Each person who comes to Amazon.com sees it differently ... It's as if you walked into a store and the shelves started rearranging themselves, with what you might want moving to the front, and what you're unlikely to be interested in shuffling further away.

Amazon.com launched item-based collaborative filtering in 1998, enabling recommendations at a previously unseen scale for millions of customers and a catalog of millions of items. Since we wrote about the algorithm in IEEE Internet Computing in 2003, it has seen widespread use across the Web, including YouTube, Netflix, and many others.

The algorithm's success has been from its simplicity, scalability, and often surprising and useful recommendations, as well as desirable properties such as updating immediately based on new information about a customer and being able to explain why it recommended something in a way that's easily understandable.

What was described in our 2003 IEEE Internet Computing article has faced many challenges and seen much development over the years ... We describe some of the updates, improvements, and adaptations for item-based collaborative filtering, and offer our view on what the future holds for collaborative filtering, recommender systems, and personalization.

....

What does the future hold for recommendations? ... Discovery should be like talking with a friend who knows you, knows what you like, works with you at every step, and anticipates your needs.

Recommendations and personalization live in the sea of data we all create as we move through the world, including what we find, what we discover, and what we love ... Intelligent computer algorithms leveraging collective human intelligence ... Computers helping people help other people.

The field remains wide open. An experience for every customer ... offering surprise and delight ... is a vision none have fully realized. Much opportunity remains to add intelligence and personalization to every part of every system, creating experiences that seem like a friend that knows you, what you like, and what others like, and understands what options are out there for you.
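For anyone who hasn't seen the algorithm, the item-to-item idea mentioned in the excerpts above can be sketched in a few lines. This is only a toy, in-memory illustration, not the production system: the function names (`build_similar_items`, `recommend`), the sample data, and the choice of cosine similarity over co-purchase counts are all assumptions made for the sketch. It does show two of the properties the excerpt calls out: recommendations come straight from what the customer has right now, and each one carries the item that explains it.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def build_similar_items(purchases):
    """Offline step: map each item to {other_item: similarity}.

    `purchases` maps a customer id to the set of items that customer bought.
    Similarity here is cosine similarity over co-purchase counts (an
    assumption of this sketch, not necessarily what any real system uses).
    """
    item_counts = defaultdict(int)                        # customers per item
    pair_counts = defaultdict(lambda: defaultdict(int))   # customers per item pair
    for items in purchases.values():
        for item in items:
            item_counts[item] += 1
        for a, b in combinations(sorted(items), 2):
            pair_counts[a][b] += 1
            pair_counts[b][a] += 1

    return {
        a: {b: n / sqrt(item_counts[a] * item_counts[b]) for b, n in others.items()}
        for a, others in pair_counts.items()
    }


def recommend(similar, customer_items, top_n=3):
    """Online step: score items similar to what the customer already has.

    Returns (recommended_item, because_of_item) pairs, so every
    recommendation comes with a simple explanation.
    """
    scores = defaultdict(float)
    reasons = {}  # candidate -> (owned item with the strongest link, similarity)
    for owned in customer_items:
        for candidate, sim in similar.get(owned, {}).items():
            if candidate in customer_items:
                continue  # don't recommend what they already have
            scores[candidate] += sim
            if candidate not in reasons or sim > reasons[candidate][1]:
                reasons[candidate] = (owned, sim)
    ranked = sorted(scores, key=lambda c: (-scores[c], c))
    return [(c, reasons[c][0]) for c in ranked[:top_n]]


# Hypothetical sample data, for illustration only.
purchases = {
    "alice": {"book_a", "book_b"},
    "bob": {"book_a", "book_b", "book_c"},
    "carol": {"book_b", "book_c"},
    "dave": {"book_a"},
}
similar = build_similar_items(purchases)
print(recommend(similar, {"book_a"}))
```

The split matters: the expensive similar-items table is built ahead of time, while the lookup at request time only touches the handful of items a customer has, which is why a new purchase can change the recommendations immediately.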

Sunday, June 11, 2017

Quick links

Some of the tech news I found interesting lately, and you might too:
  • Jeff Bezos: "Many decisions are reversible, two-way doors. Those decisions can use a light-weight process. For those, so what if you’re wrong? .... If you’re good at course correcting, being wrong may be less costly than you think, whereas being slow is going to be expensive for sure." ([1])

  • Jeff Bezos: "I would say, a lot of the value that we’re getting from machine learning is actually happening beneath the surface. It is things like improved search results. Improved product recommendations for customers. Improved forecasting for inventory management. Literally hundreds of other things beneath the surface." ([1])

  • A good summary of Mary Meeker's 2017 report. A key highlight is saturation in smartphones and internet usage. ([1])

  • New Google AI incubator: "Investment arm aimed squarely on artificial intelligence ... will operate almost like an incubator with a shared workspace for AI startups and mentorship" ([1] [2])

  • Lots of good labeled data (reliable ground truth) is the key to success with AI ([1] [2] [3] [4])

  • AI in the real world is a lot harder than ideal conditions in part because you see crazy things like robots getting attacked by humans ([1] [2])

  • "The Google [Chrome] ad-blocker will block all advertising on sites that have a certain number of 'unacceptable ads,' according to The Wall Street Journal. That includes ads that have pop-ups, auto-playing video, and 'prestitial' count-down ads that delay the display of content." ([1])

  • Nice ACM Queue article from Google SREs on availability as a combination of subservice reliability, rapid recovery, and setting expectations ([1])

  • "Designing a [software] library to reduce cognitive load is still the exception, not the rule" ([1] [2])

  • A lesson for bigger companies: invest for the long term in your researchers, who are often working a few years ahead of what you need now ([1])

  • Wow: "The Melt’s blundering trajectory is instructive ... Entrepreneurs frequently embark on these missions with vast sums of money and a deep belief in technology’s power to solve all problems — which is not always a formula for success .... They were all good people, and they all wanted good things. They just didn’t know anything about running restaurants." ([1])

  • "The once-hot social network was built on the idea that people would enjoy having anonymous conversations with people close by. That’s a fantastic concept until you remember that anonymous internet person and by definition near you are scary as hell in practice." ([1])

  • Great teardown of the Juicero, includes some excellent business advice on iterative development and testing your ideas on real customers ([1] [2])

  • "When the US government discovers a vulnerability ... it can keep it secret and use it offensively ... or it can alert the software vendor and see that the vulnerability is patched, protecting the country ... Every offensive weapon is a (potential) chink in our defense." ([1])

  • On spear phishing attacks: "By a careful design and timing of a message, it should be possible to make virtually any person click" ([1] [2])

  • Schneier on forging voices: "I don't think we're ready for this. We use people's voices to authenticate them all the time, in all sorts of different ways." ([1])

  • Facebook says, "We have had to expand our security focus... to include more subtle and insidious forms of misuse, including attempts to manipulate civic discourse and deceive people" ([1] [2])

  • Remarkable and concerning that this is possible: "By accessing accelerometer and gyroscope sensors, the Web-hosted JavaScript measures subtle changes in a phone's angle, rotation, movement speed, and similar characteristics. The data, in turn, can reveal sensitive information about the phone and its user ... [including] the keystrokes being entered" ([1])

  • Nice high-level description of the difference between what Apple and Google are doing for privacy-preserving machine learning. In brief, Apple adds noise to the data to preserve privacy, while Google learns on the device and then sends back only the updates to the machine-learned models (much like parameter servers in deep learning). The truth is they're probably both doing both, but it's still a good thing to think about. ([1])

  • Using battery backup to optimize gas power plants by letting them skip the expensive bit for gas turbines: sitting in standby because of lengthy startup times. It's easy and practical, a nice example of low-hanging fruit with major impact. ([1])

  • Good data on the projected costs of energy sources ([1])

  • Good data on the newspaper industry. There's a curious spike in ad revenue from 1980-2000 that isn't matched by subscriptions. ([1])

  • Jeff Bezos is making journalism profitable: "The Post has said that it was profitable last year — and not through cost-cutting ... The Post has gone on a hiring spree. It has hired hundreds of reporters and editors and has more than tripled its technology staff ... third straight year of double-digit revenue growth ... 'You have to be great at technology. You have to be great at monetization. But one thing I think we’re proving is that if you are, great journalism can be profitable.'" ([1])

  • How Google took over the classroom, great article, but misses that the failure of iPads was a big piece of this ([1] [2])

  • Duolingo's excellent efforts to help people learn English, which can be a tool for economic or educational advancement ([1])

  • Amazon Web Services cuts prices again, remarkable ([1])

  • Almost all cloud workloads right now are not cloud optimized. Customers mostly moved a system built for fixed hardware resources to the cloud, where it then runs idle a lot, rather than redesigning it to take advantage of dynamic scaling ([1])

  • Latest version of Google Earth is impressive, definitely worth trying ([1])

  • Brent Smith and I received the first ever IEEE Internet Computing Test of Time award for our 2003 paper on Amazon's recommender system. In a new article for the IEEE Internet Computing 20th Anniversary Issue, we look back at the last two decades. ([1])

  • A virtual reality game that succeeds by taking advantage of what VR can do well, and working around what it can't, to create a fully immersive experience ([1])

  • Somehow, I missed that Chris Sacca is retiring. Amazing career and influence he had, and impressive to decide to go in an entirely new direction now. ([1])

  • A Stack Overflow survey on what software engineers care about finds it's who they work with, what they are doing, and what they learn far more than salary. In the top five items, three are about who you work with and what you learn, one is benefits, and one is commute. But the benefits are complicated -- it's not salary, stock, and bonus -- the top items are all things related to work environment and commute, vacation, and health care. ([1])

  • Great interview with the CEO of Coursera: "Humility and the ability to listen well are the big things I look for ... If you want to understand people, you need to hear them ... [Also have] ambitious goals to lift the organization up and everybody with it. Setting goals that are ambitious but also achievable is an important skill." ([1])

  • Great quote from Jeff: "At Amazon, we've had a lot of inventions that we were very excited about, and customers didn't care at all. And believe me, those inventions were not disruptive in any way. The only thing that's disruptive is customer adoption." ([1])

  • Nice line in Dan Ariely's book Payoff: "If you really want to demotivate people, shredding their work is the way to go, but ... you can get almost all the way there simply by ignoring their efforts." ([1])

  • Xkcd points out that minor changes in methodology yield radical changes in data visualizations of the most unusually popular activity in a location ([1])

  • Xkcd on machine learning, disturbingly close to reality ([1])

  • Xkcd on hard problems ([1])

  • Xkcd on survivorship bias ([1])

  • Xkcd on unhelpful code reviews ([1])

  • Very funny that Burger King ran an ad with "OK, Google" and it works. Once again Xkcd was hilariously prescient about this. ([1] [2])

  • SMBC comic on Bayesian inference: "Given his low priors..." ([1])

  • SMBC comic: "Then it occurred to me, hey, I've got like a sample size of one here, and it's not double blind." ([1])

  • SMBC comic on behavioral economics ([1])

  • SMBC comic: "Wait, are you going to turn my life's work into a joke about butts or something?" ([1])