Subtitles section Play video Print subtitles Welcome back to Mining of Massive Datasets. Today's topic is Recommender Systems. We're going to start with an overview of recommendation systems, and why they are necessary. Today, we are going to look at the two most common types of Recommender Systems. Content-Based Systems and Collaborative Filtering. And finally we're going to look at how to evaluate Recommender Systems, to make sure they're doing a good job. Let's start with an overview. Imagine any situation where a user interacts with a really large catalog of items. Now these items could be products at Amazon. They could be movies at Netflix. They could be music from Pandora's catalog. Or they could be the news items on Google News. What really matters, is that there are tens of thousands, or hundreds of thousands, or millions of items. A really large catalog. And the user is interacting with this catalog. There's two ways in which a user can interact with a large catalog of items. The first is such, user knows what they're looking for, and they go and they search the catalog for the precise item that they're looking for. Now when you have a very large catalog of items, very often the user doesn't know exactly what they're looking for. And this is where recommendations come in. The system recommends to the user certain items that they think the user will be interested in, based on what they know about the user. Now why do we really need such recommendations? The key that made recommendations so important and why a recommendation system developed so much in the last ten or 20 years. If that he moved from an area scarcity to an area of abundance. What do I mean by this? Imagine that you were out shopping 20 years ago, and you'd go to a local retailer, and you'll find a certain number of products on the shelves of the local retailer. Now, even in the really large retailer like, like a Wal Mart, for instance. Shelf space is a key, is a scarce commodity. It limits the number of items that a retailer can carry. Shelf space is expensive, because it involves real estate costs. And, therefore, a retailer can carry only a certain number of products. Now, a similar situation applies in the case of, for example, TV networks. A TV network can carry only so many shows, because there's only so many hours in a day. And there are only so many movie theaters, so they can only ser, screen a certain number of movies. Now once the internet was developed, things changed. The web enables zero-cost dissemination of information about products. And what this means, is that we can have many more products than ever before. There is no shelf space limitation on the number of products. That's why the number of products on Amazon is much, much more than the number of products available at any physical retailer. The number of you know, movies available on Netflix is more than the number of movies that have been available, available at Blockbuster and so on. This near-zero-cost dissemination of information gives rise to a phenomenon that's called the long tail phenomenon. Let's examine what this is. Now imagine a graph, where on X-axis we've taken the items in the catalog. Remember, items might be books, or music, or video, or news articles. And we've ranked these items by popularity. So the most popular items are on the left, and as they move towards the right, the items become less and less popular. What do I mean by popular? Well, I mean the number of times the item is purchased in a week. Or the number of times a movie is viewed in a week, or a month, or some, some fixed time period. Now, on the Y axis, you have the actual popularity, which in this case I've shown as the number of purchases per week, it could be number of views per week, or it could be number of, you know, plays per month, for some music, and so on. So in general you have items ranked by popularity along the X axis, and the popularity itself along the Y axis. Now when you take items you know, in a large catalog. And you rank them, and you plot them on this curve you get a curve that looks like this. You can see that the score, you know, has a very steep fall initially. the, the, you know, you have a really, really, a few really, really popular items. And then as you move towards the right as the, you know, as the item rank becomes greater the popularity falls off very steeply. But at a certain point, you can see that this popularity stops, you know, falls off less and less deeply. And, you know, it quite reaches the X axis. The interesting thing here, is that there is a cut off point. The you know, items that are less popular than this cut off point. You know, might be purchased perhaps just once a week. Or maybe once a month. If you're a physical retailer like a Wal-Mart, it's not economic to stock this item, because the rent cost of stocking the item is more than you make, when you sell the item. And therefore a retailer, any right thinking retailer doesn't stock items that are unpopular. The, you know, they only stock the, the head of the distribution. So there's this cutoff point that I show on this graph here and items that are more popular then this, the, the more popular items are available at a retail store. But the less popular items, the items that are to the right of the cut off point, are not available at any retail store. They're only available online. Now this phenomenon applies to books, to music, to movie, to videos to news articles for example, there are only so many news articles in newspaper, but when you go online you can see the rest of the news articles are less popular, news articles that are off to the right. The piece of the curve, that is to the the piece of the curve here that is to the right of this dividing line, is called the long tail. These are the items that are available only online. The interesting thing is, the, is this area under the curve here. And you can see the area under the curve here is quite significant. In fact, in some cases the area under the curve on the right is about as large, or could be even larger than the area of the curve, under the curve on the, on the left. So you have all these items that could never be found in a physical store, but that can be only found online. But there are so many of them. That it's very hard for any user to find all these items. Right, so when you have the seed of abundance and you have so many items and many of them are really found online. How, you know, how do you introduce a user to all these new items they may have not otherwise find? When you have more choice like this, when you have these millions and millions of items that are only available online, you need a better way for the user to find all these items. The user doesn't even know where to start looking, and that's where recommendation engines come in. So recommendation engines work in the case of many, many kinds of items books, music, movies, news articles. Interestingly, they even work in the case of people. For example, when you go to Facebook, or LinkedIn, or Twitter, there are so many people that you don't know who to follow or who to friend. And so Facebook, or LinkedIn, or Twitter make recommendations to you, on the people you could follow or friend. I like this point with interesting anecdote that shows you the power of a recommendation engine. Several years ago a book was published called Touching the Void. It's a book about mountaineering. It's very, very good book. The book came out, it didn't make much of a ripple. You know, few people bought the book. It got some decent reviews, but it never became a bestseller. And then a few years after Touching the Void, a new book was published on mountaineering called Into Thin Air. Now Into Thin Air picked up traction, and lots of people started buying Into Thin Air. Amazon noticed that a few of people who bought Into Thin Air, had also bought Touching the Void. So they started recommending Touching the Void, to people who bought Into Thin Air. And low and behold, those people started buying Touching the Void as well. The interesting point is, this made Touching the Void a bestseller. In fact, it became a bigger bestseller even than Into Thin Air, even though a few years ago, it had sank without a trace. So this example should show you the power of recommendation systems. There are these items, these sort of gems like Touching the Void you know, that people don't know because they don't know to look for them. But a good recommendation system can expose people to these hidden gems, that they wouldn't have known about otherwise. So let's look at types of recommendation systems. The simplest and the oldest kind of recommendation is editorial or hand curated. You might find a list of favorites for example when you go into your favorite neighborhood book store, you might find staff picks. Certain marked off as staff picks, right? And these are editorial triangulated, and on certain websites you'll see a list of staff favorites or a list of essential items. These are essentially built by hand. And another place where you'll see these editorial recommendations is often on the homepages of websites. For example if you go to the the homepage of most popular websites including product websites you'll see editorial picks. These are products that have been picked by the editorial staff to feature on the home page. The drawback with editorial on hand curated recommendation, is that the, it's done entirely by you know, by the staff of the website, and there's no input from the users of the site. So, when you go beyond editorial recommendations, the next simple thing that you can do is simple aggregates. On many websites, you'll see lists of top ten, or most popular, or most recent for example, if you go to YouTube you can see the most popular videos, for instance, right. So these are simple aggregates which sort of take into account user activity to make recommendations to other users. But these recommendations don't depend on the user they only depend on, you know, the, the aggregate activity of a lot of other users. The third and most interesting kind of recommendation to us is recommendations that are tailored to individual users. Right for example, book recommendations tailored to your taste, or movie recommendations based on the movies that you watched previously. Or music recommendation based on your music interests. And this is our focus here recommendations that are tailored to individual users. So lets look at a formal model. Let C be a set of customers and S a set of items. We will create a function called the utility function or a utility matrix. The utility function is a function that looks at every pair of customer and item, and maps it to a rating. Okay. R in this case is a set of ratings. And for example R could be a star rating from one star to five star. R could be a number between zero and ten. In general, R is a totally ordered set so that, you know, a lower value indicates that a user liked the product less, and a higher value indicates that a user liked the product more. Let's look at an example of a utility matrix. On the top we have four movies here. Avatar, Lord of the Rings, Matrix, and Pirates of the Caribbean. And down here we have four users Alice, Bob, Carol, and David. And the utility matrix gives you rating for certain movies and certain users. For example Alice has rated Avatar and Matrix, but not Lord of the Rings or Pirates of the Caribbean. Whereas Carol has rated you know, has rated the same two movies. Bob has rated Lord of the Rings and Pirates but hasn't rated Avatar or, or Matrix. Now it could be that these users have not scene these movies or it could be that they've seen the movies, but not bother to rate them. So in general, usually a matrix like this is going to be sparse. You know, most of the users haven't seen most of the movies. And there are going to be values in some of the you know, some of the locations. The problem in recommendation systems is to figure out these unknown values. For example you've seen that Alice is rated Avatar and Matrix, but hasn't rated Lord Of The Rings. So the question is can we figure out what Alice's rating for Lord Of The Rings will be, based on her other ratings? Can you figure out whether she likes Pirates or not? Right. So this is the key problem for recommended systems. Once we find out for each user certain movies that they would have rated highly, or the system thinks they might have rated highly, then we can recommend those movies to those users. So there are three key problems in the space of recommended systems. The first is gathering the known ratings in the ratings in the matrix. Now in the previous slide when you look at the utility matrix it was already filled in with certain values. But how do you get, gather those values in the first place? So that's the key that's the first problem that you need to tackle. The second problem is to extrapolate unknown ratings from known ratings. But we're mainly interested in the high unknown ratings. We are interested in those ratings where a user would've given a high rating to a movie.