Let's turn to the task of actually implementing collaborative filtering and look at the complexity of the implementation. The most expensive step is finding the k most similar users, or the k most similar items: the neighborhood of a user or an item. Remember, to find the k most similar items, we had to take the item and compute its similarity with respect to every other item. This required us to compute, for example, a cosine or a centered cosine similarity between the item of interest and every other item.
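As a minimal sketch (not the lecture's code), centered cosine similarity between two items can be computed by subtracting each item's mean rating from its ratings and then taking an ordinary cosine, treating unrated entries as zero. The dict-of-ratings representation below is an assumption for illustration:

```python
from math import sqrt

def centered_cosine(ratings_a, ratings_b):
    """Centered cosine similarity between two items.

    ratings_a, ratings_b: dicts mapping user -> rating; after centering,
    users missing from a dict are treated as zeros.
    """
    # Center each item's ratings by subtracting that item's mean rating.
    mean_a = sum(ratings_a.values()) / len(ratings_a)
    mean_b = sum(ratings_b.values()) / len(ratings_b)
    a = {u: r - mean_a for u, r in ratings_a.items()}
    b = {u: r - mean_b for u, r in ratings_b.items()}
    # Dot product only over users who rated both items (zeros elsewhere).
    dot = sum(a[u] * b[u] for u in a if u in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Identical rating patterns yield similarity 1, and opposite patterns yield -1, which is what makes centering useful for distinguishing like from dislike.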
-
If we actually did this at query time, we would have to look at every entry in the utility matrix, so the complexity of doing that turns out to be the size of the utility matrix U. Since the utility matrix can be very, very large, with millions of users and millions of items, it is clearly impractical to do this at query time.
-
So what we really have to do is precompute: for every item, its set of similar items, or for every user, a set of similar users. But even a naive precomputation can take a really long time.
-
Suppose we have N items and we're doing item-item collaborative filtering. For each item, we'd have to compute its similarity to every other item, so the complexity of the precomputation is the product of the number of items and the size of the utility matrix. Since the utility matrix can be very large, and the number of items can be very large, the product of the two is clearly a very large number. So even a naive precomputation can be too expensive to do in practice.
-
So how do we deal with this? In previous lectures, we looked at techniques for near-neighbor search in high dimensions, for example locality-sensitive hashing. We can use something like LSH to take an item and quickly find a set of near neighbors to that item, or take a user and find a set of near neighbors to that user, and do that in advance. In practice, you might run this precomputation once a day, for example, or once every few hours. We can also use clustering to group users and items into smaller clusters, and thereby speed up the process by restricting the search to within a cluster, as opposed to the entire set of items or users. And finally, we could use a set of techniques called dimensionality reduction methods, which we'll cover in an upcoming lecture.
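As a rough sketch (not the lecture's code), here is how random-hyperplane LSH for cosine similarity can bucket items so that near neighbors tend to collide. The item vectors and dimensions below are made-up toy data:

```python
import random
from collections import defaultdict

def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random hyperplanes (Gaussian normal vectors)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]

def signature(vec, planes):
    """One bit per hyperplane: which side of the plane vec falls on."""
    return tuple(1 if sum(p[k] * vec[k] for k in range(len(vec))) >= 0 else 0
                 for p in planes)

def build_buckets(item_vectors, planes):
    """Hash every item once; items sharing a signature become candidates."""
    buckets = defaultdict(list)
    for item, vec in item_vectors.items():
        buckets[signature(vec, planes)].append(item)
    return buckets

# Toy data: items "a" and "b" have identical rating vectors, so they are
# guaranteed to share a bucket; "c" points roughly the opposite way.
items = {"a": [1.0, 0.0, 1.0], "b": [1.0, 0.0, 1.0], "c": [-1.0, 0.5, -1.0]}
planes = make_hyperplanes(8, 3)
buckets = build_buckets(items, planes)
candidates_for_a = buckets[signature(items["a"], planes)]
```

Rather than comparing an item against every other item, we compute exact similarities only within its bucket; in practice you would use several hash tables (bands) to boost recall, as in the LSH lectures.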
-
What are the advantages and disadvantages of collaborative filtering? The biggest advantage of collaborative filtering is that it works for any kind of item. It doesn't matter whether the items are books or music or videos or news articles or people: collaborative filtering works on any kind of item without requiring any feature selection. This is actually its biggest advantage, because it turns out to be a tough problem to find the right set of features for something as complicated as a movie or a piece of news. This also explains the huge popularity of collaborative filtering: it obviates the need for feature selection for complex items such as images, movies, music, and so on.
-
That said, collaborative filtering has a number of disadvantages. The first of these is cold start. For collaborative filtering to work, you need enough users in the system who have rated enough items. Remember, given an item, we need to find a set of other similar items; or given a user, we need to find a set of other similar users. But if there are not enough users in the system, it's hard to find a match.
-
The second problem is one of sparsity. Even when we do have enough users in the system, because there are so many items (typically millions of users and millions of items), most users have not rated most items. Therefore the user ratings matrix is very, very sparse indeed, and it's hard to find users who have rated the same set of items. Remember, suppose we want to predict the rating of user x for item i; we need to find other users who have also rated item i. It can be really hard to find such a set of users because of sparsity.
-
The next problem is the first-rater problem. Suppose we add a new item to the catalog. We cannot recommend this new item to anybody, because the new item doesn't have any ratings. Esoteric items face a similar problem: there may be a few people who really love such an item, but the number of people who rate it is really small. Since these items have only a small number of ratings, it's hard to make recommendations for them, too.
-
The final problem with collaborative filtering is popularity bias. Take a really popular movie or book, like Harry Potter. Lots of people tend to give high ratings to such a popular book or movie, and collaborative filtering will therefore tend to recommend it to almost everyone in the system, regardless of whether they'd actually like it or not, just because the neighborhood of any item will include popular items. At Amazon we used to call this problem the Harry Potter effect. The Harry Potter effect need not be a bad thing: recommending popular items generally works out well. But the problem is that if every recommended item is a popular item, the popular items can completely crowd out the unique recommendations that could be made for a specific user. You have to take special steps to keep the popularity bias from washing out those specific recommendations.
-
Now that you've seen some of the difficulties with collaborative filtering, we can design hybrid methods to work around those difficulties. For example, we can add content-based methods to collaborative filtering. We can add item profiles to deal with the new-item problem and make recommendations of new items to users. Or we might take new users and use demographic information about them to build synthetic profiles, and thereby deal with the new-user problem. Another approach is to implement two or more different recommender systems and combine their predictions, perhaps using a linear model. For example, you could use a global baseline recommender and combine that with collaborative filtering. Let's see how to do this.
-
So here's the problem. We want to estimate Joe's rating for the movie The Sixth Sense. The problem is that Joe has not rated any movies similar to The Sixth Sense, according to our measure of similarity. So, using the collaborative filtering formula that we saw in the previous slides, we can make no prediction for Joe's rating for The Sixth Sense. This is a problem that arises from the sparsity of the rating matrix. Here's an approach called the global baseline approach, and it's very, very simple.
-
Suppose that in our system the mean movie rating is 3.7 stars; 3.7 is the average rating across all movies and all users. We observe that the average rating for The Sixth Sense is 0.5 stars above the mean movie rating: people like The Sixth Sense, on average, more than the average movie. We also notice that Joe rates 0.2 stars below the average rating of other users. Remember, this is not Joe's rating for The Sixth Sense or for any specific movie. But if we take the average of Joe's ratings, Joe turns out to be a tough rater: his average rating is 3.5 stars, as opposed to the mean rating of 3.7 stars. So Joe's ratings in general are 0.2 stars below average. Now we can use these three numbers to come up with a baseline estimate for the user Joe and the movie The Sixth Sense. We start with the mean rating, 3.7. We add 0.5, because The Sixth Sense is 0.5 stars above average. And then we subtract 0.2, because Joe's ratings are on average 0.2 stars below the average rating. So we predict that Joe will give The Sixth Sense four stars.
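The arithmetic above can be sketched in a few lines; the numbers are the ones from the example (mean 3.7, item deviation +0.5, user deviation -0.2):

```python
def baseline_estimate(mu, item_dev, user_dev):
    """Global baseline: overall mean + item deviation + user deviation."""
    return mu + item_dev + user_dev

# Numbers from the example: mean rating 3.7, The Sixth Sense rated
# 0.5 stars above the mean, Joe rating 0.2 stars below the average user.
joe_sixth_sense = baseline_estimate(3.7, 0.5, -0.2)  # about 4.0 stars
```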
-
Notice that in this case we have not used any specific information about the movies Joe has rated. We don't need Joe to have rated any movies similar to The Sixth Sense in order to make this prediction. All we need is enough users who have actually rated The Sixth Sense, so that we can compute an average rating for it. Now what we'd like to do is combine this global baseline rating with collaborative filtering.
-
Let's look at an example. The global baseline estimated that Joe will give The Sixth Sense 4 stars. Now suppose we use a collaborative filtering nearest-neighbor approach, and we find out that Joe didn't like the movie Signs, which happens to be a movie very similar to The Sixth Sense, by the same director. Let's say the similarity between Signs and The Sixth Sense is 1.0, and we find that Joe rated Signs one star below his average rating for all movies. Now we can combine the global baseline estimate and the collaborative filtering refinement to come up with the final estimate. The global baseline estimates that Joe will rate The Sixth Sense four stars, whereas the collaborative filtering refinement is negative one star relative to his average. We can just add those two numbers and predict that Joe will give The Sixth Sense three stars.
-
Notice that this approach takes a linear combination of two independent predictors, a global baseline predictor and a collaborative filtering predictor, and sums their predictions. If we wanted, we could weight these predictions in different ways as well; in this case, we weighted both of them equally.
-
Finally, here is the formula that we can use to implement the combination of the global baseline and collaborative filtering. Define s_ij to be the similarity of items i and j. Given an item i, we find its k nearest neighbors, keeping only those that have also been rated by user x; call this neighborhood N(i; x). Our goal is to estimate the rating r_xi for user x and item i. Before, we had the simple formula, which was just a similarity-weighted average of the ratings for all the items in the neighborhood N(i; x): r_xi = (sum over j in N(i; x) of s_ij * r_xj) / (sum over j of s_ij).
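Under the representation assumption that each neighbor j contributes a (s_ij, r_xj) pair, this plain weighted average can be sketched as:

```python
def cf_estimate(neighbor_info):
    """Weighted-average CF: r_xi = sum_j s_ij * r_xj / sum_j s_ij,
    over the items j in N(i; x). neighbor_info holds (s_ij, r_xj) pairs."""
    den = sum(s for s, _ in neighbor_info)
    if den == 0:
        return None  # empty neighborhood: the formula makes no prediction
    return sum(s * r for s, r in neighbor_info) / den

# Two hypothetical neighbors: similarities 0.9 and 0.3, ratings 4 and 2.
estimate = cf_estimate([(0.9, 4.0), (0.3, 2.0)])  # about 3.5
```

Returning None when the neighborhood is empty mirrors the Joe / Sixth Sense failure case described earlier: with no rated neighbors, this formula alone cannot predict anything.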
-
Now we're going to add in the global baseline idea, and here's what the new formula looks like. We estimate the rating r_xi as the sum of a baseline rating and a collaborative filtering refinement. Here b_xi is the baseline estimate for user x and item i, and it is itself the sum of three components: mu, the overall mean movie rating in the system; b_x, the rating deviation of user x, which is just the average rating that user x gives across all movies he has rated, minus mu; and b_i, which is similarly the rating deviation of movie i. If you add those three up, you get b_xi, the baseline rating for user x and item i.
-
Then we add in the collaborative filtering piece, which is the same weighted-average formula we had before, weighted by the similarity of items i and j in the neighborhood. However, instead of using r_xj, the rating of user x for item j, we use the deviation of that rating from the baseline, r_xj - b_xj. Since we've already added in the baseline piece, we don't want to double-count it in the second piece as well. So we subtract the baseline ratings out of the collaborative filtering piece, which gives us the final formula that combines collaborative filtering and the baseline approach: r_xi = b_xi + (sum over j in N(i; x) of s_ij * (r_xj - b_xj)) / (sum over j of s_ij).
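As a sketch (assuming precomputed deviations b_x and b_i rather than deriving them from the ratings matrix), the combined formula can be implemented like this. The Joe / Sixth Sense numbers in the demo are the ones from the lecture, while Joe's rating of Signs (2.5) and its baseline (3.5) are made-up values whose difference is the minus-one-star refinement:

```python
def predict_rating(mu, b_x, b_i, neighbor_info):
    """Combined estimate:
        r_xi = b_xi + sum_j s_ij * (r_xj - b_xj) / sum_j s_ij
    where b_xi = mu + b_x + b_i.

    mu:            overall mean rating
    b_x, b_i:      rating deviations of user x and item i
    neighbor_info: list of (s_ij, r_xj, b_xj) triples for j in N(i; x)
    """
    b_xi = mu + b_x + b_i
    den = sum(s for s, _, _ in neighbor_info)
    if den == 0:
        return b_xi  # no usable neighbors: fall back to the baseline alone
    num = sum(s * (r - b) for s, r, b in neighbor_info)
    return b_xi + num / den

# Joe and The Sixth Sense: mu = 3.7, b_x = -0.2, b_i = +0.5.
# One neighbor, Signs, with similarity 1.0; Joe's rating of Signs sits
# one star below its baseline (hypothetical values 2.5 vs 3.5).
joe = predict_rating(3.7, -0.2, 0.5, [(1.0, 2.5, 3.5)])  # about 3.0 stars
```

Note that when the neighborhood is empty, this function gracefully falls back to the global baseline, which is exactly the advantage of the combined approach over pure collaborative filtering.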