  • So, John-Green-Bot, you know when you let me use your computer the other day?

  • Well, I went on YouTube and it was like seeing a completely different website. There were

  • videos about restoring old VCRs and different kinds of cassette tapes, and ads for motor

  • oil?!

  • John-Green-bot: Yes, Jabril! I love learning about other machines.

  • Jabril: Okay, but do you even know what humans are watching these days? What about those

  • Boston Dynamics videos?

  • John-Green-bot: No. The humans in those videos are so mean to the robots! What about Epic

  • Computation Battles of History, or MKB-AI, or Robot Appétit?

  • Jabril: …… what?

  • INTRO

  • Hi, I'm Jabril and welcome to Crash Course AI! Recommender systems are a type of AI that

  • try to understand our brains and make useful recommendations to us.

  • This kind of AI can guide the things we watch by recommending YouTube videos or shows on

  • Netflix for example.

  • On Amazon, it's recommending items to buy; when I search on Google, it's recommending

  • relevant and interesting links. And everywhere online, advertisement servers are trying to

  • recommend products and services.

  • Recommender systems combine supervised learning and unsupervised learning techniques to learn

  • about us.

  • And because we're so complicated, recommending stuff to us is a tough problem that can produce

  • lots of unexpected results.

  • Maybe we get caught in an online bubble and only see tweets from our friends and people

  • who think like us. Maybe we miss a new TV show because streaming sites don't think

  • we'd like it. Or maybe that creepy thing happens where you're talking to your friends

  • about supercomputers and then every single ad you see for the next day is for supercomputers?!?

  • AI that makes recommendations can really change what version of the internet we all see. But

  • to understand the benefits and drawbacks of these algorithms, we have to understand where

  • they get their data and how they work.

  • As an example, let's focus on an algorithm that could recommend YouTube videos. Because

  • “The Algorithm” is a really big deal if YouTube is your job, and everyone's talking

  • about the mysterious changes behind the algorithm anyway.

  • Three common approaches are content-based recommendation, social recommendation, and

  • personalized recommendations.

  • Content-based recommendations look at the content of the videos, not the audience.

  • Like, for example, our algorithm may decide to recommend more recent videos, or videos

  • that are made by someone on a list of “quality creators.” But this means someone has to

  • decide who “quality creators” are, or program an AI that tries to predict creator

  • quality.

  • On the other hand, social recommendations pay attention to the audience.

  • YouTube is on the internet so we can use social ratings such as “likes” or “views”

  • or “watch time” to decide what people are watching and should be recommended. But

  • not everybody likes the same stuff, so maybe pure popularity isn't the way to go.

  • Different people have different preferences, so our AI can incorporate that with personalized

  • recommendations.

  • If you like this Crash Course video, maybe we'd recommend other Crash Course videos

  • or videos from my channel. But the problem with personalized recommendations is that

  • it might be difficult to stumble onto new interesting stuff.

  • So, to get the best of all worlds, recommender systems generally use collaborative filtering,

  • which combines all three of these recommenders.

  • When we see a recommendation on YouTube, it could be because that video is similar to

  • other videos that we've watched and liked and other people who have similar tastes watched

  • and liked that video. Or (especially if you're new to YouTube) that video might be recommended

  • because it's popular and lots of people are watching and liking it.

  • Collaborative filtering combines several of the techniques we've already talked about

  • in Crash Course AI. It uses unsupervised learning to find similar people or content, and it

  • tries to use data from those things to predict how we would feel about something we haven't

  • even seen yet.

  • To see how collaborative filtering works, let's use a simple example.

  • In this table, YouTube channels are represented as columns. So, here, one column represents

  • CrashCourse, one is Jabrils, one is The Best of BattleBots, one is The Art Assignment,

  • and so on.

  • Specific users that watch YouTube videos are represented as rows. So this row is John-Green-bot,

  • this one is me, these two are a couple random folks, this one is our producer Brandon, and

  • so on.

  • Each cell in the table corresponds to whether the user subscribes to a specific channel

  • or not. 1 means they've watched at least one video and subscribed, 0 means they've

  • watched at least one video and didn't subscribe, and the cell is empty if they haven't seen

  • any videos.

  • If we look at John-Green-bot's row, he subscribes to Crash Course and Jabrils, so those cells

  • have a 1. He saw The Best of Battlebots and did not subscribe, because of all the robot-on-robot

  • violence, so that's a 0. And he's never seen The Art Assignment so there's no information

  • in that cell.
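
The table described above can be sketched as a small data structure. This is only an illustration: John-Green-bot's row follows the transcript, while the other rows, the `CHANNELS` list, and the helper name `unseen_channels` are made-up assumptions. A missing key stands for an empty cell.

```python
# Hypothetical table: outer keys are users (rows), inner keys are channels (columns).
# 1 = watched and subscribed, 0 = watched but did not subscribe,
# missing key = never watched (the "empty cell").
CHANNELS = ["CrashCourse", "Jabrils", "The Best of BattleBots", "The Art Assignment"]

ratings = {
    # This row follows the description in the episode.
    "John-Green-bot": {"CrashCourse": 1, "Jabrils": 1, "The Best of BattleBots": 0},
    # The rows below are invented values, for illustration only.
    "Jabril": {"CrashCourse": 1, "Jabrils": 1, "The Best of BattleBots": 1,
               "The Art Assignment": 1},
    "Viewer A": {"CrashCourse": 1, "The Art Assignment": 0},
}


def unseen_channels(user, table, channels):
    """Return the channels this user has no data for (candidates to recommend)."""
    return [channel for channel in channels if channel not in table[user]]


print(unseen_channels("John-Green-bot", ratings, CHANNELS))
# prints ['The Art Assignment']
```

Keeping "never watched" separate from 0 matters here, because a 0 is real evidence (watched, did not subscribe) and an empty cell is no evidence at all.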

  • To recommend new channels for John-Green-bot, our collaborative filtering AI needs to predict

  • how likely he is to subscribe to a channel he's never seen before. In this case, let's

  • see if The Art Assignment ends up in his recommendations.

  • To make a prediction, the algorithm needs to look at which other people have subscribed

  • to the Art Assignment. And because YouTube tastes are very personal, instead of looking

  • at all other users, our algorithm will focus on finding the users who are most similar

  • to John-Green-bot.

  • Finding similar things is a classic unsupervised learning problem. Our AI can look at all the

  • rows, cluster together similar users, and then pick some of those that are most similar

  • to John-Green-bot, and who have seen The Art Assignment.
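
One simple way to sketch "find the most similar users who have seen the channel" is shown below. It swaps the clustering step mentioned in the episode for a plain nearest-neighbor comparison, which is an assumption made to keep the example short. The similarity measure (fraction of co-watched channels with the same choice), the function names, and every row except John-Green-bot's are hypothetical.

```python
ratings = {
    # John-Green-bot's row follows the episode; all other rows are invented.
    "John-Green-bot": {"CrashCourse": 1, "Jabrils": 1, "The Best of BattleBots": 0},
    "Jabril": {"CrashCourse": 1, "Jabrils": 1, "The Best of BattleBots": 1,
               "The Art Assignment": 1},
    "Viewer A": {"CrashCourse": 1, "Jabrils": 1, "The Best of BattleBots": 0,
                 "The Art Assignment": 1},
    "Viewer B": {"CrashCourse": 0, "The Best of BattleBots": 1,
                 "The Art Assignment": 0},
    "Brandon": {"CrashCourse": 1, "Jabrils": 0, "The Art Assignment": 1},
}


def similarity(row_a, row_b):
    """Fraction of co-watched channels on which two users made the same choice."""
    shared = set(row_a) & set(row_b)
    if not shared:
        return 0.0  # nothing in common, so no evidence of similar taste
    agreements = sum(1 for channel in shared if row_a[channel] == row_b[channel])
    return agreements / len(shared)


def most_similar(target, table, channel, k=2):
    """Return the k users most similar to target among those who have seen channel."""
    scored = [
        (user, similarity(table[target], row))
        for user, row in table.items()
        if user != target and channel in row
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


neighbors = most_similar("John-Green-bot", ratings, "The Art Assignment")
print([user for user, _ in neighbors])
# prints ['Viewer A', 'Jabril']
```

A real system would work with far more users and would likely group them into clusters first, as the episode describes; the pairwise comparison here only shows what "similar" could mean for this kind of table.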

  • Let's just say there are 1000 of these specific users, but there are other clusters with thousands

  • of users too that these recommender systems take into consideration.

  • Now, we have a classic supervised learning problem: training an AI to make predictions

  • based on past examples. In this case, we're training an AI to predict a 1 or 0 (subscribe

  • or not) for John-Green-bot based on other users.

  • We can re-adjust the results so that ratings from the cluster of 1000 most similar users

  • are given more weight in the final prediction, compared to those other clusters. And after

  • the predictions are sorted, our AI does predict that John-Green-bot would subscribe

  • to The Art Assignment, so it gets recommended to him, along with some other new channels.
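
The weighting idea can be sketched as a weighted average of other users' 1s and 0s. Everything here is an assumption for illustration: the votes are invented, the weights (1.0 for the most similar cluster, 0.25 for other clusters) are arbitrary, and the 0.5 cutoff is just one possible decision rule, not the one any real site uses.

```python
def predict_subscribe(votes):
    """Weighted average of 1/0 votes; votes is a list of (weight, subscribed) pairs."""
    total_weight = sum(weight for weight, _ in votes)
    if total_weight == 0:
        return None  # no usable evidence for this channel
    return sum(weight * subscribed for weight, subscribed in votes) / total_weight


# Invented votes on The Art Assignment.
# Users from the most similar cluster get weight 1.0, other clusters get 0.25.
votes = [(1.0, 1), (1.0, 1), (1.0, 0), (0.25, 0), (0.25, 0)]

score = predict_subscribe(votes)
recommend = score is not None and score >= 0.5  # hypothetical cutoff

print(round(score, 3), recommend)
# prints 0.571 True
```

With equal weights the same votes would average 2 out of 5, or 0.4, and fall below the cutoff, so the extra weight on similar users is what tips this made-up prediction toward "subscribe".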

  • Recommender systems that use collaborative filtering AI can take in lots of different

  • data, not just a 1 or a 0, for whether a user subscribed to a YouTube channel or bought

  • a product. A movie rating site might use a one-to-five star rating system. Or a social

  • media AI could keep track of the number of milliseconds a user dwells on a post.

  • Regardless, the basic strategy is the same: use known information from users to predict

  • preferences. And this can get complicated on big websites that gather lots of user information

  • using a combination of different algorithms.

  • The real world is full of a lot of data and there are three key problems that can lead

  • to recommender systems making small or big mistakes.

  • First, datasets that recommender system AIs get are usually very sparse. Most people don't

  • watch most shows or videos -- there just isn't enough time! And even fewer people give social

  • ratings such as “likes.”

  • Doing any kind of analysis with sparse datasets is very computationally intense, which gets

  • expensive, which means some companies are willing to lose some accuracy to reduce costs.

  • Second, there's the cold start problem. When we go on a website for the first time,

  • for example, the AI doesn't know enough about us to provide good personalized recommendations.
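
One common workaround for the cold start problem, echoing the earlier point that new users may simply be shown popular videos, is to fall back on popularity when there is no history. The sketch below assumes the same 1/0 table layout as the example in this episode; the table contents and the function name `popular_fallback` are hypothetical.

```python
from collections import Counter


def popular_fallback(table, seen=(), top_n=2):
    """Rank channels by subscriber count and skip any the user has already seen."""
    subscribers = Counter()
    for row in table.values():
        for channel, subscribed in row.items():
            subscribers[channel] += subscribed  # adds 1 for subscribe, 0 otherwise
    ranked = [channel for channel, _ in subscribers.most_common()]
    return [channel for channel in ranked if channel not in seen][:top_n]


# Invented table: 1 = subscribed, 0 = watched but did not subscribe.
table = {
    "Viewer A": {"CrashCourse": 1, "Jabrils": 1},
    "Viewer B": {"CrashCourse": 1, "The Art Assignment": 1, "Jabrils": 0},
    "Viewer C": {"CrashCourse": 1, "The Art Assignment": 1,
                 "The Best of BattleBots": 0},
}

# A brand-new user has no history, so nothing is filtered out.
print(popular_fallback(table))
# prints ['CrashCourse', 'The Art Assignment']
```

Once a user builds up some history, a system could pass it in as `seen` and gradually shift from this popularity list toward personalized predictions.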

  • And third, even if an AI makes statistically likely predictions, that doesn't mean those

  • recommendations are actually useful to us.

  • Online ads run into this failure a lot, where we'll be shown ads for sites we recently

  • visited, or something we just bought. Sure, that's probably something I'm interested

  • in, but I could've figured that out without a recommender system.

  • In a potentially more harmful way, recommender systems don't understand important social

  • context, so “statistically likely” recommendations can be worrying.

  • Recommendations may stereotype users in a socially uncomfortable way.

  • Like, for example, an AI might assume that because John-Green-bot is a robot, he really

  • wants to watch WALL-E and Robocop. Just because he's a robot doesn't mean he wants to

  • watch robot stuff.

  • Or recommendations might be inappropriate for certain users, like recommending a video

  • that a parent would consider too violent to their children after they had watched a bunch

  • of NERF War videos.

  • And, on social media, recommendations can trap us in ideological echo chambers, where

  • we tend to only see the opinions from people that agree with us, which can skew our knowledge

  • about the world.

  • This idea that we all see slightly different versions of the internet, and data is constantly

  • being collected about us, can be a little concerning. But understanding how recommender

  • systems work can help us live more knowledgeable lives and coexist with AI.

  • When we don't want data added to a recommender system's model of us, we can use a private

  • or incognito browser window and not log into sites. If we open a news homepage this way,

  • we might see what the average human (or robot!) is being recommended.

  • Of course, incognito browsers don't mean total privacy, but this strategy prevents

  • sites from connecting data -- like, for example, my Twitter account with my searches for tiny

  • polo shirts on Google (because I needed to get John-Green-bot a birthday present).

  • Plus, since we spend so much time online, we might want to make the most of it with

  • really personalized recommendations. So, seriously… “like, comment, and subscribe”

  • to your favorite creators because as we leave ratings, reviews, and other traces of online

  • activities, recommender systems can learn better models.

  • Recommender systems are a part of the internet as we know it, whether we like it or not.

  • And as AI becomes a bigger part of our lives, these kinds of recommendations will be too. So it's

  • on us to be aware of this technology, so that we know what kind of world we're living

  • in, and the ways AI might influence us every single day.

  • And if you're here to learn how to build recommender systems, my advice would be to

  • think explicitly about the trade-offs that are involved. Deciding how to define the clusters

  • of users or items can create more or less personalized spaces.

  • In our next episode, we'll work together on some code to build a recommender system,

  • and we'll get some hands-on experience with weighing some of these trade-offs. I'll

  • see ya then.

  • Speaking of recommendations, you should check out Sound Field. Sound Field is a new music

  • education show from PBS Digital Studios that explores the music theory, production, history

  • and culture behind our favorite songs and musical styles. Hosted by two supremely talented

  • musicians, Arthur “LA” Buckner and Nahre Sol, every episode is one part video essay

  • and one part musical performance.

  • So go subscribe to Sound Field! Link in the description below.

  • Crash Course AI is produced in association with PBS Digital Studios! If you want to help

  • keep all Crash Course free for everybody, forever, you can join our community on Patreon.

  • And if you want to check out Sound Field, click the link below.
