ROSSI LUO: Good afternoon. Welcome to the Brown Biostatistics Seminar. I'm Rossi Luo, the faculty host for today's event. For those of you new to our departmental seminar, the format is usually a presentation followed by a question and answer session. And because of the size of the crowd today, we are also going to use this red box thing to capture your questions for the videotaping and to make sure your questions are heard.

Today I'm very pleased to introduce Professor Yann LeCun. Professor LeCun is the director of Facebook AI Research, also known as FAIR. He is also Silver Professor of computer science, neural science, and electrical and computer engineering at New York University, and the founding director of the NYU Center for Data Science. Before joining NYU, he led research departments in industry, including at AT&T and NEC. Professor LeCun has made extraordinary research contributions in machine learning, computer vision, mobile robotics, and computational neuroscience. Among these, he is a pioneer in developing convolutional neural networks and a founding father of convolutional nets. This work contributed to the creation of a new and exploding field in machine learning called deep learning, which is now the artificial intelligence tool for a wide range of applications, from images to natural language processing. His research contributions have earned him many honors and awards, including election to the US National Academy of Engineering. Today he will give a seminar titled, How Can Machines Learn as Efficiently as Animals and Humans? I understand some of you told me you drove from Boston or other places quite far away. So without further ado, let's welcome Professor Yann LeCun for his talk.

[APPLAUSE]

YANN LECUN: Thank you very much. It's a pleasure to be here. A game I play occasionally when I give a talk is to count how many former colleagues from AT&T are in the room. I count at least two: Chris Rose here, Michael Littman. Maybe that's it. That's pretty good, two.

Right. So, how can machines learn as efficiently as animals and humans? I have a terrible confession to make. AI systems today suck. [LAUGHTER] Here it is in a slightly less vernacular form. Recently, I gave a talk at a conference at Columbia called the Cognitive Computational Neuroscience conference. It was the first edition. And before me, Josh Tenenbaum gave a keynote where he said this: all of these AI systems that we see now, none of them are real AI. And what he means by this is that none of them actually learn things that are as complicated as what humans can learn, nor learn things as efficiently as animals seem to learn them. So we don't have robots that are nearly as agile as a cat, for example. We have machines that can play Go better than any human, but that's not quite the same. And so that tells us there are major pieces of learning that we haven't figured out, things that animals are able to do that we can't do with our machines. So I'm jumping ahead here and telling you the punch line in advance, which is that we need a new paradigm for learning, or a new way of formulating the old paradigms, that will allow machines to learn how the world works the way animals and humans do. The current paradigm of learning is basically supervised learning.
So all the applications of machine learning, AI, deep learning, all the stuff you see in actual real-world applications, most of them use supervised learning. There's a tiny number of them that use reinforcement learning, but most of them use some form of supervised learning. And supervised learning, I'm sure most of you in the room know what it is. You want to build a machine that classifies cars from airplanes. You show it an image of a car. If the machine says car, you do nothing. If it says airplane, you adjust the knobs on the machine so that the output gets closer to what you want. Then you show an example of an airplane, and you do the same. And then you keep showing images of airplanes and cars, thousands of them, millions of them, and you adjust the knobs a little bit every time. And eventually, if you're lucky, the knobs settle on a configuration that will distinguish every car from every airplane, including ones the machine has never seen before. That's called generalization ability. And what deep learning has brought to the table there, within supervised learning, is the ability to build those machines more or less automatically, with very little human input in how the machine needs to be built, except in very general terms.

The limitation of this is that you have to have lots of data that has been labeled by people. To get a machine to distinguish cars from airplanes, you need to show it thousands of examples. And it's not the case that babies or animals need thousands of examples of each category to be able to recognize them. Now, I should say that even with supervised learning, you can do something called transfer learning, where you train a machine to recognize lots of different objects, and then if you want to add a new object category, you can just retrain with very few samples. And generally it works. What that tells you is that when you train a machine, it figures out a way to represent the world that is somehow independent of the task, even though you trained it for a particular task.

So what did deep learning bring to the table? Deep learning brought to the table the ability to train those machines without having to hand-craft too many of their modules. The traditional way of doing pattern recognition is you take an image and design a feature extractor that turns the image into a list of numbers that can be digested by a learning algorithm, regardless of what your favorite learning algorithm is: linear classifiers, [INAUDIBLE] machines, kernel machines, trees, whatever you want, or neural nets. But you have to preprocess it into a digestible form. What deep learning has allowed us to do is design a learning machine as a cascade of parametrised modules, each of which computes a nonlinear function parametrised by a set of coefficients, and train the whole machine end to end to do a particular task. And this is kind of an old idea. People even in the '60s had the idea that it would be great to come up with learning algorithms that could train multilayer systems of this type. They didn't quite have the right framework, if you want, nor the right computers for it. And so in the '80s, something came up called back propagation with neural nets that allowed us to do this. And I'm going to come to this in a minute. So the next question you can ask, of course, is what do you put in those boxes?
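Before getting to what goes in the boxes, here is a minimal sketch of the "adjust the knobs" loop and of a cascade of parametrised modules trained end to end, written in Python with NumPy. It is not code from the talk; the toy data, the two-layer architecture, and the learning rate are invented purely for illustration. A tiny network is trained by gradient descent, that is, back propagation, on a synthetic two-class problem standing in for cars versus airplanes.

    import numpy as np

    rng = np.random.default_rng(0)

    # Made-up stand-in data: 200 "images" of 20 pixels each, two classes.
    X = rng.normal(size=(200, 20))
    y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)

    # The "knobs": coefficients of two stacked parametrised modules.
    W1 = rng.normal(scale=0.1, size=(20, 16))
    W2 = rng.normal(scale=0.1, size=(16, 1))

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    lr = 0.1
    for step in range(500):
        # Forward pass through the cascade: linear, nonlinearity, linear, squash.
        h = np.tanh(X @ W1)
        p = sigmoid(h @ W2).ravel()

        # Cross-entropy loss: how far the output is from what we want it to be.
        loss = -np.mean(y * np.log(p + 1e-9) + (1.0 - y) * np.log(1.0 - p + 1e-9))

        # Back propagation: work out how each knob should be nudged.
        dlogit = (p - y)[:, None] / len(y)
        dW2 = h.T @ dlogit
        dh = (dlogit @ W2.T) * (1.0 - h ** 2)   # gradient through the tanh
        dW1 = X.T @ dh

        # Adjust the knobs a little bit, every time.
        W2 -= lr * dW2
        W1 -= lr * dW1

    p = sigmoid(np.tanh(X @ W1) @ W2).ravel()
    print("training accuracy:", np.mean((p > 0.5) == y))

After training, the same learned knobs are applied unchanged to inputs the network has never seen, which is the generalization ability described above; what sits inside each module is exactly the question taken up next.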
And the simplest thing you can imagine is a nonlinear function. It has to be nonlinear, because otherwise there's no point in stacking boxes. So the simplest thing you can imagine is: take an image, think of it as a vector, essentially, and multiply it by a matrix. The coefficients of this matrix are going to be learned. And you can think of every row