MALE SPEAKER: Welcome, everybody, to one more Authors at Google talk. Today, our guest speaker is Pedro Domingos, whose new book is called "The Master Algorithm." We have it here, and you can buy copies outside. One definition of machine learning is "the automation of discovery." Our guest, Pedro Domingos, is at the very forefront of the search for the master algorithm, a universal learner capable of deriving all knowledge, past, present, and future, from data. Pedro Domingos is a professor of Computer Science and Engineering at the University of Washington. He's the co-founder of the International Machine Learning Society. Pedro received his MS in Electrical Engineering and Computer Science from IST in Lisbon, and his Master of Science and PhD in Information and Computer Science from the University of California at Irvine. He spent two years as an assistant professor at IST before joining the faculty of the University of Washington in 1999. Pedro is the author or co-author of over 200 technical publications in machine learning, data mining, and other areas. He is the winner of the SIGKDD Innovation Award, the highest honor in data science. He's an AAAI Fellow and has received a Sloan Fellowship, an NSF CAREER Award, a Fulbright Scholarship, an IBM Faculty Award, several best paper awards, and other distinctions. He's a member of the editorial board of the journal "Machine Learning." Please join me in welcoming Pedro, today, to Google. [APPLAUSE] PEDRO DOMINGOS: Thank you. Let me start with a very simple question: where does knowledge come from? Until very recently, it came from just three sources: number one, evolution, the knowledge that's encoded in your DNA; number two, experience, the knowledge that's encoded in your neurons; and number three, culture, the knowledge you acquire by talking with other people, reading books, and so on.
And everything that we do, everything that we are, basically comes from these three sources of knowledge. Now what's quite extraordinary is that, only very recently, there's a fourth source of knowledge on the planet. And that's computers. More and more knowledge now comes from computers, is discovered by computers. And this is as big a change as the emergence of each of those three was. Evolution, well, that's life on Earth; life is the product of evolution. Experience is what distinguishes us mammals from insects. And culture is what makes humans what we are and as successful as we are. Notice, also, that each of these forms of knowledge discovery is orders of magnitude faster than the previous one and discovers orders of magnitude more knowledge. And indeed, the same thing is true of computers. Computers can discover knowledge orders of magnitude faster than any of the sources that went before and that co-exist with them, and orders of magnitude more knowledge in the same amount of time. In fact, Yann LeCun says that "most of the knowledge in the world in the future is going to be extracted by machines and will reside in machines." So this is a major change that, I think, is not just for us computer scientists to know about and deal with; it's actually something that everybody needs to understand. So how do computers discover new knowledge? This is, of course, the province of machine learning. And in a way, what I'm going to try to do in this talk is give you a sense of what machine learning is and what it does. If you're already familiar with machine learning, this will hopefully give you a different perspective on it. If you're not familiar with machine learning already, this should be quite fascinating and interesting. So there are five main paradigms in machine learning. And I will talk about each one of them in turn, and then try to step back and see what the big picture is and what this idea of the master algorithm is.
The first way computers discover knowledge is by filling gaps in existing knowledge. Pretty much the same way that scientists work, right? You make observations, you hypothesize theories to explain them, and then you see where they fall short. And then you adapt them, or throw them away and try new ones, and so on. So this is one. Another one is to emulate the brain. Right? The greatest learning machine on earth is the one inside your skull, so let's reverse engineer it. The third one is to simulate evolution. Evolution, by some standards, is actually an even greater learning algorithm than your brain is, because, first of all, it made your brain. It also made your body. And it also made every other life form on Earth. So maybe that's something worth figuring out how it works and doing with computers. Here's another one. And this is to realize that all the knowledge that you learn is necessarily uncertain. Right? When something is induced from data, you're never quite sure about it. So the way to learn is to quantify that uncertainty using probability. And then, as you see more evidence, the probabilities of different hypotheses evolve. And there's an optimal way to do this, using Bayes' theorem. And that's what this approach is. Finally, the last approach, in some ways, is actually the simplest and maybe even the most intuitive. It's to just reason by analogy. There's a lot of evidence in psychology that humans do this all the time. You're faced with a new situation, you try to find a matching situation in your experience, and then you transfer the solution from the situation that you already know to the new situation that you're faced with. And connected with each of these approaches to learning, there is a school of thought in machine learning. So the five main ones are the Symbolists, Connectionists, Evolutionaries, Bayesians, and Analogizers.
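The Bayesian idea here, updating the probability of each hypothesis as evidence arrives, can be sketched in a few lines. This is a minimal toy illustration, not any particular Bayesian learner; the two hypotheses and their likelihoods are invented for the example:

```python
# Bayes' theorem for one observation: P(h | e) is proportional to P(e | h) * P(h).
# Two made-up hypotheses about a coin: it is fair, or it is biased toward heads.
priors = {"fair": 0.5, "biased": 0.5}
likelihood_heads = {"fair": 0.5, "biased": 0.9}  # P(heads | hypothesis)

def update(beliefs, flip):
    """Update hypothesis probabilities after seeing one flip ('H' or 'T')."""
    posterior = {}
    for h, p in beliefs.items():
        p_evidence = likelihood_heads[h] if flip == "H" else 1 - likelihood_heads[h]
        posterior[h] = p * p_evidence
    total = sum(posterior.values())  # normalize so the probabilities sum to 1
    return {h: p / total for h, p in posterior.items()}

beliefs = priors
for flip in "HHHH":  # four heads in a row
    beliefs = update(beliefs, flip)
print(beliefs)  # 'biased' is now far more probable than 'fair'
```

As more heads come in, the posterior shifts toward the biased-coin hypothesis, which is exactly the "probabilities of hypotheses evolve with evidence" idea.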
The Symbolists are the people who believe in discovering new knowledge by filling in the gaps in the knowledge that you already have. One of the things that's fascinating about machine learning is that the ideas in the algorithms come from all of these different fields. So for example, the Symbolists have their origins in logic and philosophy. And they're, in some sense, the most "computer-sciency" of the five tribes. The Connectionists' origins are, of course, in neuroscience, because they're trying to take inspiration from how the brain works. The Evolutionaries' origins are, of course, in evolutionary biology, in the algorithm of evolution. The Bayesians come from statistics. The Analogizers actually have influences from a lot of different fields, but probably the single most important one is psychology. So in addition to being very important for our lives, machine learning is also a fascinating thing, I think, to study, because in the process of studying machine learning, you can actually study all of these different things. Now each of these "tribes" of machine learning, if you will, has its own master algorithm, meaning its own general-purpose learner that, in principle, can be used to learn anything. In fact, each of these master algorithms has a mathematical proof that says, if you give it enough data, it can learn anything. For the Symbolists, the master algorithm is inverse deduction. And we'll see, in a second, what that is. For the Connectionists, it's backpropagation. For the Evolutionaries, it's genetic programming. For the Bayesians, it's probabilistic inference using Bayes' theorem. And for the Analogizers, it's kernel machines, also known as support vector machines. So let's see what the key ideas in each one of these are. So the Symbolists: here are some of the most prominent Symbolists in the world. There's Tom Mitchell at Carnegie Mellon, Steve Muggleton in the UK, and Ross Quinlan in Australia.
And their idea is actually a very interesting one. It's to think of learning as being the inverse of deduction. Learning is induction of knowledge. Deduction is going from general rules to specific facts. Induction is the opposite: it's going from specific facts to general rules. So in some sense, one is the inverse of the other. And so maybe we can figure out how to do induction in the same way that people in mathematics figure out how to do other inverse operations. Like, for example, subtraction is the inverse of addition, or integration is the inverse of differentiation, and so forth. So as a very, very simple example, addition gives you the answer to the question: if I add 2 and 2, what do I get? The answer, of course, is 4. And this is the deepest thing I'll say in this whole talk. And subtraction, of course, gives you the answer to the inverse question: what do I need to add to 2 in order to get 4? The answer, of course, being 2. Now inverse deduction works in a very similar way. So here's a simple example of deduction. You know that Socrates is human, and you know that humans are mortal. And the question is, what can you infer from that? Well, of course, the answer is that, from that, you can infer that Socrates, too, is mortal. Now the inverse of this, and that's when it becomes induction, is: if I know that Socrates is human, what else do I need to know in order to be able to infer that he's mortal? And of course, what I need to know is that humans are mortal. And so in this way, I have just induced a new general rule, that humans are mortal. Of course, in general, I wouldn't just do it from Socrates, I would do it from Socrates and a bunch of other people. But that's the general way that this works. And then once I've induced rules like this, I can now combine them in all sorts of different ways to answer questions that I may never even have thought of.
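The inverse-deduction step above can be sketched as a toy program. Everything here (the fact base, the `induce_rule` function) is a made-up illustration of the idea of going from specific facts to a general rule; it is not how real inductive logic programming systems work:

```python
# Toy inverse deduction over facts of the form (entity, property).
# Given facts and observed conclusions, propose general rules A -> B
# ("everything that is A is also B") that explain the observations.
facts = {("socrates", "human"), ("plato", "human"), ("aristotle", "human")}
observations = {("socrates", "mortal"), ("plato", "mortal"), ("aristotle", "mortal")}

def induce_rule(facts, observations):
    """Propose A -> B whenever every entity with property A
    is also observed to have property B (e.g. human -> mortal)."""
    premises = {prop for _, prop in facts}
    conclusions = {prop for _, prop in observations}
    rules = []
    for a in premises:
        havers = {entity for entity, prop in facts if prop == a}
        for b in conclusions:
            if havers and all((entity, b) in observations for entity in havers):
                rules.append((a, b))
    return rules

print(induce_rule(facts, observations))  # [('human', 'mortal')]
```

Note that the rule is induced from Socrates *and* the other humans in the fact base, matching the point that in practice you generalize from many examples, not one.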
And this kind of flexibility in composing knowledge is actually something that, of all the five tribes, only the Symbolists have. Now of course, these examples are in English, and computers don't understand natural language yet. So what they use is something like first-order logic. So these things, both the facts and the rules that are discovered, are represented in first-order logic. And then questions are answered by chaining those rules, by reasoning with them. But whether it's in logic or natural language, the principle is the same. And as I said, of all the five paradigms, this is the one that is most like scientists at work. Right? They figure out: where are the gaps in my knowledge? Let me enunciate a general principle that will fill that gap. And then let me see what follows. Let me see if it's correct given the data. Let me see what gaps remain, and so on. And in fact, one of the most amazing applications of inverse deduction to date is actually a robot scientist. So if you look at this picture, the biologist is not the guy in the lab coat.