Rachel and I started fast.ai with the idea of making neural networks uncool again. It is a grand plan, to be sure, because they are apparently terribly cool. But really there are some things that we want to see improve, and that's why we're doing this course. We're actually not making any money out of this course; we're donating our fees both to the diversity fellowships that we're running and to the Fred Hollows Foundation. I would like to give a quick pitch for the Fred Hollows Foundation, for those of you who aren't aware of it, because as you know, deep-learning is fantastic for computer vision; it's basically allowing computers to see for the first time. What you might not realize is that there are something like 3 or 4 million people in the world who can't see because they have something called "cataract blindness". Cataract blindness can be cured for $25 per eye, and the person who got that price down from thousands of dollars to $25 was Fred Hollows, who was the Australian of the Year some years ago. He's passed away now, but his legacy is in this foundation, where if you donate $25 you are giving somebody their sight back. So as you learn to teach computers how to see, Rachel and I are also donating our fees from this to help humans see. We think that's a nice little touch. So we're doing this both to help the Fred Hollows Foundation and, more importantly, to help something we care a lot about, which is making deep-learning more accessible. It is currently terribly exclusive. As I'm sure you've noticed, resources for teaching it tend to be quite mathematically intensive; they really seem to be focused on a certain kind of ivory-tower audience. So we're trying to create training and examples which are for non-machine-learning and non-math experts, dealing with small data sets, giving you applications you can develop quickly.
Today we're going to create a really useful piece of deep-learning code in seven lines of code. We want to get to the point where it is easy for domain experts to work with deep-learning. There are a lot of domain experts here -- whether you're working on getting satellites in the air, or analyzing the results of chemical studies, or analyzing fraud at a bank -- all those people are here in this audience. You are domain experts that we want to enable to use deep-learning. At this stage, the audience for this course is coders, because that's as far as we think we can get at this point. We don't need you to be a math expert, but we do need you to be coders; I know that all of you have been told about that prerequisite. We do hope that with your help we can get to the point where non-coders will also be able to participate. The reason we care about this is that there are problems like improving agricultural yields in the developing world, or making medical diagnostics accessible to folks who don't have them, and so forth. These are things that can be solved with deep-learning. But they are not going to be solved, on the whole, by people at these more ivory-tower institutions, because they are not really that familiar with these problems. The people who are familiar with these problems are the people who work with them every day. For example, I've had a lot to do with these kinds of people at the World Economic Forum: I know people who are trying to help cure TB and malaria, people who are trying to help with agricultural issues in the developing world, and so forth. These are all people who want to be using deep-learning for things like analyzing crop imagery from satellites, or, as in my most recent start-up, analyzing radiological studies to deal with things like the fact that in the entire continent of Africa there are only seven pediatric radiologists.
So most kids in Africa -- in fact, kids in most countries -- have no access to any radiologists, and no access to any kind of modern image-based medical diagnostics. These are the reasons that we're creating and running this course. We hope that the feel of this community is going to be very different from the feel of deep-learning communities before, which have been all about "Let's trim 0.01% off this academic benchmark." This is going to be all about "Let's do shit that matters to people as quickly as possible." [Time: 5 minute mark] Sometimes to do that we're going to have to push the state of the art of the research, and where that happens, we won't be afraid to show you the state of the art. The idea is that by the end of Part 1, you will be able to use all of the current best practices in the most important deep-learning applications. If you stick around for Part 2, you'll be at the cutting edge of research in most of the most important research areas. So we are not dumbing this down; we're just re-focusing it. The reason we're excited about this is that we now have the three critical pieces of this universal learning machine: an infinitely flexible function; all-purpose parameter fitting; and a way to do it that is fast and scalable. The neural network is the function. We are going to learn exactly how neural networks work, but the important thing about a neural network is that it is a universal approximation machine. There's a mathematical proof, the Universal Approximation Theorem, which we're going to learn all about, that tells us this kind of mathematical function is capable of handling any kind of problem we can throw at it -- whether that mathematical function is "How do I translate English into Hungarian", or "How do I recognize pictures of cats", or "How do I identify unhealthy crops".
It can handle any of these things. So given that mathematical function, the second thing you need is some way to fit the parameters of that function to your particular need. There's a very simple way to do that, called "gradient descent", and in particular something called "backward propagation" or "back-prop", which we will learn all about in this lesson and the next. The important thing is that these two pieces together allow us to start with a function that is in theory capable of doing anything and turn it into a function that is in practice capable of doing whatever you want to do, as long as you have data that shows examples of what you want to do. The third piece, which has been missing until very recently, is being able to do this in a way that actually works with the amount of data you have in the time you have available. And this has all changed thanks particularly to GPUs. GPUs are Graphics Processing Units, also called "video cards" (that's kind of an older term now), also called "graphics cards". These are devices inside your computer which were originally designed to play computer games: when you're looking at this alien from the left-hand side and there's light coming from above, what pixel color do I need for each place? That's basically a whole bunch of linear algebra operations, a whole bunch of matrix products. It turns out that those are the same operations we need for deep-learning. So because of the massive amount of money in the gaming industry that was thrown at this problem, we now have incredibly cheap, incredibly powerful cards for figuring out what aliens look like, and we can now use these to figure out medical diagnostics in Africa. It's a nice, handy little side-effect. GPUs are in all of your computers, but not all of your computers are suitable for deep-learning.
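To make the "flexible function plus gradient descent" idea concrete, here is a minimal sketch in plain NumPy. This is my own illustration, not code from the course: the network size, learning rate, and toy problem are all arbitrary choices. A tiny one-hidden-layer network is fitted to a curve by repeatedly nudging every parameter downhill along the gradient that back-prop (the chain rule) computes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: learn y = sin(x) on [-3, 3].
x = np.linspace(-3, 3, 200).reshape(-1, 1)
y = np.sin(x)

# The "infinitely flexible function": a one-hidden-layer neural network.
hidden = 32
w1 = rng.normal(0, 1.0, (1, hidden)); b1 = np.zeros(hidden)
w2 = rng.normal(0, 0.5, (hidden, 1)); b2 = np.zeros(1)

lr = 0.02          # learning rate: how big a downhill step to take
losses = []
for step in range(3000):
    # Forward pass: run the function on the data.
    h = np.tanh(x @ w1 + b1)
    pred = h @ w2 + b2
    err = pred - y
    losses.append((err ** 2).mean())

    # Backward pass (back-prop): the chain rule gives the gradient
    # of the loss with respect to every parameter.
    d_pred = 2 * err / len(x)
    d_w2 = h.T @ d_pred;  d_b2 = d_pred.sum(0)
    d_h = d_pred @ w2.T
    d_z = d_h * (1 - h ** 2)          # derivative of tanh
    d_w1 = x.T @ d_z;     d_b1 = d_z.sum(0)

    # Gradient descent: nudge every parameter a little downhill.
    w1 -= lr * d_w1;  b1 -= lr * d_b1
    w2 -= lr * d_w2;  b2 -= lr * d_b2

print(f"loss went from {losses[0]:.3f} down to {losses[-1]:.3f}")
```

Note that the forward and backward passes are nothing but matrix products and elementwise operations, which is exactly why the graphics hardware discussed above is such a good fit.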
And the reason is that programming a GPU to do deep-learning requires a particular kind of GPU, and in practice at the moment it really requires a GPU from Nvidia, because Nvidia GPUs support a kind of programming called CUDA (which we will be learning about). There are other GPUs that do support deep-learning, but they're a bit of a pain and not very widely used. So one of the things we're going to do is make sure that all of you have access to an Nvidia GPU. The good news is that in the last month (I think) Amazon has made good-quality Nvidia GPUs available to everybody for the first time; they call them, very excitingly, their P2 instances. I've spent the last month making sure that it's really easy to use these new P2 instances, and I've given you all access to a script to do that. Unfortunately, we're still at the point where Amazon doesn't trust people to use these correctly, so you have to ask permission to use these P2 instances. [Time: 10 minute mark] For anybody who does not have an AWS P2 instance or their own GPU server, the Data Institute folks are going to collect all of your AWS IDs, and they have a contact at Amazon who will go through and get them all approved. They haven't made any promises; they've just said they will do what they can. They are aware of how urgent this is, so if you email your AWS ID to Mindy, she will get that organized. We'll come back and look at AWS in more detail very shortly. The other thing I have done is add some information about getting set up to the Installation page on the wiki. There is actually quite an interesting option called OVH. I'm sure by the time this is a MOOC there will be a lot more, but this is the only company I've come across that will give you a by-the-month server with decent deep-learning graphics cards in it, and it's only $200 a month.
To give you a sense of how crazily cheap that is: if you go to their page for GPU servers, you'll see that this GTX970 server is $195 per month and their next cheapest is $2,000 a month. It just so happens that the GTX970 is ridiculously cheap for how good it is at deep-learning. The reason is that deep-learning uses single-precision arithmetic -- less accurate arithmetic. The higher-end cards are designed for things like fluid simulations, tracking nuclear bombs and so on, which require double-precision arithmetic. So it turns out these GTX970s are only good for two things: games and deep-learning. The fact that you can get one of these servers with two GTX970s in it is a really good deal. So one of the things you might consider doing in your team is sharing the cost of one of these. $200 per month is pretty good compared to worrying about starting and stopping your 90-cent-per-hour AWS instance, particularly if AWS takes a while to say yes. How many of you have used AWS before? Maybe a third or a half. AWS is Amazon Web Services; I'm sure most of you, if not all of you, have heard of it. It's basically Amazon making their entire back-end infrastructure available for everybody else to use. Rather than calling it a server, you get something they call an instance; you can think of it as basically the same thing. It's a little computer that you get to use -- in fact, not necessarily little.
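To see what "single precision versus double precision" actually means, here is a small NumPy illustration (my own, not from the lecture). A float32 value carries roughly 7 significant decimal digits, a float64 roughly 16; deep-learning gets by fine with the former, which is exactly the arithmetic gaming cards are built for.

```python
import numpy as np

# Machine epsilon: the smallest relative step each format can resolve.
print(np.finfo(np.float32).eps)   # ~1.19e-07 (single precision, ~7 digits)
print(np.finfo(np.float64).eps)   # ~2.22e-16 (double precision, ~16 digits)

# A tiny increment survives in double precision but rounds away in single.
print(np.float64(1) + np.float64(1e-10) > 1)   # True
print(np.float32(1) + np.float32(1e-10) > 1)   # False
```

A simulation tracking a nuclear stockpile cannot afford to lose that 1e-10; a neural network's weights, which are themselves noisy estimates, can.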