  • Rachel and I started fast.ai with the idea of making neural networks uncool again.

  • It is a grand plan, to be sure, because they are apparently terribly cool.

  • But really there are some things that we want to see improve and that's why we're doing

  • this course.

  • We're actually not making any money out of this course.

  • We're donating our fees both to the diversity fellowships that we're running and also to

  • the Fred Hollows Foundation.

  • I would like to give a quick pitch for the Fred Hollows Foundation (for those

  • of you who aren't aware of it), because as you know, deep-learning is fantastic for computer

  • vision; it's basically allowing computers to see for the first time.

  • What you might not realize is that there are something like 3 or 4 million people in the

  • world who can't see because they have something called "cataract blindness".

  • Cataract blindness can be cured for $25 per eye and actually the group of people who got

  • that price down from thousands of dollars to $25 was Fred Hollows, who was the Australian

  • of the Year some years ago.

  • He's passed away now, but his legacy is in this foundation where if you donate $25, you

  • are giving somebody their sight back.

  • So, as you learn to teach computers how to see, Rachel and I are also donating

  • our fees from this to helping humans to see.

  • So we think that's a nice little touch.

  • So we're doing this both to help the Fred Hollows Foundation but more importantly to

  • help something we care a lot about, which is making deep-learning more accessible.

  • It is currently terribly exclusive.

  • As I'm sure you've noticed, resources for teaching it tend to be quite mathematically

  • intensive -- they really seem to be focused on a certain kind of ivory-tower-type audience,

  • so we're trying to create training and examples which are for people who are not machine-learning or math

  • experts, dealing with small data sets, and giving role-model applications you can develop quickly.

  • Today we're going to create a real useful piece of deep-learning code in seven lines

  • of code.

  • We want to get to the point where it is easy for domain experts to work with deep-learning.

  • There are a lot of domain experts here -- whether you're working with getting satellites in

  • the air, or whether you're working with analyzing the results of chemical studies, or whether

  • you're analyzing fraud at a bank -- all those people are here in this audience.

  • You are domain experts that we want to enable to use deep-learning.

  • At this stage, the audience for this course is coders because that's as far as we think

  • we can get at this point.

  • We don't need you to be a math expert, but we do need you to be coders.

  • I know that all of you have been told of that prerequisite.

  • We do hope that with your help we can get to the point where non-coders will also be

  • able to participate in it.

  • The reason why we care about this is that there are problems like improving agricultural

  • yields in the developing world, or making medical diagnostics accessible to folks that

  • don't have them or so forth.

  • These are things that can be solved with deep learning.

  • But they are not going to be solved by people who are at these kinds of more ivory-tower

  • firms on the whole because they are not really that familiar with these problems.

  • The people who are familiar with these problems are the people who work with them every day.

  • So for example, I've had a lot to do with these kinds of people at the World Economic

  • Forum. I know people who are trying to help cure TB and malaria, and I know people who are

  • trying to help with agricultural issues in the developing world and so forth.

  • These are all people who want to be using deep-learning for things like analyzing crop

  • imagery from satellites, or, as at my most recent start-up, analyzing radiological

  • studies using deep-learning to deal with things like the fact that in the entire continent

  • of Africa there are only seven pediatric radiologists.

  • So most kids in Africa, and in most countries all kids, have no access to any radiologists and

  • have no access to any kind of modern image-based medical diagnostics.

  • So these are the reasons that we're creating and running this course.

  • We hope that the feel of this community is going to be very different from the feel

  • of earlier deep-learning communities, which have been all about "Let's trim 0.01%

  • off this academic benchmark."

  • This is going to be all about "Let's do shit that matters to people as quickly as possible."

  • [Time: 5 minute mark]

  • Sometimes to do that we're going to have to push the state-of-the-art of the research.

  • And where that happens, we won't be afraid to show you the state-of-the-art of the research.

  • The idea is that by the end of Part 1 of this, you will be able to use all of the current

  • best practices in the most important deep-learning applications.

  • If you stick around for Part 2, you'll be at the cutting edge of research in most of

  • the most important research areas.

  • So, we are not dumbing this down; we're just re-focusing it.

  • The reason why we're excited about this is that we now have the three pieces of this

  • universal learning machine.

  • We now have the three critical pieces -- an infinitely flexible function, all-purpose

  • parameter fitting, and a way of doing it that is fast and scalable.

  • The neural network is the function.

  • We are going to learn exactly how neural networks work.

  • But the important thing about neural networks is that they are universal approximation machines.

  • There's a mathematical proof, the Universal Approximation Theorem, that we're going to

  • learn all about which tells us that this kind of mathematical function is capable of handling

  • any kind of problem we can throw at it.

  • Whether that mathematical function is "How do I translate English into Hungarian", or

  • whether that mathematical function is "How do I recognize pictures of cats", or whether

  • that mathematical function is "How do I identify unhealthy crops".

  • It can handle any of these things.

  • So with that mathematical function, then the second thing you need is some way to fit the

  • parameters of that function to your particular need.

  • And there's a very simple way to do that, called "gradient descent" and in particular,

  • something called "backward propagation" or "back-prop" which we will learn all about

  • in this lesson and the next lesson.

  • The important thing, though, is that these two pieces together allow us to start with a function

  • that is in theory capable of doing everything and turn it into a function that is in practice

  • capable of doing whatever you want to do, as long as you have data that shows examples

  • of what you want to do.
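
  • [Illustration, not from the talk: a minimal sketch of gradient descent in plain Python, fitting the two parameters of a straight line to example data. The data, learning rate, step count, and the name fit_line are all assumptions made for this example. Back-prop applies the same idea through the layers of a neural network.]

```python
def fit_line(xs, ys, lr=0.01, steps=5000):
    """Fit y = a*x + b by gradient descent on the mean squared error."""
    a, b = 0.0, 0.0  # start from an arbitrary guess
    n = len(xs)
    for _ in range(steps):
        # Gradients of mean((a*x + b - y)**2) with respect to a and b.
        grad_a = sum(2 * (a * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (a * x + b - y) for x, y in zip(xs, ys)) / n
        # Step each parameter a little way downhill.
        a -= lr * grad_a
        b -= lr * grad_b
    return a, b


# Hypothetical example data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
a, b = fit_line(xs, ys)
print(a, b)  # should be close to 2 and 1
```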

  • The third piece, which has been missing until very recently, is being able to do this in

  • a way that actually works with the amount of data that you have in the time you have

  • available.

  • And this has all changed thanks particularly to GPUs.

  • So GPUs are Graphics Processing Units, also called "video cards" (that's kind of an older

  • term now), also called "graphics cards".

  • And these are devices inside your computer which were originally designed to play computer

  • games.

  • So it's kind of like: when you're looking at this alien from the left-hand side and there's

  • light coming from above, what pixel color do I need for each place?

  • That's basically a whole bunch of linear algebra operations, a whole bunch of matrix products.

  • It turns out that those are the same operations we need for deep-learning.
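
  • [Illustration, not from the talk: a small sketch in plain Python showing one matrix product routine used for a graphics-style rotation and for the weighted sums of a neural network layer. The matrices and the name matmul are assumptions made for this example; a GPU runs very many of these multiply-and-add steps in parallel.]

```python
def matmul(a, b):
    """Multiply matrix a (n x k) by matrix b (k x m), both given as lists of rows."""
    return [
        [sum(a[i][t] * b[t][j] for t in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


# Graphics: rotate the point (1, 0) by 90 degrees counter-clockwise.
rotation = [[0, -1], [1, 0]]
point = [[1], [0]]
rotated = matmul(rotation, point)

# Deep learning: the weighted sums of one layer are weights times inputs.
weights = [[0.5, -1.0], [2.0, 1.0]]
inputs = [[3.0], [4.0]]
layer_output = matmul(weights, inputs)

print(rotated, layer_output)
```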

  • So because of the massive amount of money in the gaming industry that was thrown at

  • this problem, we now have incredibly cheap, incredibly powerful cards for figuring out

  • what aliens look like.

  • And we can now use these, therefore, to figure out medical diagnostics in Africa.

  • So, it's a nice, handy little side-effect.

  • GPUs are in all of your computers, but not all of your computers are suitable for deep-learning.

  • And the reason is that programming a GPU to do deep-learning really requires a particular

  • kind of GPU, and in practice at the moment, it really requires a GPU from Nvidia, because

  • Nvidia GPUs support a kind of programming called CUDA (which we will be learning about).

  • There are other GPUs that do support deep-learning, but they're a bit of a pain and they're not very

  • widely used.

  • And so one of the things that we're going to be doing is making sure that all of you

  • guys have access to an Nvidia GPU.

  • The good news is that in the last month (I think) Amazon has made available good-quality

  • Nvidia GPUs for everybody for the first time.

  • They call them, very excitingly, their P2 instances.

  • So I've spent the last month making sure that it's really easy to use these new P2 instances.

  • I've given you all access to a script to do that.

  • Unfortunately, we're still at the point where they don't trust people to use these correctly,

  • so you have to ask permission to use these P2 instances.

  • [Time: 10 minute mark]

  • For anybody who does not have an AWS P2 instance or their

  • own GPU server, the Data Institute folks are going to collect all of your AWS IDs, and they have a contact

  • at Amazon who will go through and get them all approved.

  • They haven't made any promises, they've just said they will do what they can.

  • They are aware of how urgent that is, so if you email your AWS ID to Mindy, she will get

  • that organized.

  • And we'll come back and look at AWS in more detail very shortly.

  • The other thing that I have done is add some information to the wiki about getting

  • set up, under Installation.

  • There is actually quite an interesting option called OVH.

  • I'm sure by the time that this is a MOOC there will be a lot more, but this is the only company

  • I've come across who will give you a by-the-month server with decent deep-learning graphics

  • cards on it, and it's only $200.

  • To give you a sense of how crazily cheap that is, if you go to their page for GPU servers,

  • you'll see that this GTX970 is $195 per month and their next cheapest is $2000 a month.

  • It just so happens that this GTX970 is ridiculously cheap for how good it is at deep-learning.

  • The reason is that deep-learning uses single-precision arithmetic -- it uses less accurate arithmetic.

  • These higher-end cards are designed for things like fluid simulations, tracking nuclear bombs

  • and stuff like that, that require double-precision arithmetic.
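
  • [Illustration, not from the talk: a sketch of what "less accurate" means here. Single precision stores a number in 4 bytes and double precision in 8. Python's own float is double precision, so the hypothetical helper to_single uses the standard struct module to round a value to single precision and measure what is lost.]

```python
import struct


def to_single(x):
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


single_bytes = struct.calcsize("<f")  # storage for one single-precision number
double_bytes = struct.calcsize("<d")  # storage for one double-precision number

x = 0.1
single_error = abs(to_single(x) - x)  # rounding error introduced by single precision

print(single_bytes, double_bytes, single_error)
```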

  • So it turns out these GTX970s are only good for two things: games and deep-learning.

  • So the fact that you can get one of these things which has got two GTX970s in it is

  • a really good deal.

  • So one of the things you might consider doing in your team is maybe sharing the cost of

  • one of these things.

  • $200 per month is pretty good compared to worrying about starting and stopping your

  • 90 cent per hour AWS instance, particularly if AWS takes a while to say yes.
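
  • [Illustration, not from the talk: a rough break-even calculation for the two prices just mentioned. It assumes a 30-day month and ignores storage, data transfer, and the hardware differences between the two options, so treat it as a back-of-the-envelope sketch only.]

```python
monthly_server_cost = 200.0   # dollars per month, the by-the-month server price quoted above
hourly_instance_cost = 0.90   # dollars per hour, the AWS instance price quoted above

# Hours of instance time per month at which the two options cost the same.
break_even_hours = monthly_server_cost / hourly_instance_cost

# The same figure spread over an assumed 30-day month.
break_even_hours_per_day = break_even_hours / 30

print(round(break_even_hours, 1), round(break_even_hours_per_day, 1))
```

  • [On these assumptions, the monthly server works out cheaper once you use more than roughly 220 instance hours a month, a bit over 7 hours a day.]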

  • How many of you people have used AWS before?

  • Maybe a third or a half.

  • AWS is Amazon Web Services.

  • I'm sure most of you, if not all of you, have heard of it.

  • It's basically Amazon making their entire back-end infrastructure available to everybody

  • else to use.

  • Rather than calling it a server, you get something they call an instance.

  • You can think of it as basically being the same thing.

  • It's a little computer that you get to use.

  • In fact, not necessarily little.