
Hello? Okay, it's after 12, so I want to get started. So today, lecture eight, we're going to talk about deep learning software. This is a super exciting topic because it changes a lot every year, but that also means it's a lot of work to give this lecture, because it changes a lot every year. But as usual, a couple of administrative notes before we dive into the material.

So as a reminder, the project proposals for your course projects were due on Tuesday. Hopefully you all turned those in, and hopefully you all have a somewhat good idea of what kind of project you want to work on for the class. We're in the process of assigning TAs to projects based on the project area and the expertise of the TAs, so we'll have some more information about that in the next couple of days, I think. We're also in the process of grading assignment one, so stay tuned and we'll get those grades back to you as soon as we can.

Another reminder is that assignment two has been out for a while. That's going to be due next week, a week from today, on Thursday. And again, when working on assignment two, remember to stop your Google Cloud instances when you're not working, to preserve your credits. One bit of confusion I just wanted to re-emphasize is that for assignment two you really only need to use GPU instances for the last notebook. All of the other notebooks are just Python and NumPy, so you don't need any GPUs for those questions. So again, conserve your credits; only use GPUs when you need them.

And the final reminder is that the midterm is coming up. It's kind of hard to believe we're there already, but the midterm will be in class on Tuesday, May 9th. The midterm will be more theoretical: it'll be pen and paper, working through slightly more theoretical questions to check your understanding of the material that we've covered so far. And I think we'll probably post at least a short sample of the types of questions to expect.

Question? [student's words obscured due to lack of microphone] Oh yeah, the question is whether it's open-book. We're going to say closed note, closed book. Yeah, that's what we've done in the past: just closed note, closed book. We really just want to check that you understand the intuition behind most of the stuff we've presented.

So, a quick recap of what we were talking about last time. Last time we talked about fancier optimization algorithms for deep learning models, including SGD with momentum, Nesterov momentum, RMSProp, and Adam. And we saw that these relatively small tweaks on top of vanilla SGD are relatively easy to implement, but can make your networks converge a bit faster.
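To make that concrete, here's a minimal NumPy sketch of the vanilla SGD and SGD-with-momentum update rules; the variable names and hyperparameter defaults here are illustrative, not taken from the lecture slides:

```python
import numpy as np

def sgd_step(w, dw, lr=1e-3):
    # Vanilla SGD: step directly down the gradient.
    return w - lr * dw

def sgd_momentum_step(w, dw, v, lr=1e-3, rho=0.9):
    # SGD with momentum: keep a running "velocity" of past gradients
    # (rho plays the role of friction) and step along the velocity,
    # which damps oscillations and can speed up convergence.
    v = rho * v - lr * dw
    return w + v, v

w = np.random.randn(10)   # toy parameter vector
v = np.zeros_like(w)      # momentum buffer, initialized to zero
dw = np.random.randn(10)  # stand-in for a gradient from backprop
w, v = sgd_momentum_step(w, dw, v)
```

RMSProp and Adam follow the same pattern, just with additional per-parameter running averages of squared gradients.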

We also talked about regularization, especially dropout. So remember, with dropout you randomly set parts of the network to zero during the forward pass, and then you marginalize out over that noise at test time. And we saw that this was a general pattern across many different types of regularization in deep learning, where you might add some kind of noise during training, but then marginalize out that noise at test time so the network isn't stochastic at test time.
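As a reminder of what that looks like in code, here's a minimal NumPy sketch of "inverted" dropout, one standard way to implement this train-time-noise, test-time-marginalization pattern; the keep probability `p` is just an illustrative default:

```python
import numpy as np

def dropout(x, p=0.5, train=True):
    if train:
        # Training: randomly zero activations, keeping each with
        # probability p. Dividing by p rescales the survivors so the
        # expected activation matches test time.
        mask = (np.random.rand(*x.shape) < p) / p
        return x * mask
    # Test time: no randomness; the noise has been marginalized out.
    return x
```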

We also talked about transfer learning, where you can download big networks that were pre-trained on some dataset and then fine-tune them for your own problem. And this is one way that you can attack a lot of problems in deep learning, even if you don't have a huge dataset of your own.
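As a sketch of what that workflow can look like, assuming you're using PyTorch and torchvision (one of the frameworks discussed later in this lecture); the choice of ResNet-18 and a 10-class output layer is purely illustrative:

```python
import torch
import torchvision

# Download a network pre-trained on ImageNet.
model = torchvision.models.resnet18(pretrained=True)

# Freeze the pre-trained weights so fine-tuning doesn't disturb them...
for param in model.parameters():
    param.requires_grad = False

# ...then swap in a fresh final layer sized for our own problem;
# only this new layer's weights will be updated during training.
model.fc = torch.nn.Linear(model.fc.in_features, 10)
```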

So today we're going to shift gears a little bit and talk about some of the nuts and bolts of writing software, and how the hardware works, diving into a lot of detail about the software that you actually use to train these things in practice. So we'll talk a little bit about CPUs and GPUs, and then we'll talk about several of the major deep learning frameworks that are out there in use these days.

So first, we've sort of mentioned this offhand a bunch of different times, that computers have CPUs, computers have GPUs, and deep learning uses GPUs, but we weren't really too explicit up to this point about what exactly these things are and why one might be better than another for different tasks.

So, who's built a computer before? Just a quick show of hands. Maybe about a third of you, half of you, somewhere around that ballpark. This is a shot of my computer at home that I built, and you can see that there's a lot of stuff going on inside; hopefully you know what most of these parts are. The CPU, the Central Processing Unit, is this little chip hidden under a cooling fan near the top of the case. The CPU is actually a relatively small piece; it's not taking up a lot of space inside the case. The GPUs, on the other hand, are these two big monster things taking up a gigantic amount of space. They have their own cooling and they draw a lot of power. So just in terms of how much power they're using and how big they are, the GPUs are kind of physically imposing, taking up a lot of room in the case.

So the question is: what are these things, and why are they so important for deep learning? Well, the GPU is called a graphics card, or Graphics Processing Unit. And these were really developed, originally, for rendering computer graphics, especially for games and that sort of thing.

So another show of hands: who plays video games at home from time to time on their computer? Yeah, so again, maybe about half, a good fraction. For those of you who've played video games before and built your own computers, you probably have your own opinions on this debate. [laughs] This is one of those big debates in computing: there's Intel versus AMD for CPUs, and NVIDIA versus AMD for graphics cards. It's up there with Vim versus Emacs for text editors, and pretty much any gamer has their own opinion on which of these two sides they prefer for their own cards. And in deep learning we've mostly picked one side of this fight, and that's NVIDIA. So if you have AMD cards, you might be in a little bit more trouble if you want to use those for deep learning.

And really, NVIDIA has been pushing deep learning a lot in the last several years; it's been a large focus of their strategy, and they've put a lot of effort into engineering good solutions to make their hardware better suited for deep learning. So when most people in deep learning talk about GPUs, they're pretty much exclusively talking about NVIDIA GPUs. Maybe in the future this will change a little bit, and there might be new players coming up, but at least for now NVIDIA is pretty dominant.
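One practical consequence of this is that the major frameworks target NVIDIA's CUDA platform. As a minimal sketch, assuming PyTorch (one of the frameworks covered later) is installed, you can check whether an NVIDIA GPU is visible from Python like this:

```python
import torch

# torch.cuda.is_available() is True only when an NVIDIA GPU
# and a working CUDA setup are both present.
if torch.cuda.is_available():
    # Report the name of the first visible device.
    print("Found GPU:", torch.cuda.get_device_name(0))
else:
    print("No CUDA-capable (NVIDIA) GPU found; running on CPU.")
```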

So to give you an idea of the difference between a CPU and a GPU, I've made a little spreadsheet here. On the top we have two of Intel's kind of top-end consumer CPUs, and on the bottom we have two of NVIDIA's current top-end consumer GPUs. And there are a couple of general trends to notice here. Both CPUs and GPUs are general-purpose computing machines: they can execute programs and perform arbitrary instructions, but they're qualitatively pretty different. So CPUs tend to have just a few cores,