
  • (upbeat ambient music)

  • - I'm hoping that I'm gonna tell you something

  • that's interesting and, of course,

  • I have this very biased view,

  • which is I look at things from my computational lens

  • and are there any computer scientists in the room?

  • I was anticipating not, but okay, there are,

  • so there's one, maybe every now

  • and then I'll ask you a question,

  • no, no, no, I'm just kidding, but, so,

  • and then so my goal here is gonna be to basically,

  • actually just give you a flavor of what is machine learning,

  • this is my expertise, and so just, actually,

  • again, to get a sense of who's in the room,

  • like, if I picked on someone here,

  • like raise your hand if you would be able to answer

  • that question, like, what is machine learning?

  • Okay, a handful, no, actually one, or two.

  • Great, okay, so I just want to give you a sense

  • of that, and I'm gonna, you know,

  • most of this is gonna be pretty intuitive,

  • I'll try to make little bits of it concrete

  • that I think will be helpful,

  • and then I'll tell you how we use machine learning

  • to improve guide designs, specifically

  • for knockdown experiments, but I think a lot

  • of it is probably useful for more than that,

  • but we haven't sort of gone down that route,

  • and so I can't say very much about that.

  • And please interrupt me if something doesn't make sense

  • or you have a question, I'd rather do

  • that so everybody can kind of stay on board rather

  • than some, you know, it makes less

  • and less sense the longer I go.

  • Alright, so machine learning, actually, during my PhD,

  • the big, one of the big flagship conferences was peaking

  • at around 700 attendees, and when I go now,

  • it actually is capped, like, it's sold out at 8,000 like,

  • months in advance, 'cause this field is just like,

  • taken off, basically it's now lucrative for companies,

  • and it's become a really central part of Google,

  • Microsoft, Facebook, and all the big tech companies,

  • so this field has changed a lot,

  • and kind of similar to CRISPR,

  • there's an incredible amount of hype and buzz

  • and ridiculous media coverage and

  • so it's a little bit funny, in fact,

  • that I'm not working at these two kind of,

  • very hyped up areas.

  • But anyway, so, you know,

  • people in just the mainstream press now,

  • you're always hearing about artificial intelligence

  • and deep neural networks, and so these are like,

  • so I would say machine learning is a sub-branch

  • of artificial intelligence,

  • and a deep neural network is sort

  • of an instance of machine learning, and so like,

  • what really is this, this thing?

  • So it kind of overlaps sometimes

  • with traditional statistics, but the,

  • like, in terms of the machinery,

  • but the goals are very different and,

  • but, really like the core, fundamental concept here is

  • that we're gonna sort of posit some model, so maybe like,

  • think linear regression is a super simple model,

  • and you can like, expose it to data, it has some parameters,

  • right, the weights, and then we essentially want

  • to fit those weights, and that's the training,

  • that's literally the machine learning.

  • So I'm sorry if that sounds super simple

  • and not like, God-like, like machine learning

  • and everything working magically,

  • but that really is what it is,

  • and, right, and so let me just also give you like,

  • sort of drive home that point.

  • So we're gonna posit some sort of model,

  • and so here I'm giving you the simplest example

  • because I think most people here work

  • with linear regression at some point in their life,

  • and so you can think of this as a predictive model

  • in the sense that if I give it a bunch

  • of examples of Y and X, and I learn the parameter of beta,

  • then for future examples where I don't have Y

  • but I only have X, I can just compute,

  • X times beta, and I get a prediction of Y.

  • So that's the sense in which I call this a predictive model,

  • and that's very much how machine learning people tend

  • to think of it, where statisticians are often very focused

  • on what is beta, what are the confidence intervals

  • around beta and things like this.

  • So like, there's, that's the sense

  • in which there's a lot of overlap,

  • but the goals are kind of quite different.

  • We want to like, use real data

  • and make predictions, so here it's gonna be predictions

  • about guides, and which guides are effective

  • at cutting and at knockout.

  • Right, and so it has these free parameters,

  • and we call these things that we put in here features,

  • and so in the case of guide design,

  • the question is gonna be, what features are we gonna put

  • in there that allow us to make these kinds of predictions,

  • and, so I'm gonna get into that in a little bit,

  • but just as an example to make this concrete,

  • it might be how many GCs are in this 30mer guide,

  • or guide plus context.
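
As a sketch of what computing a feature like that might look like in code (the 30mer below is a made-up sequence, not a real guide):

```python
def gc_count(seq):
    """Count the G and C bases in a guide (plus context) sequence."""
    return sum(1 for base in seq.upper() if base in "GC")

def gc_fraction(seq):
    """Fractional GC content, usable as a model feature."""
    return gc_count(seq) / len(seq)

# Hypothetical 30mer: a 20nt guide plus flanking context
guide_30mer = "ACGTACGTACGGCCGGAATTCCGGACGTAC"
print(gc_count(guide_30mer), gc_fraction(guide_30mer))  # 18 0.6
```

Either the raw count or the fraction could go into the feature vector; the fraction has the advantage of being comparable across sequences of different lengths.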

  • Right, and like I said, we're gonna call,

  • we're gonna give it some data,

  • and so in this case, the data for guide design is gonna be

  • data from (mumbles), there's a community

  • that's now publicly available where there are examples,

  • for example, what the guide was

  • and how effective the knockout was,

  • or what the cutting frequency was.

  • So I get a bunch of these examples,

  • and then that's gonna enable me

  • to somehow find a good beta, and of course we're not,

  • actually, we do sometimes use linear regression,

  • but I'll tell you a little bit more about,

  • more sort of complex and richer models

  • that let us do a lot more, and then the goal is going

  • to be to fit this beta in a good way,

  • and like, I'm not gonna do some deep dive on that here,

  • but the one way that you're probably familiar

  • with is just mean squared error,

  • and when you find the beta that minimizes this

  • for your example training data,

  • then you get some estimate of beta

  • and you hope that on unseen examples

  • when you do X times beta, it gives you a good prediction.
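
To make that concrete, here's a minimal numpy sketch of fitting beta by minimizing mean squared error; the numbers are toy values standing in for real guide features and measured knockout efficacies:

```python
import numpy as np

# Toy training data: rows are examples, columns are features
# (first column is a constant intercept term)
X = np.array([[1.0, 0.2],
              [1.0, 0.5],
              [1.0, 0.9],
              [1.0, 0.4]])
y = np.array([1.2, 1.8, 2.6, 1.6])

# Least-squares fit: the beta that minimizes mean squared error
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# For a new example where we only have X, predict Y as X times beta
x_new = np.array([1.0, 0.7])
y_hat = x_new @ beta
```

The training step is the `lstsq` call; everything after that is just the "X times beta" prediction described above.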

  • So does that sort of make it somewhat concrete,

  • what I mean by a predictive model

  • and how you could view linear regression

  • as a predictive model in how you might use this

  • for guide design?

  • Okay, so obviously I'll tell you a lot more.

  • So, right, but linear regression is just sort

  • of the simplest possible example,

  • and so in our work we actually use,

  • some of the time, what are called classification

  • or regression trees, and so in contrast

  • to here where you might have, say,

  • this, you might have a bunch of these features,

  • right, like how many GCs were in my guide,

  • and then another feature might be,

  • was there an A in position three,

  • and you can put in as many as you want,

  • and then you get all these betas estimated.

  • So it's very simple, because in that case,

  • none of these features can interact with each other,

  • right, you just, you know, you just add X one times beta one

  • plus X two times beta two, so we call this like,

  • a linear additive model.

  • In contrast, these trees allow very sort

  • of deep interactions among the features,

  • so this might be how many GCs,

  • so, of course, this is just, I didn't,

  • this is not suited to the features I just described,

  • but this might be some feature like,

  • I don't know, proportion of GCs,

  • 'cause now it's fractional, and then it,

  • this algorithm, which is gonna train the betas,

  • so find a good value of beta, well, sort of

  • through a procedure that I'm not gonna go into detail

  • for all these models, how it works,

  • but it's going to somehow look at the data

  • and determine that it should first split

  • on the second feature at this value,

  • and then it will sort of keep going down that.

  • It says, "Now partition the examples

  • "in my training data like this."

  • And then on the second feature in this way,

  • until you end up at the sort of leaves of this tree,

  • and these leaves are the predictions.

  • And so when you do it for the training data,

  • whichever training examples here end up at this leaf,

  • you basically take their mean,

  • and that's now the prediction for that leaf,

  • and if you take a new example,

  • you basically just pipe it through this,

  • these sort of rules, and you end up

  • with that kind of prediction.
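
As a toy sketch of that piping-through process: the split features, thresholds, and leaf values below are all made up for illustration, where in practice the training algorithm would choose them from the data and each leaf value would be the mean outcome of the training examples landing there.

```python
def predict(tree, x):
    """Pipe an example down the split rules until a leaf (a number) is reached."""
    while isinstance(tree, dict):
        if x[tree["feature"]] <= tree["threshold"]:
            tree = tree["left"]
        else:
            tree = tree["right"]
    return tree

# Hypothetical fitted tree: first split on GC fraction (feature 0),
# then on an indicator feature like "A at position 3" (feature 1).
tree = {
    "feature": 0, "threshold": 0.5,
    "left": 0.2,                          # low-GC guides: mean activity 0.2
    "right": {
        "feature": 1, "threshold": 0.5,
        "left": 0.6,                      # high GC, no A at position 3
        "right": 0.9,                     # high GC, A at position 3
    },
}

predict(tree, [0.7, 1.0])  # high GC with the A -> leaf value 0.9
```

Note how the second split only matters for high-GC guides; that's the sense in which trees capture interactions among features that a linear additive model can't.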

  • This is simplified, but I think it's a good conceptual view,

  • and this is just another way of thinking

  • about it is if you only had two features

  • and you drew them, like, one against the other,

  • then effectively, every time you make a branch here,

  • you're kind of cutting up this space.

  • So that's also just another way to think about it.

  • And so, also, so this is, now all over the press nowadays,

  • and whenever I give these talks,

  • there's a bunch of young, hungry grad students who say,

  • "Did you do deep neural networks?"

  • 'Cause that's what everybody wants to do now,

  • and so deep neural networks, they're kind

  • of like a really fancy linear regression.

  • So you could think of these as the Xs in linear regression,

  • you can think of this as,

  • imagine there's only one thing out here,

  • I should have done a different picture, but that's just Y.

  • And so this again is a mapping where you give it the Xs,

  • you do a bunch of stuff, and out you get a Y here,

  • except linear regression, you know,

  • is this very simple thing,

  • and now you can see, there's all these other kinds

  • of, we call these like, hidden nodes,

  • and so there's this complicated mess now of parameters,

  • beta, and again, I'm not gonna go into it,

  • I just want to give you a sense that like,

  • linear regression is this very simple thing,

  • and there's a lot of other models that let you do much,

  • much better prediction, and these are typically the kinds

  • of models that we use, because they're more powerful

  • if we care about prediction.
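
As a toy sketch of the "fancy linear regression" view: each hidden node takes a linear combination of the inputs and passes it through a nonlinearity, and the output is then a linear combination of the hidden nodes. All the weights below are random stand-ins for parameters that would be fit from training data.

```python
import numpy as np

def relu(z):
    """A common nonlinearity applied at each hidden node."""
    return np.maximum(z, 0.0)

def forward(x, W1, b1, W2, b2):
    """One-hidden-layer network: Xs feed hidden nodes, hidden nodes feed Y.
    Without the hidden layer this collapses to plain linear regression."""
    h = relu(W1 @ x + b1)   # hidden nodes: nonlinear combinations of features
    return W2 @ h + b2      # output: linear combination of hidden nodes

# Made-up parameters; training would fit these to data
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)
W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)

y_hat = forward(np.array([0.6, 0.1, 0.9]), W1, b1, W2, b2)
```

Even this tiny network has 4x3 + 4 + 1x4 + 1 = 21 parameters for 3 inputs, which hints at why interpreting which input feature "mattered" gets hard as the model grows.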

  • But the flip side is they're actually very hard

  • to interpret, and so if you want to ask a question,

  • like, was it the GC feature that is most important

  • in your guide design, which is what I always,

  • you know, get a question like this,

  • and we can do our best to kind

  • of poke and prod at this machine,

  • but it's always a little bit ad hoc,

  • it's hard, the more complicated the model,

  • then the, you know, the better we might predict

  • and the less interpretable it is,

  • and so there's always this kind of tension.

  • So right, and so what are some of the challenges?

  • So I've sort of shown you sort of some,

  • like, increasing amount of complexity in some models,

  • and so one of the big difficulties is,

  • if I posit a very complex model

  • with a lot of parameters, then I need a lot

  • of data in order to actually fit those parameters,

  • and if I don't have enough data,

  • what's gonna happen, is the parameters,

  • you're gonna find this very specific setting

  • of the parameters that effectively memorize

  • the training data, and the problem then is you

  • give it a new example that you really care about,

  • you say, I want to knock out that gene,

  • and it's never seen that gene,

  • 'cause it's sort of memorized, but we say it's like,

  • overfit to the data, it doesn't actually generalize well

  • to these unseen examples.
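
Here's a toy numpy illustration of that memorization problem: an over-parameterized polynomial can drive its training error to essentially zero by threading through a handful of noisy points, which is exactly the overfitting described above. The data here is synthetic, generated from a truly linear relationship plus noise.

```python
import numpy as np

rng = np.random.default_rng(1)
x_train = np.linspace(0.0, 1.0, 8)
y_train = 2.0 * x_train + rng.normal(scale=0.1, size=8)  # linear truth + noise
x_test = np.linspace(0.05, 0.95, 8)                      # unseen examples
y_test = 2.0 * x_test

def fit_and_score(degree):
    """Fit a polynomial of the given degree to the training data,
    then return (training error, error on unseen test points)."""
    coeffs = np.polyfit(x_train, y_train, degree)
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    return train_err, test_err

# A degree-7 polynomial through 8 points interpolates them exactly:
# near-zero training error, but it has memorized the noise, so it
# typically generalizes worse than the simple degree-1 fit.
```

The degree-7 model has as many parameters as there are training examples, the situation described above where a very complex model with too little data finds a very specific parameter setting that memorizes the training set.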