  • [MUSIC PLAYING]

  • LILY PENG: Hi everybody.

  • My name is Lily Peng.

  • I'm a physician by training and I work on the Google medical--

  • well, Google AI health-care team.

  • I am a product manager.

  • And today we're going to talk to you about a couple of projects

  • that we have been working on in our group.

  • So first off, I think you'll get a lot of this,

  • so I'm not going to go over this too much.

  • But because we apply deep learning

  • to medical information, I kind of wanted

  • to just define a few terms that get used quite a bit

  • but are somewhat poorly defined.

  • So first off, artificial intelligence-- this

  • is a pretty broad term and it encompasses that grand project

  • to build a nonhuman intelligence.

  • Machine learning is a particular type

  • of artificial intelligence, I suppose,

  • that teaches machines to be smarter.

  • And deep learning is a particular type

  • of machine learning which you guys have probably

  • heard about quite a bit and will hear about quite a bit more.

  • So first of all, what is deep learning?

  • So it's a modern reincarnation of artificial neural networks,

  • which actually were invented in the 1960s.

  • It's a collection of simple trainable units, organized

  • in layers.

  • And they work together to solve or model complicated tasks.
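
That "collection of simple trainable units, organized in layers" can be sketched in a few lines of NumPy (the layer sizes here are made up, and the training step is omitted):

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    # Each unit applies a simple nonlinearity to a weighted sum of its inputs.
    return np.maximum(0.0, x)

def init_layer(n_in, n_out):
    # One layer = a matrix of trainable weights plus a bias vector.
    return rng.normal(0.0, 0.1, (n_in, n_out)), np.zeros(n_out)

# Three layers of simple units, working together on one task.
layers = [init_layer(4, 8), init_layer(8, 8), init_layer(8, 2)]

def forward(x, layers):
    for W, b in layers:
        x = relu(x @ W + b)
    return x

batch = rng.normal(size=(5, 4))  # 5 examples, 4 features each
out = forward(batch, layers)
print(out.shape)
```

Training then adjusts the weights in every layer at once, which is the part that makes the network "learn."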

  • So in general, with smaller data sets and limited compute,

  • which is what we had in the 1980s and '90s,

  • other approaches generally work better.

  • But with larger data sets and larger model sizes

  • and more compute power, we find that neural networks

  • work much better.

  • So there's actually just two takeaways

  • that I want you guys to get from this slide.

  • One is that deep learning trains algorithms

  • that are very accurate when given enough data.

  • And two, that deep learning can do this

  • without feature engineering.

  • And that means without explicitly writing the rules.

  • So what do I mean by that?

  • Well in traditional computer vision,

  • we spend a lot of time writing the rules

  • that a machine should follow to perform a certain prediction task.

  • In convolutional neural networks,

  • we actually spend very little time in feature

  • engineering and writing these rules.

  • Most of the time is spent on data preparation,

  • numerical optimization, and model architecture.

  • So I get this question quite a bit.

  • And the question is, how much data is enough data

  • for a deep neural network?

  • Well in general, more is better.

  • But there are diminishing returns beyond a certain point.

  • And a general rule of thumb is that we

  • like to have about 5,000 positives per class.

  • But the key thing is good and relevant data--

  • so garbage in, garbage out.

  • The model will predict very well what you ask it to predict.
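
That rule of thumb is easy to check against a labeled data set before training; a minimal sketch (the labels and counts here are invented):

```python
from collections import Counter

# Invented label list; in practice this comes from your grading pipeline.
labels = ["no_dr"] * 9000 + ["moderate"] * 6200 + ["proliferative"] * 1400

MIN_POSITIVES = 5000  # the rough rule of thumb above

counts = Counter(labels)
too_few = {cls: n for cls, n in counts.items() if n < MIN_POSITIVES}
print(too_few)  # classes that likely need more good, relevant data
```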

  • So when you think about where machine learning,

  • and especially deep learning, can make the biggest impact,

  • it's really in places where there's

  • lots of data to look through.

  • One of our directors, Greg Corrado, puts it best.

  • Deep learning is really good for tasks that you've done 10,000

  • times, and on the 10,001st time, you're just sick of it and you

  • don't want to do it anymore.

  • So this is really great for health care in screening

  • applications where you see a lot of patients

  • that are potentially normal.

  • It's also great where expertise is limited.

  • So here on the right you see a graph

  • of the shortage of radiologists kind of worldwide.

  • And this is also true for other medical specialties,

  • but radiologists are sort of here.

  • And we basically see a worldwide shortage of medical expertise.

  • So one of the screening applications

  • that our group has worked on is with diabetic retinopathy.

  • We call it DR because it's easier

  • to say than diabetic retinopathy.

  • And it's the fastest growing cause of preventable blindness.

  • All 450 million people with diabetes are at risk and need

  • to be screened once a year.

  • This is done by taking a picture of the back

  • of the eye with a special camera, as you see here.

  • And the picture looks a little bit like that.

  • And so what a doctor does when they get an image like this

  • is they grade it on a scale of one to five from no disease,

  • so healthy, to proliferative disease,

  • which is the end stage.

  • And when they do grading, they look for sometimes very subtle

  • findings, little things called microaneurysms

  • that are outpouchings in the blood vessels of the eye.

  • And that indicates how badly your diabetes

  • is affecting your vision.

  • So unfortunately in many parts of the world,

  • there are just not enough eye doctors to do this task.

  • So with one of our partners in India,

  • or actually a couple of our partners in India,

  • there is a shortage of 127,000 eye doctors in the nation.

  • And as a result, about 45% of patients

  • suffer some sort of vision loss before the disease is detected.

  • Now as you recall, I said that this disease

  • was completely preventable.

  • So again, this is something that should not be happening.

  • So what we decided to do was we partnered

  • with a couple of hospitals in India,

  • as well as a screening provider in the US.

  • And we got about 130,000 images for this first go around.

  • We hired 54 ophthalmologists and built a labeling tool.

  • And then the 54 ophthalmologists actually

  • graded these images on this scale,

  • from no DR to proliferative.

  • The interesting thing was that there was actually

  • a little bit of variability in how doctors called the images.

  • And so we actually got about 880,000 diagnoses in all.
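
Those two numbers imply each image was graded several times over, which is how that grading variability gets captured; a quick back-of-the-envelope check:

```python
images = 130_000
diagnoses = 880_000

# Multiple ophthalmologists graded each image, so diagnoses >> images.
grades_per_image = diagnoses / images
print(round(grades_per_image, 1))
```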

  • And with this labeled data set, we put it through

  • a fairly well-known convolutional neural net.

  • This is called Inception.

  • I think a lot of you guys may be familiar with it.

  • It's generally used to classify cats and dogs for our photo app

  • or for some other search apps.

  • And we just repurposed it to do fundus images.

  • So the other thing that we learned

  • while we were doing this work was

  • that while it was really useful to have

  • this five-point diagnosis, it was also

  • incredibly useful to give doctors

  • feedback on housekeeping predictions like image quality,

  • whether this is a left or right eye,

  • or which part of the retina this is.

  • So we added that to the network as well.

  • So how well does it do?

  • So this is the first version of our model

  • that we published in a medical journal in 2016 I believe.

  • And right here on the left is a chart

  • of the performance of the model in aggregate

  • over about 10,000 images.

  • Sensitivity is on the y-axis, and then 1 minus specificity

  • is on the x-axis.

  • So sensitivity is the proportion of patients

  • who have the disease that the model

  • correctly calls as having the disease.

  • And then specificity is the proportion

  • of patients that don't have the disease that the model

  • or the doctor got right.

  • And you can see you want something

  • with high sensitivity and high specificity.

  • And so up and to the right--

  • or up and to the left is good.
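
Those definitions reduce to a couple of ratios over a confusion matrix; a minimal sketch with invented counts:

```python
# Invented counts for a binary "disease vs. no disease" call.
tp, fn = 90, 10    # diseased patients: caught vs. missed
tn, fp = 180, 20   # healthy patients: correctly cleared vs. wrongly flagged

sensitivity = tp / (tp + fn)  # fraction of diseased patients the model catches
specificity = tn / (tn + fp)  # fraction of healthy patients the model clears

# The ROC chart described above plots sensitivity against 1 - specificity,
# so a perfect model sits at the top-left corner.
print(sensitivity, 1 - specificity)
```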

  • And you can see here on the chart

  • that the little dots are the doctors that

  • were grading the same set.

  • So we get pretty close to the doctor.

  • And these are board-certified US physicians.

  • And these are ophthalmologists, general ophthalmologists

  • by training.

  • In fact if you look at the F score, which

  • is a combined measure of sensitivity and precision,

  • we're just a little better than the median ophthalmologist

  • in this particular study.
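
For a binary call, the F score is conventionally the harmonic mean of sensitivity (recall) and precision (positive predictive value); a minimal sketch, with made-up numbers rather than the paper's actual results:

```python
def f_score(precision, recall):
    # Harmonic mean: a model can't buy a high F score by
    # maximizing one measure at the expense of the other.
    return 2 * precision * recall / (precision + recall)

# Made-up values purely for illustration.
model = f_score(precision=0.90, recall=0.90)
median_ophthalmologist = f_score(precision=0.91, recall=0.86)
print(model > median_ophthalmologist)
```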

  • So since then we've improved the model.

  • So last year about December 2016 we were sort of on par

  • with generalists.

  • And then this year--

  • this is a new paper that we published--

  • we actually used retinal specialists

  • to grade the images.

  • So they're specialists.

  • We also had them argue when they disagreed

  • a