
  • Hi, I'm Adriene Hill, and welcome back to Crash Course Statistics. In previous episodes

  • we've talked about things like cars learning how to drive themselves...and apps that can

  • recognize handwriting and turn it into printed text.

  • A lot of these projects are done using a type of Machine Learning called a Neural Network.

  • The term Neural Network covers a bunch of different--but related--methods that can take

  • in data and spit out useful outputs.

  • Neural networks can output everything from the probability of someone getting a particularly

  • nasty strain of MRSA on their next hospital stay, to new chapters of Harry Potter...seriously.

  • They may even be behind some of the annoying Twitter bots that just seem to spout tweets

  • that rile people up. Today, we're going to take a look at the

  • big picture of what neural networks are, and how they do all these things.

  • INTRO

  • In Crash Course Computer Science, we talked a little bit about what a neural network is.

  • In the simplest sense, a neural network looks at data and tries to figure out the function--or

  • set of calculations--that turns the input... variables...into the output.

  • That output could be a number, a probability, or even something a bit more complicated.

  • Neural networks are analogous to robots that can learn to make things--like a toy car--not

  • by following step-by-step instructions from humans, but by looking at a bunch of toy cars

  • and figuring out for themselves how to turn inputs (like metal and plastic) into outputs (the

  • toy cars)!

  • If we want to work with data instead of toy cars we can use a neural network to predict

  • future salary based on a number of variables such as degree, field, age, years of experience,

  • gender, number of promotions, and university.

  • We feed these variables to the neural network. These circles are called Nodes, and they just

  • hold a value like degree or field.

  • Eventually we want the Neural Network to output its prediction for future salary. So we know

  • there will be one output node at the end of our network that tells us what it predicts

  • the salary will be.

  • At this point, the Neural Network looks kinda like a regression: we have a bunch of inputs...our

  • variables...which are combined in some way to create an output...our predicted value.

  • But unlike most regressions, neural networks feed the weighted sum of age, degree, field,

  • etc. through something called an “activation function” which takes the value and transforms

  • it before returning an output.

  • These activation functions improve the way many neural networks learn, and give them

  • more flexibility to model complex relationships between input and output.

  • One common activation function is called the Rectified Linear Unit (ReLU)--which turns all negative

  • values to 0, and leaves positive ones as they are.

  • This makes these nodes act a little bit like neurons in your brain--hence the name neural

  • network--which require a certain “threshold” of activation before they'll fire. So a

  • node with 0 doesn't fire, or contribute to the output at all. But one with a positive

  • value will.
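
The ReLU behavior described above can be sketched in a couple of lines of Python (a minimal illustration, not tied to any particular neural network library):

```python
def relu(x):
    """Rectified Linear Unit: turn negative values to 0, leave positive ones as they are."""
    return max(0.0, x)

# A node with a negative weighted sum doesn't "fire"...
print(relu(-2.5))  # 0.0
# ...but one with a positive value contributes as-is.
print(relu(3.0))   # 3.0
```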

  • This Neural Network currently has two layers--input and output. But we can add layers between

  • them. So now the inputs are indirectly connected

  • to the output, through the middle layer of nodes.

  • It's pretty clear what the input nodes are, since they're values we understand. And

  • the output node is a salary, so we get that too. But it can be harder to grasp exactly

  • what the middle layers represent.

  • You can think of all the calculations that happen between the input nodes and output

  • nodes as something called “feature generation”. “Feature” is just a fancy word for a variable

  • that can be made up of a combination of other variables.

  • For example, we could use your grades, attendance, and test scores to create a “Feature”

  • called Academic Performance. Essentially the neural network is taking the variables we

  • give it, and performing combinations and calculations to create new values, or “features”.

  • Then, it combines those “features” to create an output.
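
This pipeline--inputs combined into hidden features, features combined into an output--can be sketched with a tiny forward pass. The inputs (say, years of experience and number of promotions) and all the weights below are made up for illustration; a real network would learn its weights from data.

```python
def relu(x):
    return max(0.0, x)

def forward(inputs, hidden_weights, output_weights):
    # Each hidden node takes a weighted sum of the inputs and passes it
    # through the ReLU activation; the output node then takes a weighted
    # sum of the hidden values.
    hidden = [relu(sum(w * x for w, x in zip(ws, inputs)))
              for ws in hidden_weights]
    return sum(w * h for w, h in zip(output_weights, hidden))

inputs = [4.0, 2.0]                           # hypothetical input values
hidden_weights = [[0.5, 1.0], [-1.0, 0.25]]   # one weight list per hidden node
output_weights = [10.0, 5.0]
print(forward(inputs, hidden_weights, output_weights))  # 40.0
```

Note that the second hidden node's weighted sum is negative, so ReLU zeroes it out and it contributes nothing to the output.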

  • When we have large amounts of complex data, the neural network saves us a LOT of time

  • by combining variables and figuring out which ones are important. Neural Networks allow

  • us to make use of data that might seem too big and overwhelming for us to try to use

  • on its own. They can find patterns that humans might never be able to see.

  • If a neural network has more than one layer between its input and output, we say that we're using “Deep Learning”,

  • since there are many layers of nodes. Deep Learning has gained popularity in recent years.

  • Neural networks and deep learning have been used extensively to do things like recognize

  • handwritten numbers and simulate x-ray images so airport security can be trained to recognize

  • items like drugs and guns.

  • There's a lot more math that goes into neural networks. But in short, they learn by figuring

  • out what they got wrong, and then working backwards to determine what values and connections

  • made the output incorrect.

  • For example, if it predicts my salary and is $10,000 off, it will take that difference

  • and figure out which parts of the neural network were influential in creating that $10,000

  • error. It then tweaks them so that next time, it's not as wrong.
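
A minimal sketch of this idea--plain gradient descent on a single weight, not the full backpropagation algorithm--looks like this. The input, target, and learning rate are made-up illustrations:

```python
def train_one_weight(x, target, w=0.0, lr=0.01, steps=500):
    for _ in range(steps):
        prediction = w * x
        error = prediction - target   # how far off was the prediction?
        w -= lr * error * x           # tweak w so next time it's not as wrong
    return w

w = train_one_weight(x=2.0, target=10.0)
print(round(w * 2.0, 2))  # the prediction converges to the target: 10.0
```

Real networks do this for millions of weights at once, using the chain rule to figure out how much each weight contributed to the error.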

  • You can see that in this neural network--sometimes called a Feed Forward Neural Network--all

  • the nodes only feed into the next layer from input to output. Hence, they only Feed information

  • Forward.

  • But it is possible to feed the output of a Neural Net back into the model as an input

  • the next time you run it. In other words, nodes in one layer can be connected to each

  • other, even themselves! These types of Neural Networks are called Recurrent Neural Networks.

  • We can use RNNs to learn patterns. For example, words! RNNs have been used to spell check

  • text. The Network can learn to take in a misspelled word like this... and correct it.

  • Often we use this kind of network when we have sequential data-- like stock prices over

  • time, or the words in a sentence. If you're trying to predict the words in a sentence,

  • it matters a lot what the previous word was.

  • If the previous word was “A”, that influences what the current word is. Usually the word

  • “A” precedes a noun, or an adjective--one that starts with a consonant. A Fox. A Quick,

  • Brown Fox. But it's unlikely to precede a verb. “A walked” wouldn't make sense.

  • But the further you get through the sentence, the less influence the word “A” has.

  • Unlike Feed Forward Neural Networks, Recurrent Neural Networks “remember” the previous

  • outputs. For example, if we used a Recurrent Neural Network to generate a melody, we would

  • give the network some information about our song framework, and we'd ask it for a note.

  • Then we feed that note back into the model along with the information about our song

  • framework and the network would generate the next note.

  • In order to make a melody that sounds good, the Recurrent Neural Network needs to “remember”

  • what the previous notes were. Using the outputs as inputs allows us to do that.
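
The recurrent idea can be sketched in a few lines: each new state mixes the current input with the fed-back previous state, so earlier inputs keep influencing later outputs. The weights below are arbitrary made-up values, not a trained network:

```python
def rnn_step(x, prev_state, w_in=0.5, w_state=0.8):
    """Combine the new input with the fed-back previous state."""
    return w_in * x + w_state * prev_state

state = 0.0
outputs = []
for x in [1.0, 0.0, 0.0]:   # a short sequence; only the first input is nonzero
    state = rnn_step(x, state)
    outputs.append(state)

# The first input still shows up in later outputs, but its influence fades:
print(outputs)  # roughly [0.5, 0.4, 0.32]
```

That fading influence is exactly the “A” effect from the sentence example: the further along the sequence you go, the less the early inputs matter.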

  • A popular type of Recurrent Neural Network called a Long Short-Term Memory Network has

  • been used to generate all kinds of music. It's even been used to write a few new Harry

  • Potter chapters.

  • ahem Here is one of those chapters, from a Recurrent Neural Network trained by Max Deutsch:

  • The Malfoys!” said Hermione.

  • Harry was watching him. He looked like Madame Maxime. When she strode up the wrong staircase

  • to visit himself.

  • “I'm afraid I've definitely been suspended from power, no chance — indeed?” said

  • Snape.

  • He put his head back behind them and read groups as they crossed a corner and fluttered

  • down onto their ink lamp, and picked up his spoon.

  • The doorbell rang. It was a lot cleaner down in London.

  • So, J.K. Rowling isn't out of a job yet. This excerpt doesn't make sense within the

  • context of the Harry Potter universe, or really make sense at all. But it at least has the

  • structure of a book chapter.

  • We can also use Neural Networks to look at another form of art: images. A lot of applications

  • of image recognition use a type of Neural Network called a Convolutional Neural Network.

  • Images are made up of a grid of pixels.

  • A very tiny grayscale image like this could be represented by a grid like this...where

  • each number represents how bright that pixel is. 0 is complete black, 1 is complete

  • white, and anything in between is a shade of gray.

  • Color images are a little more complicated, since each pixel has a red, green, and blue

  • value, but the idea is similar.

  • In this case, a pixel is affected by all the pixels surrounding it. It's not simple sequential

  • data. So, convolutional neural networks look at “windows” of pixels instead of one

  • pixel at a time.

  • They apply a filter to these windows to create “features”. This step is called convolution.

  • The filters that the network uses are just calculations that transform the pixels that

  • are inside the window. The network uses the data to determine which windows and filters

  • will be used.
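
Here's a minimal sketch of convolution on a tiny grayscale image (1.0 is white, 0.0 is black). The 2x2 filter is a made-up edge detector that subtracts each window's right column from its left column; a real network would learn its filter values from the data:

```python
image = [
    [1.0, 1.0, 0.0],
    [1.0, 1.0, 0.0],
    [1.0, 1.0, 0.0],
]
kernel = [
    [1.0, -1.0],
    [1.0, -1.0],
]

def convolve(img, k):
    """Slide the 2x2 filter over every window of the image (no padding)."""
    rows, cols = len(img), len(img[0])
    return [[sum(k[i][j] * img[r + i][c + j]
                 for i in range(2) for j in range(2))
             for c in range(cols - 1)]
            for r in range(rows - 1)]

print(convolve(image, kernel))  # [[0.0, 2.0], [0.0, 2.0]]
# The strong responses (2.0) mark the white-to-black edge in the image.
```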

  • Some filters might help detect edges in the image.

  • Others might recognize features like curves, horizontal lines, or even more complex objects

  • like eyes, or faces. These features make it so we can take an image...which has a huge

  • number of pixels...and make a smaller number of features.

  • This process is called pooling. In the end, the network will use the features generated

  • by convolution and pooling to give us some kind of output, like a decision about whether

  • or not an image contains a stop sign, or a human face.
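
Max pooling, one common form of this shrinking step, can be sketched like so: split the feature map into 2x2 blocks and keep only the largest value in each, preserving the strongest filter responses while reducing the grid size. The feature-map values below are made up for illustration:

```python
def max_pool(features):
    """Downsample by taking the max of each non-overlapping 2x2 block."""
    return [[max(features[r][c], features[r][c + 1],
                 features[r + 1][c], features[r + 1][c + 1])
             for c in range(0, len(features[0]) - 1, 2)]
            for r in range(0, len(features) - 1, 2)]

feature_map = [
    [0.0, 2.0, 0.1, 0.0],
    [1.0, 0.5, 0.0, 0.3],
    [0.2, 0.0, 4.0, 1.0],
    [0.0, 0.1, 0.9, 2.5],
]
print(max_pool(feature_map))  # [[2.0, 0.3], [0.2, 4.0]]
```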

  • Snapchat, for example, has used variations of convolutional neural networks in their

  • app. And these networks are used extensively in all kinds of image recognition.

  • If you hate those CAPTCHAs that ask you to click on each image that has a stop sign,

  • you could use a convolutional neural network to fill them out for you.

  • And the next time you're in another country, you can use Google's Translate app which

  • uses these networks to help translate the text from signs or menus into your language.

  • One thing that limits our use of neural networks of all kinds is a lack of data. The more complex

  • these networks are, the more data they need to perform well.

  • But some neural networks can be trained to generate data. These are called Generative

  • Adversarial Networks (GANs). They use sets of existing data to try to learn how to create

  • new data. These networks are kinda like two neural networks...disguised as one...by wearing

  • a trenchcoat.

  • We'll illustrate how they work with an analogy. Let's say you're a counterfeiter

  • who's trying to make fake $100 bills. You examine a few $100 bills, create a fake,

  • and then try to use it at your local convenience store. If the bill is rejected, you politely

  • ask the cashier what made them realize the bill was fake. And they're happy to help.

  • They tell you. You take this information back to your counterfeiting lab and make a new,

  • better fake $100 bill.

  • You repeat this process over and over--hopefully the cashiers don't start to recognize you...and

  • eventually, you should have a passable fake bill. (Assuming you aren't already in jail.)

  • However, since the cashiers are seeing so many fake bills, they get better at recognizing

  • them as time goes on.

  • In our analogy, you are the generator. Your job is to make fake input...in this case $100

  • bills that are good enough to “trick” the convenience store. The cashier is the

  • discriminator since her job is to learn to discriminate between real and fake $100 bills.

  • Essentially you have two neural networks battling it out to create better and better outputs.

  • The generator is trying to get better and better at making data that can trick the discriminator.

  • And the discriminator is trying to learn how to best discriminate between fake and real

  • data.
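
This feedback loop can be sketched numerically--to be clear, this toy is NOT a real GAN (there are no neural networks in it, and the "discriminator" never learns); it only shows how round-after-round feedback drives the generator's fakes toward the real data. All the numbers are made up:

```python
real_value = 100.0   # what a genuine $100 bill "looks like" to the cashier
fake = 0.0           # the counterfeiter's first, terrible attempt
lr = 0.1             # how much of each round's feedback gets used

for _ in range(100):
    feedback = real_value - fake   # the cashier explains what gave it away
    fake += lr * feedback          # the counterfeiter improves the fake

print(round(fake, 2))  # the fakes end up close to the real thing: 100.0
```

In a real GAN, both sides are neural networks trained simultaneously, and the discriminator's feedback arrives as gradients rather than a simple difference.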

  • These networks have been used to create new anime characters, make new Van Gogh-like

  • art, and create new skate decks.

  • Neural Networks of all kinds help us deal with the big, sometimes messy data that we

  • have in real life. They help detect patterns in data that humans can't see.