
  • LAURENCE MORONEY: Hi, and welcome to episode 5

  • of our Natural Language Processing with TensorFlow

  • series.

  • In this video, we're going to take a look

  • at how to manage the understanding of context

  • in language across longer sentences, where we can see

  • that the impact of a word early in the sentence

  • can determine the meaning and semantics

  • of the end of the sentence.

  • We'll use something called an LSTM, or Long Short Term

  • Memory, to achieve this.

  • So for example, if we're predicting text

  • and the text looks like this--

  • today has a beautiful blue something--

  • it's easy to predict that the next word is probably sky,

  • because we have a lot of context close to the word,

  • and most notably the word blue.

  • But what about a sentence like this one--

  • I lived in Ireland, so I learned how to speak something?

  • How do we predict the something?

  • The correct answer, of course, is Gaelic, not Irish,

  • but that's close enough.

  • And you and I could interpret that, but how do we do that?

  • What's the keyword that determines this answer?

  • Of course, it's the word Ireland, because in this case,

  • the country determines the language.

  • But the word is very far back in the sentence.

  • So when using a recurrent neural network,

  • this might be hard to achieve.

  • Remember, the recurrent neural networks we've been looking at

  • are a bit like this, where there's

  • a neuron that can learn something and then pass context

  • to the next timestep.

  • But over a long distance, this context can be greatly diluted,

  • and we might not be able to see how meanings in faraway words

  • dictate overall meaning.

  • The LSTM architecture might help here,

  • because it introduces something called a cell state, which

  • is a context that can be maintained

  • across many timesteps, and which

  • can bring meaning from the beginning of the sentence

  • to bear.

  • It can learn that Ireland denotes Gaelic as the language.

  • What's fascinating is that it can also be bi-directional,

  • where it might be that later words in the sentence

  • can also provide context to earlier ones

  • so that we can learn the semantics of the sentence

  • more accurately.

  • I won't go into the specifics of LSTMs in this video,

  • but if you want to learn how they work in depth,

  • the Deep Learning Specialization from deeplearning.ai

  • is a great place to go.

  • So we've seen in theory how they work.

  • But what does this look like in code?

  • Let's dive in and take a look.

  • Let's consider how we would use an LSTM in a classifier

  • like the sarcasm classifier we saw in an earlier video.

  • It's really quite simple.

  • We first define that we want an LSTM-style layer.

  • This takes a numeric parameter for the number of hidden nodes

  • within it, and this is also the dimensionality of the output

  • space from this layer.
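
As a rough sketch, that first step might look like the following Keras model, in the spirit of the sarcasm classifier; the vocabulary size, embedding dimension, and dense-layer sizes here are placeholder values rather than the exact ones from the earlier video.

```python
import tensorflow as tf

# Placeholder sizes, not the exact values from the earlier sarcasm video.
vocab_size = 10000
embedding_dim = 64

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embedding_dim),
    tf.keras.layers.LSTM(64),  # 64 hidden units, which is also this layer's output dimensionality
    tf.keras.layers.Dense(24, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
```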

  • If you want it to be bi-directional,

  • you can then wrap this layer in a Bidirectional like this,

  • and you're good to go.
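
As a sketch, only the LSTM line changes from the model above; vocab_size and embedding_dim are the same placeholder values used earlier.

```python
# The same sketch with the LSTM wrapped in a Bidirectional layer.
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embedding_dim),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),  # forward and backward passes over the sequence, merged
    tf.keras.layers.Dense(24, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
```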

  • Remember that this will look at your sentence

  • forwards and backwards, learn the best parameters for each,

  • and then merge them.

  • It might not always be best for your scenario,

  • but it is worth experimenting with.

  • LSTMs can use a lot of parameters,

  • as a quick look at this model summary can show you.

  • Note that the output of the LSTM layer is 128,

  • because we're doing a bi-directional using 64

  • in each direction.
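
A quick way to see this is to call summary() on the bidirectional model sketched above; the 128 comes from concatenating the 64 forward and 64 backward outputs.

```python
model.summary()  # the Bidirectional(LSTM(64)) layer reports an output dimension of 128: 64 forward + 64 backward
```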

  • You can, of course, also stack LSTM layers

  • so that the outputs of one layer get fed into the next,

  • a lot like with dense layers.

  • Just be sure to set return_sequences to True on all layers

  • that are feeding another.

  • So in a case like this, where we have two,

  • the first should have it.

  • If you have three LSTM layers stacked,

  • the first two should have it, and so on.

  • And a summary of this model will show the extra parameters

  • that the extra LSTMs give.
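
As a sketch, a stacked version of the model above might look like this; only the first bidirectional LSTM sets return_sequences=True, because it is the one feeding another LSTM, and the layer sizes remain placeholder values.

```python
# Two stacked bidirectional LSTMs; vocab_size and embedding_dim as in the earlier sketch.
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embedding_dim),
    tf.keras.layers.Bidirectional(
        tf.keras.layers.LSTM(64, return_sequences=True)),  # emits per-timestep outputs for the next LSTM
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(32)),  # last LSTM keeps the default return_sequences=False
    tf.keras.layers.Dense(24, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
model.summary()  # shows the extra parameters the additional LSTM introduces
```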

  • So now you've seen the basis of recurrent neural networks,

  • including long short term memory ones.

  • You've also seen the steps in pre-processing text

  • for training a neural network.

  • In the next video, you'll put all of this

  • together and start with a very simple neural network

  • for predicting and thus creating original text.

  • I'll see you there.

  • And for more videos on AI in TensorFlow,

  • don't forget to hit that Subscribe button.

  • [MUSIC PLAYING]
