  • Alexa.

  • How do we add something to my shopping list?

  • According to wikiHow: how to make a shopping list.

  • That's not what I meant.

  • He thinks he recently.

  • No, I mean, that is actually a very useful thing if you didn't know how to make a shopping list, but it's not to do with my shopping list, is it?

  • Paper your friend Alexa, stop today I think we're talking about.

  • Well, voice interfaces, I think, and the Amazon Echo, which uses the Alexa service, which I'm sure a lot of viewers will have used, or have one, or something like that.

  • We've got it on mute at the minute.

  • Yeah, because, you know, it would be quite irritating if it wasn't on mute, especially talking about Alexa without actually addressing Alexa.

  • People in our studies, actually, the people who are using the devices, have to work out ways to talk about it without addressing it.

  • So how'd she had become something you gotta do?

  • In essence, we say something: ask a question, ask for information, perhaps want to play a game or something like that.

  • The device hears it.

  • Then it responds and, you know, gives us the information we want.

  • Perhaps add something to our shopping list or whatever it might be, you know, gives you directions, you know, going to some place on the map, whatever.

  • And that's essentially what happens.

  • So why don't you ask something to Alexa?

  • Alexa, what is Computerphile?

  • The definition of computer file is: computer science, a file maintained in computer-readable form.

  • Question.

  • Now, that's a tricky one, because there's ambiguity there, right? Because what I have asked about, what I'm talking about, is

  • Computerphile,

  • one word, and it's taken it as being two words, right?

  • Yeah, yes.

  • So we think we're in one situation, but Alexa thinks we're in another situation.

  • So now it's just triggered again.

  • Thanks for that. Alexa, stop.

  • In this description,

  • I'm gonna go through kind of some of the basic things that happen, but obviously it's much more complex than this. Let's take the example of a shopping list. With Alexa, you can maintain a shopping list.

  • You can add stuff, remove stuff, whatever it might be.

  • So what we're gonna do is I'm going to say something like, you know, Alexa, could you tell me what's on my shopping list.

  • So this is us saying something.

  • See?

  • Smiley face, speech bubble. Draw a speech bubble.

  • You know, this is a waveform, effectively, when it's picked up by the device.

  • Sound, the sound wave.

  • What the device is going to do is going to take this stuff, and it's going to run it through automatic speech recognition.

  • ASR. This is detecting what was said, essentially. The first thing the ASR is doing,

  • but local to the device, is picking up the wake word, "Alexa".

  • The wake word is the first thing that's being detected.

  • So there is some kind of onboard speech recognition going on on the device itself to work out when "Alexa" is being said.

  • Now, the rest of this stuff, the "Could you tell me what's on my shopping list?", that's being shipped off to the cloud, with speech recognition being done on the rest of this sentence that we're saying to the device, and

  • that's kind of passing through these very sophisticated, complex deep learning models.

  • You know, this is one of the major innovations of these devices: actually having ASR that works pretty well. I'm not saying it works for everyone, but it works well for a lot of people, at least compared to how things used to be.
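
The split described here, wake-word spotting on the device and full recognition in the cloud, can be sketched roughly as follows. This is a toy under stated assumptions, not how the Echo is actually implemented: word tokens stand in for audio frames, and `words_after_wake_word` and `cloud_asr_stub` are hypothetical names invented for illustration.

```python
from typing import List, Optional

WAKE_WORD = "alexa"  # wake word mentioned in the transcript


def words_after_wake_word(words: List[str]) -> Optional[List[str]]:
    """On-device step: return the words after the wake word, or None if absent.

    Assumption: word tokens stand in for audio frames; a real device
    runs a small acoustic model on the waveform instead.
    """
    for index, word in enumerate(words):
        if word.lower().strip(",.?!") == WAKE_WORD:
            return words[index + 1:]
    return None


def cloud_asr_stub(words: List[str]) -> str:
    """Hypothetical stand-in for cloud speech recognition: joins the tokens."""
    return " ".join(words)


heard = "Alexa, could you tell me what's on my shopping list?".split()
request = words_after_wake_word(heard)
# Nothing is sent to the (stubbed) cloud unless the wake word was detected.
transcript = cloud_asr_stub(request) if request is not None else None
print(transcript)
```

The point of the gate is that only speech following the wake word ever leaves the device in this sketch; anything else returns `None` and is dropped.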

  • So it's shipping all this stuff off into the cloud.

  • Could you tell me what's on my shopping list?

  • It's transcribing that into a bit of text, essentially.

  • So I've got this bit of text that says, "Could you tell me what's on my shopping list?"

  • And now we need to do something with that text.

  • We need to make sense of it in some way, and one of the first stages that it goes through is something called natural language processing, NLP, or natural language understanding.

  • And this is taking this text and breaking it up into things that are meaningful from the point of view of the system, essentially from the point of view of the Alexa service.

  • And that's not gonna be everything, but some stuff is gonna get chucked away.

  • So I would guess, and it is just my guess, at the things that are happening in the natural language processing, or natural language understanding, element of the cycle, if you like.

  • Things like "shopping list": that's something that it knows about. Perhaps, you know, "could you" is not really necessarily that useful,

  • so it's probably being chucked away. That comes across almost like a politeness.

  • Yeah, yeah, yeah, yeah, exactly.

  • So it's kind of redundant from the point of view of the system, although if we talk about actual conversation and talk, it's certainly not redundant in actual talk.

  • It's meaningful.

  • Probably something like "tell me" is a significant phrase, again, that's being parsed out and noticed by the parser, and maybe some of the other bits, like "my shopping list".

  • So rather than Sean's shopping list or something like that. So it's parsing these things out.

  • That sentence is starting to decompose into things that are meaningful from the system point of view.
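
A minimal sketch of that decomposition, assuming a purely rule-based parser: politeness fillers are thrown away, a command phrase becomes an intent, and known phrases become slots. The speaker is explicitly guessing at what the real service does, and so is this code. The tables `FILLERS`, `INTENTS` and `ENTITIES`, and the intent names, are hypothetical and chosen only to match the spoken example.

```python
import re
from dataclasses import dataclass, field
from typing import Dict

# All three tables are hypothetical, chosen to match the spoken example.
FILLERS = ["could you", "please"]
INTENTS = {"tell me": "ReadList", "add": "AddItem"}
ENTITIES = ["shopping list"]


@dataclass
class Parse:
    intent: str = "Unknown"
    slots: Dict[str, str] = field(default_factory=dict)


def contains(phrase: str, text: str) -> bool:
    """True if the phrase occurs in the text on word boundaries."""
    return re.search(r"\b" + re.escape(phrase) + r"\b", text) is not None


def parse(text: str) -> Parse:
    # Lowercase and strip punctuation, keeping apostrophes ("what's").
    cleaned = re.sub(r"[^\w\s']", "", text.lower())
    # Chuck away the politeness fillers.
    for filler in FILLERS:
        cleaned = re.sub(r"\b" + re.escape(filler) + r"\b", " ", cleaned)
    result = Parse()
    # First matching command phrase decides the intent.
    for phrase, intent in INTENTS.items():
        if contains(phrase, cleaned):
            result.intent = intent
            break
    for entity in ENTITIES:
        if contains(entity, cleaned):
            result.slots["target"] = entity
    # "my" marks the list as the speaker's, not someone else's.
    if contains("my", cleaned):
        result.slots["owner"] = "speaker"
    return result


parsed = parse("Could you tell me what's on my shopping list?")
print(parsed.intent, parsed.slots)
```

Production systems use learned models rather than lookup tables; the tables just make the "keep what is meaningful to the system, discard the rest" idea visible.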

  • After this stage, what we're now looking at is, you know, we've got to do something with these bits of sentence that we've parsed.

  • This is obviously a simplification, and I'm sure there are lots of other architectures around that are different to this one.

  • But I'm just gonna go with kind of what I know.

  • Then there's something called a dialogue manager, and this thing is taking all those bits, the parsed bits.

  • We know something about, you know, the subject and the object, or whatever it might be.

  • In this case, you know, the meaningful thing might be "shopping list", and that kind of command, "tell me", and the fact that it's my shopping list, not someone else's.

  • And the dialogue manager is taking all these bits and pieces, and it's got to come up with the next response.

  • Sorry, a response to it.

  • In this case, it might be, you know, "There's nothing currently on your shopping list", or whatever Alexa actually says as a result of that command.

  • So, you know, it's got to generate something.

  • But in the course of doing so, the dialogue manager's gonna do all sorts of other stuff.

  • It's got to kind of put you, perhaps, in some kind of conversational flow, or like a state. It's going to be looking at what stage you're at in the current, assumed from the system's point of view, conversational state. And then it's got to draw on other resources as well. With a shopping list,

  • something about Amazon's services comes into play, so, you know, where it's storing this information.

  • I've no idea where it's storing it; in the cloud somewhere.

  • But it's retrieving that information about what's actually on the shopping list.

  • This is, like, data. There might be other stuff that we're looking up, like, perhaps, web resources.

  • You know, if you asked for information about a particular topic, it's gotta scrape that stuff off

  • the web, or grab it somehow.

  • It's gonna feed that into the dialogue manager, which is doing this kind of generating of

  • next responses.

  • So, like, in a simplified version, there's something about the kind of state we're in, the conversational flow.

  • Perhaps there's more questions that the device is going to ask after this, or whatever it might be, to clarify things.

  • There's some other resources that it might be drawing on to feed data into the response.

  • And then it's generating a response, which might be, as I said, you know, "You have no items on your shopping list." That's kind of what it's coming out with, but then that's just the, you know, text output.
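
The three jobs given to the dialogue manager here (tracking conversational state, drawing on a data resource, and generating a text response) might be sketched as below. Everything in it is an assumption for illustration: the class, the state names, the intent names, and the in-memory dictionary standing in for wherever the service really keeps the list.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DialogueManager:
    # Stand-in for wherever the service stores lists; the real storage is unknown.
    lists: Dict[str, List[str]] = field(default_factory=dict)
    state: str = "idle"  # hypothetical conversational state

    def respond(self, intent: str, slots: Dict[str, str]) -> str:
        target = slots.get("target", "shopping list")
        items = self.lists.setdefault(target, [])
        if intent == "ReadList":
            self.state = "idle"
            if not items:
                return f"You have no items on your {target}."
            return f"On your {target}: " + ", ".join(items) + "."
        if intent == "AddItem":
            if "item" not in slots:
                # A clarifying question keeps the conversation open.
                self.state = "awaiting_item"
                return "What would you like to add?"
            items.append(slots["item"])
            self.state = "idle"
            return f"Added {slots['item']} to your {target}."
        self.state = "idle"
        return "Sorry, I don't know that one."


manager = DialogueManager()
print(manager.respond("ReadList", {"target": "shopping list"}))
print(manager.respond("AddItem", {"target": "shopping list"}))
print(manager.respond("AddItem", {"target": "shopping list", "item": "milk"}))
print(manager.respond("ReadList", {"target": "shopping list"}))
```

The `awaiting_item` state is the "more questions to clarify things" case: the reply is itself a question, and the manager remembers that it is waiting for an answer.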

  • It's got to say this.

  • So the next stage is to do text-to-speech. That's gonna generate speech based on what this text is: "Your shopping list is empty."

  • There's a whole load of complex stuff around speech generation.

  • You know, there's a whole area of research, um, about how you actually go about doing that, which is, again, very sophisticated and complex.

  • And then it comes out of the device, the Echo, as the response.

  • And so we hear it.
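
Put end to end, the cycle described so far is a chain of four stages: speech recognition, language understanding, dialogue management, and text-to-speech. The sketch below wires four deliberately trivial stubs together to show only the shape of that chain. Every function name is hypothetical, the "audio" is just a string, and the "speech" is just bytes.

```python
from functools import reduce
from typing import Any, Callable, Dict, List


def asr_stub(audio: str) -> str:
    # Assumption: the "audio" is already a string; real ASR maps a waveform to text.
    return audio.lower().rstrip("?.!")


def nlu_stub(text: str) -> Dict[str, str]:
    # Keep only what is meaningful to the system: here, a single intent.
    known = "tell me" in text and "shopping list" in text
    return {"intent": "ReadList" if known else "Unknown"}


def dialogue_stub(parse: Dict[str, str]) -> str:
    if parse["intent"] == "ReadList":
        return "Your shopping list is empty."
    return "Sorry, I don't know that one."


def tts_stub(text: str) -> bytes:
    # Stand-in for synthesised audio: just the UTF-8 bytes of the reply.
    return text.encode("utf-8")


STAGES: List[Callable[[Any], Any]] = [asr_stub, nlu_stub, dialogue_stub, tts_stub]


def run_pipeline(audio: str) -> bytes:
    """Feed the output of each stage into the next, in order."""
    return reduce(lambda data, stage: stage(data), STAGES, audio)


print(run_pipeline("Could you tell me what's on my shopping list?"))
```

As the speaker notes, real architectures vary; a strict left-to-right chain is only the simplified picture drawn in the video.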

  • The really heavy stuff is this stuff here, the ASR stuff, the speech recognition stuff.

  • That's where you need a whole ton of data and quite, you know, significant models that you've learned, into which you then put an input, which is these bits of audio, and you get output, you know, which is these bits of text, and there's obviously a confidence associated with that.
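
One common way to represent "text with a confidence" is a list of candidate transcriptions, each with a score, from which the system keeps the best one if it is confident enough. The sketch below assumes that representation; the scores are made up, and the two candidates echo the one-word-versus-two-words mix-up from earlier in the conversation.

```python
from typing import List, Optional, Tuple

Hypothesis = Tuple[str, float]  # (transcribed text, confidence in [0, 1])


def pick_transcript(n_best: List[Hypothesis], threshold: float = 0.5) -> Optional[str]:
    """Return the most confident transcription, or None if it is below the threshold."""
    text, confidence = max(n_best, key=lambda hypothesis: hypothesis[1])
    return text if confidence >= threshold else None


# Made-up scores for illustration only.
n_best = [("what is computer file", 0.62), ("what is computerphile", 0.35)]
print(pick_transcript(n_best))
print(pick_transcript([("what is computerphile", 0.35)]))
```

Returning `None` for a low-confidence result is one possible design choice; it gives the later stages a chance to ask the user to repeat themselves instead of acting on a poor guess.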

  • That's a whole massive, complex area.

  • So that's the stuff that really relies on, you know, this kind of significant computing power.

  • Also, you know, the parsing: they're gonna be wanting to update that over time, so that has to be a service. But in terms of the text-to-speech, they want to kind of update it and change it, and pushing out updates to the device might be a problem. That might be why you would actually be shipping audio.

  • That's something we could find out.

  • We'll be told by my commenters.

  • Sure.

  • The reason for doing this rather artificial example is to say, Oh dear, does this matter?

  • We have got a sentence that makes perfect sense to us.

  • One picture of Sean, right?

  • So maybe what Miles gets put over here near May, which is not so good, but we'll get to that, and then you're put over here like this.
