
  • [MUSIC PLAYING]

  • SANDEEP GUPTA: My name is Sandeep Gupta.

  • I'm a product manager at Google.

  • YANNICK ASSOGBA: And I'm Yannick Assogba,

  • and I'm a software engineer on the TensorFlow.js team.

  • SANDEEP GUPTA: And we are here to talk to you today

  • about machine learning and JavaScript.

  • So the video that you just saw, this

  • was our very first AI-inspired Google Doodle.

  • And it was able to bring machine learning

  • to life in a very fun and creative way

  • to millions of users.

  • And what users were able to do with this

  • is that you're able to use a machine learning model directly

  • running in the browser that was able to synthesize

  • a Bach-style harmony.

  • And what made this possible was this library called

  • TensorFlow.js.

  • So TensorFlow.js is an open-source library for machine

  • learning in JavaScript.

  • It's part of the TensorFlow family of products,

  • and it's built specifically to make it easier for JavaScript

  • developers to build and use machine

  • learning models within their JavaScript applications.

  • You use this library in one of three ways.

  • You can use one of the pre-existing pre-trained models

  • that we provide, and directly run them within your JavaScript

  • applications.

  • You can use one of the models that we have packaged for you,

  • or you can take pretty much any TensorFlow model that you have

  • and use a converter and run it with TensorFlow JavaScript.

  • You can use a previously-trained model,

  • and then retrain it with your own data to customize it,

  • and this is often useful to solve the problem that's

  • of interest to you.

  • This is done using a technique called transfer learning.

  • And then lastly, it's a full-featured JavaScript library

  • that lets you write and alter models

  • directly with JavaScript.

  • And so you can create a completely new model

  • from scratch.

  • Today in this talk, we will talk a lot about the first

  • and the third one of these.

  • For the re-training examples, there

  • are a bunch of these on our website and in the codelabs,

  • and we encourage you to sort of take a look after the talk.

  • The other part is that JavaScript

  • is a very versatile language, and it works

  • on a variety of platforms.

  • So you can use TensorFlow.js on all of these platforms.

  • We see a ton of use cases in the browser,

  • and it has a lot of advantages because, you know,

  • the browser is super interactive.

  • You have easy access to sensors, such as webcam and microphone,

  • which you can then bring into your machine learning models.

  • And also we use WebGL-based acceleration.

  • So if you have a GPU in your system,

  • you can take advantage of that and get

  • really good performance.

  • TensorFlow.js will also run server-side using Node.js.

  • It runs on a variety of mobile platforms in iOS and Android

  • using mobile web platforms, and also it

  • can run in desktop applications using Electron.

  • And we'll see, later in the talk, more examples of this.

  • So we launched TensorFlow.js a year ago, last March,

  • and then earlier this year at our developer summit

  • we released version 1.0.

  • And we have been amazed to see really good adoption and usage

  • by the community, and some really good sort of popularity

  • numbers.

  • We are really, really excited to see

  • more than 100 external contributors who

  • are contributing to and making the library better.

  • So for those of you who are in the audience or those

  • of you listening, thank you very much from all

  • of the TensorFlow.js team.

  • So let's dive a little bit deeper into the library

  • and see how it is used.

  • OK, I'm going to start with looking

  • at some pre-trained models first.

  • So I want to show you a few of these today.

  • So we have packaged a collection

  • of pre-trained models for use out of the box

  • to solve some of the most common types of ML problems

  • that you might encounter.

  • These work with images.

  • So we have models for tasks such as image classification,

  • detecting objects, segmenting objects, and finding boundaries

  • of objects, recognizing human gesture and human

  • pose from image or video data.

  • We have a few speech audio models,

  • which work with speech commands to recognize spoken words.

  • We have a couple of text models for analyzing, understanding,

  • and classifying text.

  • All of these models are packaged with very easy-to-use wrapped

  • APIs for easy consumption in JavaScript applications.

  • You can either NPM install them, or you can directly

  • use them from our hosted scripts with nothing to install.
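As a concrete example of the hosted-script route described above, a minimal page might look like the following sketch. The CDN URLs shown are the jsDelivr paths commonly used in the TensorFlow.js documentation, and the model method names have varied across releases, so treat the details as illustrative rather than as the exact code from the talk:

```html
<!-- Sketch: using a pre-trained model from hosted scripts,
     with nothing to install. URLs and method names are
     illustrative; check the model's README for current ones. -->
<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs"></script>
<script src="https://cdn.jsdelivr.net/npm/@tensorflow-models/body-pix"></script>

<img id="photo" src="person.jpg" />

<script>
  async function run() {
    // Load the pre-trained BodyPix model...
    const net = await bodyPix.load();
    // ...then segment the person in the chosen image.
    const segmentation = await net.segmentPerson(
      document.getElementById('photo'));
    console.log(segmentation.width, segmentation.height);
  }
  run();
</script>
```

The alternative is `npm install @tensorflow/tfjs @tensorflow-models/body-pix` and importing the packages into a bundled application; the calls themselves stay the same.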

  • So let's take a look at two examples.

  • The first model I want to show you is an image-based model.

  • It's called BodyPix.

  • So this is the model that lets you take image data,

  • and it finds whether there is a person in that image or not.

  • And if there is a person, it will segment out

  • the boundary of that person.

  • So it will label each pixel as whether it

  • belongs to the person or not.

  • And you can also do body part segmentation.

  • So it can further divide up the pixels that belong to a person

  • into one of 24 body parts.

  • So let's take a look at what the code looks like

  • and how you would use a model like this.

  • So you start by loading the library

  • and by loading the model using the script

  • tag from our hosted scripts.

  • You choose an image file.

  • You can load it from disk or you could point to a webcam element

  • to load it from the webcam.

  • And once you have an image, then you

  • create an instance of the BodyPix model,

  • and you call its person segmentation method

  • on the image that you have chosen.

  • Because this runs asynchronously,

  • you wait for the result and we do that

  • by using the await keyword.

  • So once you get back the segmentation result,

  • it returns an object, and this object

  • has the width and the height of the image,

  • and also a binary array of zeros and ones, in which the pixels where

  • the person is found are labeled, and you see that

  • in that image on the right.

  • You could also use the body parts segmentation method

  • instead of the person segmentation

  • method, in which case you would get the sub-body part

  • classification.

  • The model is packaged with a set of utility functions

  • for rendering, and here you see the example

  • of the drawPixelatedMask() function,

  • which produces this image on the right.
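Since the segmentation result is just a plain object, you can also work with the mask directly. The following is a minimal sketch assuming the shape described above (`width`, `height`, and a flat binary array, here called `data`); the helper name is our own, not part of the BodyPix API:

```javascript
// Sketch: consuming the segmentation result described above.
// `data` is a flat array with one entry per pixel,
// 1 where a person was found and 0 elsewhere.
function countPersonPixels(segmentation) {
  let count = 0;
  for (const v of segmentation.data) {
    if (v === 1) count += 1;
  }
  return count;
}

// Tiny 2x2 "image": a person was found in two of the four pixels.
const fake = { width: 2, height: 2, data: [0, 1, 1, 0] };
console.log(countPersonPixels(fake)); // → 2
```

A ratio of person pixels to total pixels, for example, is a quick way to decide whether a person is close enough to the camera before doing more expensive work.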

  • OK, so this is how you would use one of these image-based models

  • directly in your web application.

  • The second model I want to show you is a speech commands model.

  • So this is an audio model

  • that will listen to microphone data,

  • and try to recognize some spoken words.

  • So you can use this to build voice controls and interfaces

  • or to recognize words for translation

  • and other types of applications.
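Under the hood, a recognizer like this reports a score for each word in its vocabulary, and the application picks the highest-scoring one. A minimal sketch of that step, with a hypothetical helper (`pickWord`) and the four-word vocabulary from the demo; the real speech-commands package delivers the scores through a listener callback, which is omitted here:

```javascript
// Sketch: mapping per-word scores to a label. The score array is
// ordered like the model's word list; `pickWord` is our own helper,
// not part of the speech-commands API.
const wordList = ['up', 'down', 'left', 'right'];

function pickWord(scores, words) {
  // Find the index of the highest score (argmax).
  let best = 0;
  for (let i = 1; i < scores.length; i++) {
    if (scores[i] > scores[best]) best = i;
  }
  return words[best];
}

console.log(pickWord([0.05, 0.1, 0.8, 0.05], wordList)); // → left
```

An application would then map the winning word to an action, such as showing a matching emoji, as in the demo that follows.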

  • So let me quickly switch to the demo laptop.

  • So we have a small Glitch application written,

  • which uses the speech commands model,

  • and we are using a version of a pre-trained model, which

  • is trained on a vocabulary of just four simple words-- up,

  • down, left and right.

  • So when I click start and I can speak these words,

  • this application will display a matching emoji.

  • So let's try it out.

  • Left.

  • Up.

  • Left.

  • Down.

  • Down.

  • OK.

  • Right.

  • Left.

  • Up.

  • There we go.

  • We can go back to the screen.

  • This actually points to what you would frequently encounter

  • with machine learning models.

  • There are a lot of other factors that you have to account for--

  • things like background noise, and whether the training data

  • adequately represents the type of data that the model

  • will encounter in real life.