

  • ♪ (music) ♪

  • Hello, everyone.

  • First, thanks, everyone, for coming to the Dev Summit.

  • And second, thanks for staying around this long.

  • I know it's been a very long day.

  • And there has been a lot of information that we've been throwing at you.

  • But we've got much, much more and many more announcements to come.

  • So please stick with me.

  • My name is Clemens, and this is Raz.

  • We're going to talk about TensorFlow Extended today.

  • But before we do this, I'm going to do a quick survey.

  • Can I get a quick show of hands?

  • How many of you do machine learning in a research or academic setting?

  • Okay.

  • Quite a big number.

  • Now how many of you do machine learning in a production setting?

  • Okay.

  • That looks about half-half.

  • Obviously, also a lot of overlap.

  • So for those of you who do machine learning in a production setting,

  • how many of you agree with this statement?

  • Yeah? Some? Okay.

  • I see a lot of hands coming up.

  • So everyone that I speak with who's doing machine learning

  • in production agrees with this statement:

  • "Doing machine learning in production is hard," and it's too hard.

  • Because after all, we actually want to democratize machine learning

  • and enable more and more people to deploy machine learning

  • in their products.

  • One of the main reasons why it's still hard is that, in addition

  • to the actual machine learning--

  • this small orange box where you actually use TensorFlow,

  • where you may use Keras to put together your layers

  • and train your model--

  • you need to worry about so much more.

  • There's all of these other things that you have to worry about

  • to actually deploy machine learning in a production setting

  • and serve it within your product.

  • Now the good news is that this is exactly

  • what TensorFlow Extended is about.

  • TFX in [inaudible] Google is an [inaudible] machine learning

  • platform that allows our developers to go all the way from data to production

  • and serving machine learning models

  • as fast as possible.

  • Now, before we introduced TFX,

  • we saw that going through this process

  • of writing some of these components-- some of them didn't exist before--

  • gluing them together, and actually getting to

  • a launch took anywhere between six and nine months,

  • sometimes even a year.

  • Once we deployed TFX and allowed developers to use it,

  • in many cases, people could use this platform and get up and running

  • with it in a day, and actually get to a deployable model in production

  • on the order of weeks, or in just a month.

  • Now, TFX is a very large system and platform that consists

  • of a lot of components and a lot of services,

  • so, unfortunately, I can't talk about all of it in the next 25 minutes.

  • So we're only going to be able to cover a small part of it, but we're talking

  • about the things that we've already open-sourced and made available to you.

  • First, we're going to talk about TensorFlow Transform

  • and show you how to apply transformations on your data

  • consistently between training and serving.

  • Next, Raz is going to introduce you to a new product that we're open sourcing

  • called TensorFlow Model Analysis.

  • We're going to give a demo of how all of this works together end to end

  • and then make a broader announcement of our plans for TensorFlow Extended

  • and sharing it with the community.

  • Let's jump into TensorFlow Transform first.

  • So, a typical ML pipeline that you may see in the wild

  • is that, during training,

  • you usually have a distributed data pipeline that applies transformations

  • to your data.

  • Because you usually train on a large amount of data,

  • this needs to be distributed,

  • and you run this pipeline

  • and sometimes materialize the output before you actually

  • put it into your trainer.

  • Now at serving time,

  • we need to find a way to somehow replay those exact transformations online

  • as a new request comes in and needs to be sent to your model.

  • There's a couple of challenges with this.

  • The first one is, usually those two things are very different code paths.

  • The data distribution systems that you would use for batch processing

  • are very different from the libraries and tools that you would use

  • to transform data in real time to make a request to your model.

  • Now we have two different code paths.

  • Second, in many cases, it's very hard to keep those two in sync.

  • I'm sure a lot of you have seen this.

  • You change your batch processing pipeline and introduce a new feature or change

  • how it behaves, and you somehow need to make sure that the code

  • that you actually use in your production system is changed

  • at the same time and is kept in sync.

  • The third problem is, sometimes you actually want to deploy

  • your TensorFlow machine learning model in many different environments.

  • You want to deploy it on a mobile device; you want to deploy it on a server;

  • maybe you want to put it in a car; now suddenly you have

  • three different environments where you have to apply

  • these transformations, and maybe there's different languages

  • that you use for those, and it's also very hard

  • to keep those in sync.

  • And this introduces something that we call training/serving skew,

  • where the transformations that you do at training time may be different

  • from the ones in serving time, which usually leads to bad quality

  • of your serving model.

  • TensorFlow Transform addresses this by helping you write

  • your data processing job at training time,

  • so it actually helps you create those data pipelines to do those

  • transformations, and at the same time,

  • it emits a TensorFlow graph that can be

  • in-lined into your training model and also your serving model.

  • Now what this does is, it actually hermetically seals the model,

  • and your model takes a raw data request as input,

  • and all of the transformations are actually happening

  • within the TensorFlow graph.

  • This has a lot of advantages. One of them is that you no longer

  • have any code in your serving environment that does these

  • transformations because they're all being done in the TensorFlow graph.

  • Another one is wherever you deploy this TensorFlow model,

  • all of those transformations are applied in a consistent way,

  • no matter where this graph is being evaluated.
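
For illustration only (this code is not shown in the talk): a minimal sketch of how an emitted transform graph can be attached to a model with today's tensorflow_transform library, assuming a hypothetical transform output directory produced by an earlier TF Transform run, using the Estimator-era serving-input pattern.

```python
import tensorflow as tf
import tensorflow_transform as tft

# Hypothetical path; in practice this is the directory written out by a
# TF Transform run.
tft_output = tft.TFTransformOutput('/tmp/transform_output')

def serving_input_fn():
    # Requests arrive as raw, untransformed features...
    raw_features = {'x': tf.compat.v1.placeholder(tf.float32, [None])}
    # ...and the transform graph emitted by TF Transform is in-lined here,
    # so the served model hermetically contains every transformation.
    transformed_features = tft_output.transform_raw_features(raw_features)
    return tf.estimator.export.ServingInputReceiver(
        transformed_features, raw_features)
```

Because the transformations live inside the exported graph, the same model can be dropped onto a server, a mobile device, or a car without reimplementing any preprocessing.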

  • Let's see what that looks like.

  • This is a code snippet of a pre-processing function

  • that you would write with TF Transform.

  • I'm just going to walk you through what happens here

  • and what we need to do for this.

  • First thing we do is normalize this feature.

  • As all of you know, in order to normalize a feature,

  • we need to compute the mean and the standard deviation,

  • and to actually apply this transformation, we need to subtract the mean

  • and divide by the standard deviation.

  • So what has to happen is, for the input feature X,

  • we have to compute these statistics, which is a trivial task

  • if the data fits on a single machine; you can do it easily.

  • It's a non-trivial task if you have a gigantic training data set

  • and actually have to compute these metrics effectively.

  • Once we have these metrics we can actually apply this transformation

  • to the feature.

  • This is to show you that the output of this transformation can then be,

  • again, multiplied with another tensor--

  • which is just a regular TensorFlow transformation.

  • And then in order to bucketize a feature, you also again need to compute

  • the bucket boundaries to actually apply this transformation.

  • And again, this is a distributed data job to compute those metrics over the result

  • of an already transformed feature-- which is another benefit--

  • to then actually apply this transformation.

  • The next examples just show you that in the same function you can apply

  • any other tensor-in, tensor-out function, and there are also some

  • of what we call mappers in TF Transform that don't require this analyze phase.

  • So, N-grams doesn't require us to actually run a data pipeline

  • to compute anything.
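
A minimal sketch of the kind of preprocessing function being walked through here, assuming the tensorflow_transform API (imported as tft) and hypothetical feature names x, y, and s:

```python
import tensorflow as tf
import tensorflow_transform as tft

def preprocessing_fn(inputs):
    # Feature names and parameters here are illustrative only.
    x, y, s = inputs['x'], inputs['y'], inputs['s']

    # Analyze phase: computing the mean and standard deviation of x is a
    # full pass over the dataset; the results are injected into the graph
    # as constants, and applying them is pure TensorFlow.
    x_normalized = tft.scale_to_z_score(x)

    # The output of a transformation is an ordinary tensor, so it can be
    # multiplied with another tensor using regular TensorFlow ops.
    y_scaled = x_normalized * y

    # Bucketizing needs another analyze phase-- here over an already
    # transformed feature-- to compute the bucket boundaries.
    y_bucketized = tft.bucketize(y_scaled, num_buckets=10)

    # Mappers such as ngrams are instance-to-instance transformations
    # and need no analyze phase or data pipeline.
    s_ngrams = tft.ngrams(tf.compat.v1.string_split(s),
                          ngram_range=(1, 2), separator=' ')

    return {
        'x_normalized': x_normalized,
        'y_bucketized': y_bucketized,
        's_ngrams': s_ngrams,
    }
```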

  • Now what happens here is that these orange boxes

  • are what we call analyzers.

  • We realize those as actual data pipelines that compute those metrics over your data.

  • They're implemented using Apache Beam.

  • And we're going to talk about this more later.

  • But what this allows us to do is actually run this distributed data pipeline

  • in different environments.

  • There are different runners for Apache Beam.

  • And all of the transforms are just simple instance-to-instance transformations

  • using pure TensorFlow code.
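
As a sketch of what running this looks like, following the pattern of the TF Transform Beam examples (preprocessing_fn is assumed to be a function like the one above; the toy data is illustrative):

```python
import tempfile
import tensorflow as tf
import tensorflow_transform.beam as tft_beam
from tensorflow_transform.tf_metadata import dataset_metadata, schema_utils

# A toy in-memory dataset with the hypothetical features used above.
raw_data = [
    {'x': 1.0, 'y': 2.0, 's': 'hello world'},
    {'x': 2.0, 'y': 3.0, 's': 'goodbye world'},
]
raw_data_metadata = dataset_metadata.DatasetMetadata(
    schema_utils.schema_from_feature_spec({
        'x': tf.io.FixedLenFeature([], tf.float32),
        'y': tf.io.FixedLenFeature([], tf.float32),
        's': tf.io.FixedLenFeature([], tf.string),
    }))

# The analyze phases run as an Apache Beam pipeline; swapping the Beam
# runner (e.g. DirectRunner locally, Dataflow in the cloud) changes where
# the distributed computation executes, not the code.
with tft_beam.Context(temp_dir=tempfile.mkdtemp()):
    transformed_dataset, transform_fn = (
        (raw_data, raw_data_metadata)
        | tft_beam.AnalyzeAndTransformDataset(preprocessing_fn))

transformed_data, transformed_metadata = transformed_dataset
# transform_fn is the hermetic TensorFlow graph with the analyzer results
# baked in as constants; it can be in-lined into training and serving.
```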

  • What happens when you run TensorFlow Transform

  • is that we actually run these analyze phases,

  • compute the results of those analyze phases,

  • and then inject the result as a constant in the TensorFlow graph--

  • so this is on the right-- and this graph

  • is a hermetic TensorFlow graph that applies all the transformations,

  • and it can be in-lined in your serving graph.

  • So now your serving graph has the transform graph

  • as part of it and can play through all of these transforms