
  • ♪ (music) ♪

  • Hello, everyone.

  • First, thanks everyone for coming to attend the Dev Summit.

  • And second, thanks for staying around this long.

  • I know it's been a very long day.

  • And there has been a lot of information that we've been throwing at you.

  • But we've got much, much more and many more announcements to come.

  • So please stick with me.

  • My name is Clemens, and this is Raz.

  • We're going to talk about TensorFlow Extended today.

  • But before we do this, I'm going to do a quick survey.

  • Can I get a quick show of hands?

  • How many of you do machine learning in a research or academic setting?

  • Okay.

  • Quite a big number.

  • Now how many of you do machine learning in a production setting?

  • Okay.

  • That looks about half-half.

  • Obviously, also a lot of overlap.

  • So for those of you who do machine learning in a production setting,

  • how many of you agree with this statement?

  • Yeah? Some? Okay.

  • I see a lot of hands coming up.

  • So everyone that I speak with who's doing machine learning

  • in production agrees with this statement:

  • "Doing machine learning in production is hard," and it's too hard.

  • Because after all, we actually want to democratize machine learning

  • and allow more and more people to deploy machine learning

  • in their products.

  • One of the main reasons why it's still hard is that the actual

  • machine learning is only a small part of the problem.

  • That's this small orange box where you actually use TensorFlow,

  • where you may use Keras to put together your layers

  • and train your model.

  • You need to worry about so much more.

  • There's all of these other things that you have to worry about

  • to actually deploy machine learning in a production setting

  • and serve it within your product.

  • Now the good news is that this is exactly

  • what TensorFlow Extended is about.

  • TFX, internally at Google, is an end-to-end machine learning

  • platform that allows our developers to go all the way from data to production

  • and serve machine learning models

  • as fast as possible.

  • Now before we introduce TFX,

  • we saw that going through this process

  • of writing some of these components (some of them didn't exist before),

  • gluing them together, and actually getting to

  • a launch took anywhere from six to nine months,

  • sometimes even a year.

  • Once we deployed TFX and allowed developers to use it,

  • in many cases, people can use this platform and get up and running

  • with it in a day and actually get to a deployable model in production

  • on the order of weeks, or just a month.

  • Now, TFX is a very large system and platform that consists

  • of a lot of components and a lot of services,

  • so unfortunately I can't talk about all of it in the next 25 minutes.

  • So we're only going to be able to cover a small part of it, but we'll talk

  • about the things that we've already open-sourced and made available to you.

  • First, we're going to talk about TensorFlow Transform

  • and show you how to apply transformations on your data

  • consistently between training and serving.

  • Next, Raz is going to introduce you to a new product that we're open sourcing

  • called TensorFlow Model Analysis.

  • We're going to give a demo of how all of this works together end to end

  • and then make a broader announcement of our plans for TensorFlow Extended

  • and sharing it with the community.

  • Let's jump into TensorFlow Transform first.

  • So, a typical ML pipeline that you may see in the wild

  • is during training,

  • you usually have a distributed data pipeline that applies transformations

  • to your data.

  • Because usually you train on a large amount of data,

  • this needs to be distributed,

  • and you run this pipeline

  • and sometimes materialize the output before you actually

  • put it into your trainer.

  • Now at serving time,

  • we need to find a way to somehow replay those exact transformations online,

  • as each new request comes in, before it is sent to your model.

  • There's a couple of challenges with this.

  • The first one is, usually those two things are very different code paths.

  • The distributed data processing systems that you would use for batch processing

  • are very different from the libraries and tools that you would use

  • to transform data in real time to make a request to your model.

  • Now we have two different code paths.

  • Second, in many cases, it's very hard to keep those two in sync.

  • I'm sure a lot of you have seen this.

  • You change your batch processing pipeline and introduce a new feature or change

  • how it behaves and you somehow need to make sure that the code

  • that you actually use in your production system is changed

  • at the same time and is kept in sync.

  • The third problem is, sometimes you actually want to deploy

  • your TensorFlow machine learning model in many different environments.

  • You want to deploy it on a mobile device; you want to deploy it on a server;

  • maybe you want to put it on a car; now suddenly you have

  • three different environments where you have to apply

  • these transformations, and maybe there's different languages

  • that you use for those, and it's also very hard

  • to keep those in sync.

  • And this introduces something that we call training/serving skew,

  • where the transformations that you do at training time may be different

  • from the ones in serving time, which usually leads to bad quality

  • of your serving model.

  • TensorFlow Transform addresses this by helping you write

  • your data processing job at training time,

  • so it actually helps you create those data pipelines to do those

  • transformations, and at the same time,

  • it emits a TensorFlow graph that can be

  • in-lined into your training model and also your serving model.

  • Now what this does is, it actually hermetically seals the model,

  • and your model takes a raw data request as input,

  • and all of the transformations are actually happening

  • within the TensorFlow graph.

  • This has a lot of advantages. One of them is that you no longer

  • have any code in your serving environment that does these

  • transformations because they're all being done in the TensorFlow graph.

  • Another one is wherever you deploy this TensorFlow model,

  • all of those transformations are applied in a consistent way,

  • no matter where this graph is being evaluated.

  • Let's see what that looks like.

  • This is a code snippet of a pre-processing function

  • that you would write with TF Transform.

  • I'm just going to walk you through what happens here

  • and what we need to do for this.

  • The first thing we do is normalize this feature.

  • As all of you know, in order to normalize a feature

  • we need to compute the mean and the standard deviation,

  • and to actually apply this transformation, we need to subtract the mean

  • and divide by the standard deviation.

  • So what has to happen is, for the input feature X,

  • we have to compute these statistics.

  • This is a trivial task if the data fits on a single machine.

  • It's a non-trivial task if you have a gigantic training data set

  • and actually have to compute these metrics efficiently.

  • Once we have these metrics we can actually apply this transformation

  • to the feature.

  • This is to show you that the output of this transformation can then be,

  • again, multiplied with another tensor--

  • which is just a regular TensorFlow transformation.

  • And then in order to bucketize a feature, you also again need to compute

  • the bucket boundaries to actually apply this transformation.

  • And again, this is a distributed data job to compute those metrics, here over

  • the result of an already transformed feature,

  • which is another benefit; we can then actually apply this transformation.

  • The next examples just show you that in the same function you can apply

  • any other tensor-in, tensor-out function, and there are also some

  • of what we call mappers in TF Transform that don't require this analyze phase.

  • So, N-grams doesn't require us to actually run a data pipeline

  • to compute anything.
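
To make this concrete, here is a minimal sketch of such a preprocessing function; the feature names (x, y, s) and the bucket count are illustrative placeholders, not from the talk:

```python
import tensorflow as tf
import tensorflow_transform as tft

def preprocessing_fn(inputs):
    """Transforms raw feature tensors into transformed features."""
    x, y, s = inputs['x'], inputs['y'], inputs['s']
    # Analyzer: a full pass over the data computes the mean and
    # standard deviation; the transform itself is (x - mean) / stddev.
    x_normalized = tft.scale_to_z_score(x)
    # The output of one transform is a regular tensor, so it can be
    # combined with other tensors using plain TensorFlow ops.
    x_times_y = x_normalized * y
    # Analyzer: another full pass computes the bucket boundaries.
    y_bucketized = tft.bucketize(y, num_buckets=10)
    # Mapper: no analyze phase needed; pure TensorFlow ops such as
    # string manipulation (or tft.ngrams) run instance-to-instance.
    s_lower = tf.strings.lower(s)
    return {
        'x_normalized': x_normalized,
        'x_times_y': x_times_y,
        'y_bucketized': y_bucketized,
        's_lower': s_lower,
    }
```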

  • Now what happens here is that these orange boxes

  • are what we call analyzers.

  • We realize those as actual data pipelines that compute those metrics over your data.

  • They're implemented using Apache Beam.

  • And we're going to talk about this more later.

  • But what this allows us to do is actually run this distributed data pipeline

  • in different environments.

  • There's different runners for Apache Beam.

  • And all of the transforms are just simple instance to instance transformations

  • using pure TensorFlow code.

  • What happens when you run TensorFlow Transform

  • is that we actually run these analyze phases,

  • compute the results of those analyze phases,

  • and then inject the result as a constant in the TensorFlow graph--

  • so this is on the right-- and this graph

  • is a hermetic TensorFlow graph that applies all the transformations,

  • and it can be in-lined into your serving graph.

  • So now your serving graph has the transform graph

  • as part of it and can play through all of these transforms

  • wherever you want to deploy this TensorFlow model.
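
Here is a sketch of how that run might be wired up with Apache Beam, reusing the preprocessing_fn sketched above; the schema, paths, and in-memory example data are placeholder assumptions:

```python
import apache_beam as beam
import tensorflow as tf
import tensorflow_transform.beam as tft_beam
from tensorflow_transform.tf_metadata import dataset_metadata, schema_utils

# Schema for the raw data (assumed, for illustration).
raw_metadata = dataset_metadata.DatasetMetadata(
    schema_utils.schema_from_feature_spec({
        'x': tf.io.FixedLenFeature([], tf.float32),
        'y': tf.io.FixedLenFeature([], tf.float32),
        's': tf.io.FixedLenFeature([], tf.string),
    }))

with beam.Pipeline() as pipeline:
    with tft_beam.Context(temp_dir='/tmp/tft_tmp'):
        raw_data = pipeline | beam.Create([
            {'x': 1.0, 'y': 2.0, 's': 'hello'},
            {'x': 2.0, 'y': 3.0, 's': 'world'},
        ])
        # Run the analyze phases, inject their results as constants,
        # and transform the data in one pass.
        (transformed_data, transformed_metadata), transform_fn = (
            (raw_data, raw_metadata)
            | tft_beam.AnalyzeAndTransformDataset(preprocessing_fn))
        # transform_fn is the hermetic TensorFlow graph that can be
        # in-lined into the training and serving graphs.
        _ = transform_fn | tft_beam.WriteTransformFn('/tmp/transform_output')
```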

  • What can be done with TensorFlow Transform?

  • At training time for the batch processing, really anything that you can do

  • with a distributed data pipeline.

  • So there's a lot of flexibility here with types of statistics you can compute.

  • We provide a lot of utility functions for you,

  • but you can also write custom data pipelines.

  • And at serving time because we generate a TensorFlow graph that applies

  • these transformations-- we're limited to what you can do

  • with a TensorFlow graph, but for all of you who know TensorFlow,

  • there's a lot of flexibility in there as well.

  • Anything that you can do in a TensorFlow graph,

  • you can do with your transformations.

  • Some of the common use cases that we've seen, the ones on the left

  • I just spoke about: you can scale a continuous value to its z-score,

  • which is normalization, or to a value between 0 and 1.

  • You can bucketize a continuous value.

  • If you have text features, you can apply Bag of Words or N-grams,

  • or for feature crosses, you can actually cross

  • those strings and then generate vocabs of the result of those crosses.

  • As mentioned before, TF Transform is extremely powerful

  • in actually being able to chain together these transforms so you can apply

  • a transform on the result of a transform and so on.

  • Another particularly interesting transform is actually applying

  • another TensorFlow model.

  • You've heard about the saved model before?

  • If you have a saved model that you can apply as a transformation,

  • you can use it in TF Transform.

  • Let's say you have an image and you want to apply

  • an Inception model as a transform and then use the output of that

  • Inception model, maybe to combine it with some other feature

  • or use it as an input feature to your model.

  • You can use any other TensorFlow model

  • that ends up being in-lined in your transform graph

  • and also in-lined in your serving graph.
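
A hedged sketch of that pattern using tft.apply_saved_model; the model path, tag, and feature names are placeholders:

```python
import tensorflow_transform as tft

def preprocessing_fn(inputs):
    # Apply a pre-trained saved model as a transform; its output
    # becomes an input feature of the model being trained.
    embedding = tft.apply_saved_model(
        model_dir='/path/to/pretrained_saved_model',  # placeholder
        inputs={'images': inputs['image']},
        tags=['serve'])
    return {'image_embedding': embedding}
```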

  • All of this is available today and you can go check it out

  • on github.com/tensorflow/transform.

  • With this I'm going to hand it over to Raz who's going to talk

  • about TensorFlow Model Analysis.

  • Alright, thanks Clemens.

  • Hi, everyone.

  • I'm really excited to talk about

  • TensorFlow Model Analysis today.

  • We're going to talk a little bit about metrics.

  • Let's see, next slide.

  • Alright, so we can already get metrics today, right?

  • We use TensorBoard. TensorBoard's awesome.

  • You saw an earlier presentation today about TensorBoard.

  • It's a great tool-- while you're training,

  • you can watch your metrics, right?

  • If your training isn't going well, you can save yourself

  • a couple of hours of your life, right?

  • Terminate the training, fix some things...

  • Let's say you have your trained model already.

  • Are we done with metrics? Is that it?

  • Is there any more to be said about metrics after we're done training?

  • Well, of course, there is.

  • We want to know how well our trained model actually does

  • for our target population.

  • I would argue that we want to do this in a distributed fashion

  • over the entire data set.

  • Why wouldn't we just sample?

  • Why wouldn't we just save more hours of our lives, right?

  • And just sample, make things fast and easy.

  • Let's say you start with a large data set.

  • Now you're going to slice that data set.

  • You're going to say, "I'm going to look at people at noon time."

  • Right? That's a feature.

  • From Chicago, my hometown.

  • Running on this particular device.

  • Each of these slices reduces the size

  • of your evaluation dataset by a factor.

  • This is an exponential decline.

  • By the time you're looking at the experience for a particular

  • set of users, you're not left with very much data.

  • And the error bars on your performance measures, they're huge.

  • I mean, how do you know that the noise doesn't exceed your signal

  • by that point, right?

  • So really you want to start with your larger dataset

  • before you start slicing.

  • Let's talk about a particular metric.

  • I'm not sure--

  • Who's heard of the ROC Curve?

  • It's kind of an unknown thing in machine learning these days.

  • Okay.

  • We have our ROC Curve, and I'm going to talk about a concept

  • that you may or may not be familiar with

  • which is ML Fairness.

  • So what is fairness?

  • Fairness is a complicated topic.

  • Fairness is basically how well our machine learning model does

  • for different segments of our population, okay?

  • You don't just have one ROC Curve,

  • you have an ROC Curve for every segment.

  • You have an ROC Curve for every group of users.

  • Who here would run their business

  • based on their top line metrics?

  • No one! Right? That's crazy.

  • You have to slice your metrics; you have to go in and dive in

  • and find out how things are going. So that lucky user,

  • the black curve on the top: great experience.

  • That unlucky user, the blue curve?

  • Not such a great experience.

  • When can our models be unfair to various users?

  • One instance is if you simply don't have a lot of data

  • from which to draw your inferences.

  • Right?

  • We use stochastic optimizers,

  • and if we re-train the model,

  • it does something slightly different every time.

  • You're going to get a high variance for some users just because

  • you don't have a lot of data there.

  • We may be incorporating data from multiple data sources.

  • Some data sources are more biased than others.

  • So some users just get the short end of the deal, right?

  • Whereas other users get the ideal experience.

  • Our labels could be wrong. Right?

  • All of these things can happen.

  • Here's TensorFlow Model Analysis.

  • You're looking here at the UI hosted within a Jupyter Notebook.

  • On the X-axis, we have our loss.

  • You can see there's some natural variance in the metrics.

  • We're not always going to get exactly the same precision

  • and recall for every segment of the population.

  • But sometimes you'll see... what about those guys

  • at the top there experiencing the highest amount of loss?

  • Do they have something in common?

  • We want to know this.

  • Sometimes the users that get the poorest experience

  • are our most vocal users, right?

  • We all know this.

  • I'd like to invite you to come visit ml-fairness.com.

  • There's a deep literature about

  • the mathematical side of ML Fairness.

  • Once you've figured out how to measure fairness,

  • there's a deep literature about what to do about it.

  • How does TensorFlow Model Analysis actually give you these sliced metrics?

  • How do you go about getting these metrics?

  • Today you export a saved model for serving.

  • It's kind of a familiar thing.

  • TensorFlow Model Analysis is simple,

  • and it's similar:

  • you export a saved model for evaluation.

  • Why are these models different? Why export two?

  • Well the eval graph that we serialize as a saved model

  • has some additional annotations

  • that allow our evaluation batch job

  • to find the features, to find the prediction, to find the label.

  • We don't want those things mixed in with our serving graph,

  • so you export a second one.
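
A sketch of the two exports, following the early TFMA API; the estimator and the two receiver functions are assumed to be defined as in the example:

```python
import tensorflow_model_analysis as tfma

# Export the serving graph as usual...
estimator.export_savedmodel('/tmp/serving_model', serving_input_receiver_fn)

# ...and additionally export the annotated eval graph for TFMA.
tfma.export.export_eval_savedmodel(
    estimator=estimator,
    export_dir_base='/tmp/eval_model',
    eval_input_receiver_fn=eval_input_receiver_fn)
```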

  • So this is the GitHub.

  • We just opened it up, I think last night at 4:30 pm.

  • Check it out.

  • We've been using it internally for quite some time now.

  • Now it's available externally as well.

  • The GitHub has an example

  • that kind of puts it all together

  • so that you can try all these components that we're talking about

  • from your local machine.

  • You don't have to get an account anywhere.

  • You just git clone it, run the scripts,

  • and run the codelab.

  • This is the Chicago Taxi Example.

  • So we're using public data from-- publicly available data

  • to determine which riders will tip their driver

  • and which riders, shall we say,

  • don't have enough money to tip today.

  • What does fairness mean in this context?

  • So our model is going to make some predictions.

  • We may want to slice these predictions by time of day.

  • During rush hour we're going to have a lot of data, so hopefully

  • our model's going to be fair if that data is not biased.

  • At the very least it's not going to have a lot of variance.

  • But how's it going to do at 4 a.m.?

  • Maybe not so well.

  • How's it going to do when the bars close?

  • An interesting question.

  • I don't know yet, but I challenge you to find out.

  • So this is what you can run using your local scripts.

  • We start with our raw data.

  • We run TF Transform; TF Transform emits

  • a transform function and our transformed examples.

  • We train our model.

  • Our model, again, emits two saved models as we talked about.

  • One for serving and one for eval.

  • And we try this all locally, just run scripts and play with the stuff.

  • Clemens talked a little bit about transform.

  • Here we see that we want to take our dense features,

  • and we want to scale them to their z-scores.

  • And we don't want to do that batch by batch

  • because the mean for each batch is going to differ,

  • and there's going to be fluctuations.

  • We may want to do that across the entire data set.

  • We may want to normalize these things across the entire data set.

  • We build a vocabulary; we bucketize for the wide part of our model,

  • and we emit our transform function, and into the trainer we go.

  • You heard earlier today about TF Estimators,

  • and here is a wide and deep estimator

  • that takes our transformed features

  • and emits two saved models.
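
A sketch of such an estimator; the feature columns and hidden-unit sizes are illustrative, not the example's actual configuration:

```python
import tensorflow as tf

# Wide (linear) part gets sparse/bucketized columns; deep part gets
# dense columns. Both lists are placeholders here.
wide_columns = [
    tf.feature_column.categorical_column_with_identity(
        'y_bucketized', num_buckets=10),
]
deep_columns = [
    tf.feature_column.numeric_column('x_normalized'),
]

estimator = tf.estimator.DNNLinearCombinedClassifier(
    linear_feature_columns=wide_columns,
    dnn_feature_columns=deep_columns,
    dnn_hidden_units=[100, 70, 50, 25])

# train_input_fn (reading the transformed examples) is assumed:
# estimator.train(input_fn=train_input_fn)
```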

  • Now we're in TensorFlow Model Analysis,

  • which reads in the saved model

  • and runs it against all of the raw data.

  • We call render_slicing_metrics from the Jupyter Notebook,

  • and you see the UI.
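
A sketch of that evaluation, following the early TFMA API (names have shifted in later versions); the paths are placeholders, and trip_start_hour is one of the taxi features:

```python
import tensorflow_model_analysis as tfma

# Evaluate the eval saved model over the raw data, sliced by hour.
result = tfma.run_model_analysis(
    model_location='/tmp/eval_model/1523912712',  # placeholder version dir
    data_location='/tmp/eval_data.tfrecord',
    slice_spec=[tfma.SingleSliceSpec(columns=['trip_start_hour'])])

# In a Jupyter notebook this renders the interactive slicing UI.
tfma.view.render_slicing_metrics(result, slicing_column='trip_start_hour')
```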

  • The thing to notice here is that this UI is immersive, right?

  • It's not just a static picture that you can look at and go,

  • "Huh" and then walk away from.

  • It lets you see your errors broken down

  • by bucket or broken down by feature,

  • and it lets you drill in and ask questions

  • and be curious about how your models are actually treating various subsets

  • of your population.

  • Those subsets may be the lucrative subsets

  • you really want to drill into.

  • And then you want to serve your models so our demo--

  • our example has a one-liner here

  • that you can run to serve your model.

  • Make a client request--

  • the thing to notice here is that we're making

  • a gRPC request to that server.

  • We're taking our feature tensors, we're serializing them

  • into the gRPC request, sending them to the server,

  • and back comes a probability.
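
A minimal client sketch against TensorFlow Serving's gRPC API; the host, port, model name, and feature values are placeholders:

```python
import grpc
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2_grpc

# Build a serialized tf.Example carrying the raw features
# (the feature names here are placeholders).
example = tf.train.Example(features=tf.train.Features(feature={
    'trip_start_hour': tf.train.Feature(
        int64_list=tf.train.Int64List(value=[12])),
}))

channel = grpc.insecure_channel('localhost:9000')
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

request = predict_pb2.PredictRequest()
request.model_spec.name = 'chicago_taxi'
# Serialize the feature tensors into the gRPC request.
request.inputs['examples'].CopyFrom(
    tf.make_tensor_proto([example.SerializeToString()], shape=[1]))

# Back comes a probability in the response tensors.
response = stub.Predict(request, 10.0)  # 10-second timeout
print(response.outputs)
```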

  • But that's not quite enough, right?

  • We've heard a little bit of feedback about this server.

  • The thing that we've heard is that gRPC is cool,

  • but REST is really cool.

  • I tried.

  • This is actually one of the top feature requests

  • on GitHub for model serving.

  • You can now pack your tensors into a JSON object,

  • send that JSON object to the server

  • and get a response back to [inaudible].
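
A sketch of what such a request could look like against TensorFlow Serving's REST endpoint; the port, model name, and features are placeholders:

```python
import json
import requests  # third-party HTTP client, assumed available

# Pack the input tensors into a JSON object and POST it.
payload = {'instances': [{'trip_start_hour': 12, 'trip_miles': 2.5}]}
response = requests.post(
    'http://localhost:8501/v1/models/chicago_taxi:predict',
    data=json.dumps(payload))
print(response.json())  # e.g. {'predictions': [...]}
```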

  • Much more convenient, and I'm very excited to say

  • that it'll be released very soon.

  • Very soon.

  • I see the excitement out there.

  • Back to the end to end.

  • You can try all of these pieces end to end all on your local machine.

  • That's because they're using Apache Beam's direct runner, which

  • allows you to take your distributed jobs and run them all locally.

  • Now if you swap in Apache Beam's Dataflow runner,

  • you can now run against the entire data set in the cloud.

  • The example also shows you how to run the big job

  • against the cloud version as well.
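
A sketch of that runner swap via Beam pipeline options; the project, bucket, and region are placeholders:

```python
from apache_beam.options.pipeline_options import PipelineOptions

# Local run: the direct runner executes the distributed job in-process.
local_options = PipelineOptions(runner='DirectRunner')

# Cloud run: the same pipeline, now distributed on Dataflow.
cloud_options = PipelineOptions(
    runner='DataflowRunner',
    project='my-gcp-project',
    temp_location='gs://my-bucket/tmp',
    region='us-central1')

# beam.Pipeline(options=local_options) or
# beam.Pipeline(options=cloud_options) then runs the same steps.
```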

  • We're currently working with the community to develop

  • a runner for Apache Flink and a runner for Spark.

  • Stay tuned to the TensorFlow blog

  • and to our GitHub,

  • and you can find the example at tensorflow/model-analysis.

  • And back to Clemens.

  • Thank you, Raz.

  • (applause)

  • Alright, so we've heard about Transform.

  • We've heard how to train models, how to use model analysis

  • and how to serve them.

  • But I hear you say you want more.

  • Right? Is that enough?

  • You want more? Alright.

  • You want more.

  • And I can think of why you want more.

  • Maybe you read the paper we published last year and presented

  • at KDD about TensorFlow Extended.

  • In this paper we laid out this broad vision of how

  • this platform works within Google and all of the features that it has

  • and all the impact that we have by using it.

  • Figure one, with all of these boxes, describes

  • what TensorFlow Extended actually is.

  • Although overly simplified, this is still much more

  • than we've discussed today.

  • Today, we spoke about these four components

  • of TensorFlow Extended.

  • Now it's important to highlight that this is not yet an end-to-end

  • machine learning platform.

  • This is just a very small piece of TFX.

  • These are the libraries that we've open-sourced

  • for you to use.

  • But we haven't yet released the entire platform.

  • We're working very hard on this because we've seen

  • the profound impact that it had internally--

  • how people could start using this platform

  • to apply machine learning in production.

  • And we've been working very hard to actually make

  • more of these components available to you.

  • So in the next phase, we're actually looking into our data components

  • and looking to make those available to users

  • so that you can analyze your data, visualize the distributions,

  • and detect anomalies, because it's an important part

  • of any machine learning pipeline

  • to detect changes, shifts, and anomalies in your data.

  • After this we're actually looking into some of the horizontal pieces

  • that help tie all of these components together,

  • because if they're only single libraries, you still have

  • to glue them together yourself.

  • You still have to use them individually.

  • They have well-defined interfaces, but you still have to combine them

  • by yourself.

  • Internally we have a shared configuration framework that allows you

  • to configure the entire pipeline, and a nice integrated frontend

  • that allows you to monitor the status of these pipelines

  • and see progress and inspect the different artifacts

  • that have been produced by all of the components.

  • So this is something that we're also looking to release

  • later this year.

  • And I think you get the idea.

  • Eventually we want to make all of this available to the community

  • because internally, hundreds of teams use this

  • to improve our products.

  • We really believe that this will be as transformative

  • to the community as it is at Google.

  • And we're working very hard to release more of these technologies,

  • and eventually the entire platform, to see what you can do

  • with them for your products and for your companies.

  • Keep watching the TensorFlow blog for a more detailed announcement

  • about TFX and our future plans.

  • And as mentioned, you can already use

  • some of these components today.

  • Transform is released.

  • Model Analysis was just released yesterday,

  • Serving is also released,

  • and the end-to-end example is available

  • under the shortlink and you can find it on the model analysis [inaudible].

  • So with this, thank you from both myself and Raz,

  • and I'm going to ask you to join me in welcoming

  • a special external guest, Patrick Brand, who's joining us from Coca-Cola,

  • who's going to talk about applied AI at Coca-Cola.

  • Thank you.

  • (applause)

  • ♪ (music) ♪
