ROBERT CROWE: I'm Robert Crowe.
And we are here today to talk about production pipelines, ML
pipelines.
So we're not going to be talking about ML modeling
too much or different architectures.
This is really all focused about when you have a model
and you want to put it into production so that you can
offer a product or a service or some internal service
within your company, and it's something
that you need to maintain over the lifetime
of that deployment.
So normally when we think about ML,
we think about modeling code, because it's
the heart of what we do.
Modeling and the results that we get from the amazing models
that we're producing these days, that's
the reason we're all here, the results we can produce.
It's what papers are written about, for the most part,
overwhelmingly.
The majority are written about architectures and results
and different approaches to doing ML.
It's great stuff.
I love it.
I'm sure you do too.
But when you move to putting something into production,
you discover that there are a lot of other pieces
that are very important to making that model that you
spent a lot of time putting together
available and robust over the lifetime of a product
or a service that you're going to offer out to the world
so that they can experience really
the benefits of the model that you've worked on.
And those pieces are what TFX is all about.
In machine learning, we're familiar with a lot
of the issues that we have to deal with,
things like: where do I get labeled data?
How do I generate labels for the data that I have?
I may have terabytes of data, but I need labels for it.
Does my labeled data cover the feature space
that I'm going to see when I actually
run inference against it?
Is my dimensionality-- is it minimized?
Or can I do more to try to simplify
my set, my feature vector, to make my model more efficient?
Have I got really the predictive information in the data
that I'm choosing?
And then we need to think about fairness as well.
Are we serving all of the customers
that we're trying to serve fairly, no matter where they
are, what religion they are, what language they speak,
or what demographic they might be? Because you
want to serve those people as well as you can.
You don't want to unfairly disadvantage people.
And we may have rare conditions too, especially in things
like health care where we're making
a prediction that's going to be pretty important
to someone's life.
And it may be based on a condition that occurs very rarely.
But a big one when you go into production
is understanding the data lifecycle.
Because once you've gone through that initial training
and you've put something into production,
that's just the start of the process.
You're now going to try to maintain that over a lifetime,
and the world changes.
Your data changes.
Conditions in your domain change.
Along with that, you're doing now production software
deployment.
So you have all of the normal things
that you have to deal with any software deployment, things
like scalability.
Will I need to scale up?
Is my solution ready to do that?
Can I extend it?
Is it something that I can build on?
Modularity, best practices, testability.
How do I test an ML solution?
And security and safety, because we
know there are attacks for ML models
that are getting pretty sophisticated these days.
Google created TFX for us to use.
We created it because we needed it.
It was not the first production ML framework that we developed.
We've actually learned over many years
because we have ML all over Google
taking in billions of inference requests
really on a planet scale.
And we needed something that would
be maintainable and usable at a very large production
scale with large data sets and large loads over a lifetime.
So TFX has evolved from earlier attempts.
And it is now what most of the products and services at Google
use.
And now we're also making it available to the world
as an open-source product available to you now
to use for your production deployments.
It's also used by several of our partners
and just companies that have adopted TFX.
You may have heard talks from some of these at the conference
already.
And there's a nice quote there from Twitter,
where they did an evaluation.
They were coming from a Torch-based environment,
looked at the whole suite or the whole ecosystem of TensorFlow,
and moved everything that they did to TensorFlow.
One of the big contributors to that
was the availability of TFX.
The vision is to provide a platform for everyone to use.
Along with that, there's some best practices and approaches
that we're trying to really make popular in the world, things
like strongly-typed artifacts so that when
your different components produce artifacts
they have a strong type.
Pipeline configuration, workflow execution,
being able to deploy on different platforms,
different distributed pipeline platforms using
different orchestrators, different underlying execution
engines--
trying to make that as flexible as possible.
There are some horizontal layers that
tie together the different components in TFX.
And we'll talk about components here in a little bit.
And we have a demo as well that will show you some of the code
and some of the components that we're talking about.
The horizontal layers-- an important one there is metadata
storage.
So each of the components produce and consume artifacts.
You want to be able to store those.
And you may want to do comparisons across months
or years to see how did things change, because change becomes
a central theme of what you're going to do in a production
deployment.
This is a conceptual look at the different parts of TFX.
On the top, we have tasks--
a conceptual look at tasks.
So things like ingesting data or training a model
or serving the model.
Below that, we have libraries that are available, again,
as open-source components that you can leverage.
They're leveraged by the components within TFX
to do much of what they do.
And on the bottom row in orange, and a good color for Halloween,
we have the TFX components.
And we're going to get into some detail about how your data will
flow through the TFX pipeline to go from ingesting data
to a finished trained model on the other side.
So what is a component?
A component has three parts.
This is a particular component, but it could be any of them.
Two of those parts, the driver and publisher,
are largely boilerplate code that you could change.
You probably won't.
A driver consumes artifacts and begins the execution
of your component.
A publisher takes the output from the component,
puts it back into metadata.
The executor is really where the work is
done in each of the components.
And that's also a part that you can change.
So you can take an existing component,
override the executor in it, and produce
a completely different component that
does completely different processing.
Each of the components has a configuration.
And for TFX, that configuration is written in Python.
And it's usually fairly simple.
Some of the components are a little more complex.
But most of them are just a couple of lines of code
to configure.
The key essential aspect here that I've alluded to
is that there is a metadata store.
The component will pull data from that store
as it becomes available.
So there's a set of dependencies that determine which artifacts
that component depends on.
It'll do whatever it's going to do.
And it's going to write the result back into metadata.
Over the lifetime of a model deployment,
you start to build a metadata store that
is a record of the entire lifetime of your model.
And the way that your data has changed,
the way your model has changed, the way your metrics have
changed, it becomes a very powerful tool.
Components communicate through the metadata store.
So an initial component will produce an artifact,
put it in the metadata store.
The components that depend on that artifact
will then read from the metadata store
and do whatever they're going to do,
and put their result into it, and so on.
And that's how we flow through the pipeline.
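To make that flow concrete, here is a minimal sketch in plain Python. This is illustrative only, not the real TFX or ML Metadata API; the class and artifact names are invented:

```python
# Conceptual sketch of components communicating through a metadata
# store -- illustrative only, not the actual TFX / ML Metadata API.

class MetadataStore:
    """Maps artifact names to (value, producing component) records."""
    def __init__(self):
        self.artifacts = {}

    def publish(self, name, value, producer):
        self.artifacts[name] = {"value": value, "producer": producer}

    def fetch(self, name):
        return self.artifacts[name]["value"]

store = MetadataStore()

# An upstream component publishes an artifact...
store.publish("examples", ["row1", "row2"], producer="ExampleGen")

# ...and a downstream component that depends on that artifact reads
# it from the store, does its work, and publishes its own result.
examples = store.fetch("examples")
store.publish("statistics", {"num_examples": len(examples)},
              producer="StatisticsGen")

print(store.fetch("statistics"))  # {'num_examples': 2}
```

Because every record also carries its producer, this same structure is what makes lineage queries possible later.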
So the metadata store I keep talking about.
What is it?
What does it contain?
There are really three kinds of things that it contains.
The first is the artifacts themselves.
They could be trained models, they could be data sets,
they could be metrics, they could be splits.
There's a number of different types of objects
that are in the metadata store.
Those are grouped into execution records.
So when you execute the pipeline,
that becomes an execution run.
And the artifacts that are associated with that run
are grouped under that execution run.
So again, when you're trying to analyze what's
been happening with your pipeline,
that becomes very important.
Also, the lineage of those artifacts--
so which artifact was produced by which component,
which consumed which inputs, and so on.
So that gives us some functionality
that becomes very powerful over the lifetime of a model.
You can find out which data a model
was trained on, for example.
If you're comparing the results of two different model
trainings that you've done, tracing it
back to how the data changed can be really important.
And we have some tools that allow you to do that.
So TensorBoard for example will allow
you to compare the metrics from say a model that you trained
six months ago and the model that you just trained
now to try to understand.
I mean, you could see that it was different, but why--
why was it different?
And warm-starting becomes very powerful too,
especially when you're dealing with large amounts of data
that could take hours or days to process.
Being able to pull that data from cache
when the inputs haven't changed, rather than rerunning
that component every time, becomes a very powerful tool
as well.
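The caching idea can be sketched like this. This is a hypothetical fingerprint-based cache, not TFX's actual implementation:

```python
# Sketch of input-based caching: re-run a component only when the
# fingerprint of its inputs changes. Hypothetical, not TFX internals.
import hashlib
import json

cache = {}
executions = []  # records which components actually ran

def run_component(name, fn, inputs):
    """Run `fn` on `inputs`, skipping execution on a cache hit."""
    fingerprint = hashlib.sha256(
        json.dumps(inputs, sort_keys=True).encode()).hexdigest()
    key = (name, fingerprint)
    if key in cache:
        return cache[key]        # inputs unchanged: reuse the artifact
    executions.append(name)      # inputs changed: actually execute
    result = fn(inputs)
    cache[key] = result
    return result

data = {"rows": [1, 2, 3]}
run_component("StatisticsGen", lambda d: sum(d["rows"]), data)
run_component("StatisticsGen", lambda d: sum(d["rows"]), data)  # cached
print(executions)  # ['StatisticsGen'] -- the second call was a cache hit
```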
So there's a set of standard components
that are shipped with TFX.
But I want you to be aware from the start
that you are not limited to those standard components.
This is a good place to start.
It'll get you pretty far down the road.
But you will probably have needs-- you may or may not--
where you need to extend the components that are available.
And you can do that.
You can do that in a couple of different ways.
This is sort of the canonical pipeline that we talk about.
So on the left, we're ingesting our data.
We flow through, we split our data,
we calculate some statistics against it.
And we'll talk about this in some detail.
We then make sure that we don't have problems with our data,
and try to understand what types our features are.
We do some feature engineering, we train.
This probably sounds familiar.
If you've ever been through an ML development process,
this is mirroring exactly what you always do.
Then you're going to check your metrics across that.
And we do some deep analysis of the metrics of our model
because that becomes very important.
And we'll talk about an example of that in a little bit.
And then you have a decision because let's assume
you already have a model in production
and you're retraining it.
Or maybe you have a new model that you're training
and the question becomes, should I
push this new model to production,
or is the one I already have better.
Many of you have probably had the experience,
if you trained a new model and it actually didn't do
as well as the old one did.
Along with that, we also have the ability
to do bulk inference on inference requests.
So you may be in a batch request environment.
So you're pulling in data in batches
and you're running requests against it,
and then taking that result and doing something
with it. That's a very common use case.
This is actually kind of new.
We have components to do that as well.
This is the Python framework for defining a pipeline.
So that's a particular component.
It's the transform component and the configuration
that you need to use for that.
But on a very simple level, that's
how you set up components.
And in the bottom, you can see there's a list of components
that are returned in that call.
Those are going to be passed to a runner that
runs on top of whatever orchestrator you're using.
So a little bit more complex example,
a little bit harder to read.
Gives you an idea there's several components there.
The dependencies between components and the dependencies
and artifacts that I just talked about
are defined in code like that.
So you'll see, for example, that StatisticsGen depends
on the output of ExampleGen.
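That dependency structure can be sketched in plain Python. The component names mirror TFX's standard components, but this is not the TFX pipeline DSL; it just shows how declared dependencies imply an execution order:

```python
# Sketch: component dependencies form a graph, and the graph implies
# the order of execution. Not the TFX DSL -- a plain-Python analogy.
from graphlib import TopologicalSorter

# Each component lists the components whose outputs it consumes,
# mirroring e.g. StatisticsGen depending on ExampleGen's output.
deps = {
    "ExampleGen": set(),
    "StatisticsGen": {"ExampleGen"},
    "SchemaGen": {"StatisticsGen"},
    "ExampleValidator": {"StatisticsGen", "SchemaGen"},
    "Transform": {"ExampleGen", "SchemaGen"},
    "Trainer": {"Transform"},
}

# A valid execution order always runs producers before consumers.
order = list(TopologicalSorter(deps).static_order())
print(order[0])  # ExampleGen -- nothing depends on it, so it runs first
```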
So now let's talk about each of the standard components.
ExampleGen is where you ingest your data.
And it is going to take your data,
it's going to convert it to TensorFlow examples.
This is actually showing just two input formats,
but there's a reasonably long list of input formats
that you can have.
And it'll also do your splits for you.
So you may want just training and eval,
or maybe you want a validation split as well.
So that's what ExampleGen does.
And then it passes the result onto StatisticsGen.
StatisticsGen, because we all work with data,
we know you need to dive into the data
and make sure that you understand the characteristics
of your data set.
Well, StatisticsGen is all about doing that in an environment
where you may be running that many times a day.
It also gives you tools to do visualization of your data
like this.
So for example, that's the trip start hour feature
for this particular data set.
And just looking at that, just looking at the histogram,
tells me a lot about an area that I need to focus on.
The 6 o'clock hour, I have very little data.
So I want to go out and get some more data.
Because if I try to use that and I run inference requests
at 6:00 AM, it's going to be overgeneralizing.
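A sketch of that kind of check, with invented data standing in for the trip start hour feature:

```python
# Sketch: spotting an under-represented feature bucket, like the
# sparse 6 o'clock hour in the histogram. The data here is made up.
from collections import Counter

trip_start_hours = [8, 8, 9, 17, 17, 17, 18, 6, 9, 8, 18, 17]
counts = Counter(trip_start_hours)
total = len(trip_start_hours)

# Flag any hour holding less than 10% of the data as too sparse to
# trust at inference time (the threshold is an arbitrary choice).
sparse_hours = [h for h, c in counts.items() if c / total < 0.10]
print(sparse_hours)  # [6] -- we'd want to get more data for that hour
```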
So I don't want that.
SchemaGen is looking at the types of your features.
So it's trying to decide is it a float, is it an int, is it
a categorical feature.
And if it's a categorical feature,
what are the valid categories?
So SchemaGen tries to infer that.
But you as a data scientist need to make sure
that it did the job correctly.
So you need to review that and make any fixes
that you need to make.
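Conceptually, schema inference looks something like this sketch. It is not SchemaGen's real logic, just an illustration of guessing a type and domain per feature from observed values:

```python
# Sketch of schema inference: guess a type, and for categorical
# features a domain, from the values seen so far. Illustrative only.
def infer_schema(column):
    if all(isinstance(v, float) for v in column):
        return {"type": "FLOAT"}
    if all(isinstance(v, int) for v in column):
        return {"type": "INT"}
    # Treat strings as categorical; the domain is the values seen so far.
    return {"type": "STRING", "domain": sorted(set(column))}

schema = {
    "fare": infer_schema([5.25, 12.0, 7.5]),
    "payment_type": infer_schema(["Cash", "Credit", "Cash"]),
}
print(schema["payment_type"]["domain"])  # ['Cash', 'Credit']
```

As the talk notes, an inferred schema is only a starting point: a human still has to review it, because the values seen so far may not be the full valid domain.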
ExampleValidator then takes those two results,
the SchemaGen and StatisticsGen, and it looks
for problems with your data.
So it's going to look for things like missing values,
values that are zero and shouldn't
be zero, categorical values that are really
outside the domain of that category,
things like that, problems in your data.
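Those checks can be sketched as follows, validating examples against a schema's domains. This is an illustration of the idea, not ExampleValidator's implementation:

```python
# Sketch of anomaly detection against a schema: missing values and
# categorical values outside the domain. Illustrative only.
schema = {"payment_type": {"Cash", "Credit"}}

examples = [
    {"payment_type": "Cash", "fare": 5.25},
    {"payment_type": "Bitcoin", "fare": 7.5},   # outside the domain
    {"payment_type": None, "fare": 12.0},       # missing value
]

anomalies = []
for i, example in enumerate(examples):
    value = example["payment_type"]
    if value is None:
        anomalies.append((i, "missing value"))
    elif value not in schema["payment_type"]:
        anomalies.append((i, "out of domain"))

print(anomalies)  # [(1, 'out of domain'), (2, 'missing value')]
```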
Transform is where we do feature engineering.
And Transform is one of the more complex components.
As you can see from the code there,
that could be arbitrarily complex
because, depending on the needs of your data
set and your model, you may have a lot of feature engineering
that you need to do, or you may just have a little bit.
The configuration for it is fairly standard.
Just one line of code there with some configuration parameters.
But it has a key advantage in that it's
going to take your feature engineering
and it's going to convert it into a TensorFlow graph.
That graph then gets prepended to the model
that you're training as the input stage to your model.
And what that does is it means you're doing the same feature
engineering with the same code exactly the same way,
both in training and in production
when you deploy to any of the deployment targets.
So that eliminates the possibility
that you may have run into where you
have two different environments, maybe even
two different languages and you're
trying to do the same thing in both places
and you hope it's correct.
This eliminates that.
We call it training serving skew.
It eliminates that possibility.
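The core idea can be sketched in plain Python: one feature-engineering function shared by both paths, so they cannot diverge. (TFX actually compiles the transform into a TensorFlow graph prepended to the model; this sketch only shows why sharing the code eliminates skew.)

```python
# Sketch of avoiding training/serving skew: a single preprocessing
# function used by both the training path and the serving path.
def preprocess(raw):
    # Example transforms: scale fare by a known max, bucket the hour.
    # The feature names and constants here are invented.
    return {
        "fare_scaled": raw["fare"] / 100.0,
        "hour_bucket": raw["hour"] // 6,
    }

def train_step(raw_example):
    return preprocess(raw_example)       # training path

def serve_request(raw_example):
    return preprocess(raw_example)       # serving path -- same code

example = {"fare": 25.0, "hour": 17}
assert train_step(example) == serve_request(example)  # no skew possible
print(train_step(example))  # {'fare_scaled': 0.25, 'hour_bucket': 2}
```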
Trainer.
Well, now we're coming back to the start.
Trainer does what we started with.
It's going to train a model for us.
So this is TensorFlow.
And the result is going to be a SavedModel,
and a little variant of the SavedModel, the eval SavedModel,
that we're going to use.
It has a little extra information that we're
going to use for evaluation.
So Trainer has the typical kinds of configuration
that you might expect, things like the number of steps,
whether or not to use warm-starting.
And you can use TensorBoard, including
comparing execution runs between the model that you just trained
and models that you've trained in the past at some time.
So TensorBoard has a lot of very powerful tools
to help you understand your training process
and the performance of your model.
So here's an example where we're comparing
two different execution runs.
Evaluator uses TensorFlow Model Analysis, one of the libraries
that we talked about at the beginning,
to do some deep analysis of the performance of your model.
So it's not just looking at the top level metrics,
like what is the RMSE or the AUC for my whole data set.
It's looking at individual slices of your data set
and slices of the features within your data set
to really dive in at a deeper level
and understand the performance.
So that things like fairness become
very manageable by doing that.
If you don't do that sort of analysis,
you can easily have gaps that may
be catastrophic in the performance of your model.
So this becomes a very powerful tool.
And there's some visualization tools
that we'll look at as well that help you do that.
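Here is a sketch of why sliced evaluation matters, with invented data. An overall metric can look tolerable while one slice is failing badly:

```python
# Sketch of sliced evaluation, the idea behind the Evaluator:
# compute a metric per feature slice, not just overall. Data invented.
from collections import defaultdict

# (feature_slice, prediction_was_correct) pairs
results = [
    ("weekday", True), ("weekday", True), ("weekday", True),
    ("weekday", True), ("weekend", True), ("weekend", False),
    ("weekend", False), ("weekend", False),
]

per_slice = defaultdict(list)
for slice_name, correct in results:
    per_slice[slice_name].append(correct)

overall = sum(c for _, c in results) / len(results)
accuracy = {s: sum(v) / len(v) for s, v in per_slice.items()}

print(round(overall, 3))    # 0.625 -- looks tolerable overall...
print(accuracy["weekend"])  # 0.25  -- ...but one slice is failing badly
```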
ModelValidator asks that question
that I talked about a little while
ago where you have a model that's in production,
you have this new model that you just trained.
Is it better or worse than what I already have?
Should I push this thing to production?
And if you decide that you're going to push it to production,
then Pusher does that push.
Now production could be a number of different things.
You could be pushing it to a serving cluster using
TensorFlow Serving.
You could be pushing it to a mobile application using
TensorFlow Lite.
You could be pushing it to a web application or a Node.js
application using TensorFlow.js.
Or you could even just be taking that model
and pushing it into a repo with TensorFlow Hub
that you might use later for transfer learning.
So there's a number of different deployment targets.
And you can do all the above with Pusher.
Bulk infer is that component that we
talked about a little while ago where
we're able to take bulk inference requests
and run inference across them, and do that in a managed
way that allows us to take that result and move it off.
All right, orchestrating.
We have a number of tasks in our pipeline.
How do we orchestrate them?
Well, there's different ways to approach it.
You can do task-aware pipelines where you simply run a task,
you wait for it to finish, and you run the next task.
And that's fine.
That works.
But it doesn't have a lot of the advantages
that you can have with a task- and data-aware pipeline.
This is where we get our metadata.
So by setting up the dependencies
between our components and our metadata artifacts
in a task- and data-aware pipeline,
we're able to take advantage of a lot of the information
that that ML deployment generates in the artifacts
that we've produced over the lifetime
of that product or service.
Orchestration is done through an orchestrator.
And the question is, which orchestrator
do you have to use.
Well, the answer is you can use whatever you want to use.
We have three orchestrators that are supported out of the box--
Apache Airflow, Kubeflow, for a Kubernetes
containerized environment, and Apache Beam.
Those three are not your only selections.
You can extend that to add your own orchestration.
But in the end, you're going to end up
with essentially the same thing, regardless
of which orchestrator you're going to use.
You're going to end up with a directed acyclic graph, or DAG,
that expresses the dependencies between your components, which
are really a result of the artifacts that are produced
by your components.
So here's three examples.
They look different, but actually if you look at them,
they are the same DAG.
We get this question a lot, so I want to address this.
What's this cube thing and this TFX thing,
and what's the difference between the two?
The answer is Kubeflow is really focused on a Kubernetes
containerized environment.
And it's a great deployment platform
for running in a very scalable manageable way.
Kubeflow Pipelines uses TFX.
So you're essentially deploying TFX in a pipeline
environment on Kubernetes.
And that's Kubeflow Pipelines.
But TFX can be deployed in other ways as well.
So if you don't want to use Kubeflow pipelines,
if you want to deploy in a different environment, maybe
on-prem in your own data center or what have you,
you can use TFX in other environments as well.
One of the things that we do, because we're
working with large data sets and some
of these operations that we're going
to do require a lot of processing,
is distribute that processing over a pipeline.
So how do we do that?
Well, a component that uses a pipeline
is going to create a pipeline for the operations
that it wants to do.
It's going to hand that pipeline off to a cluster.
Could be a Spark cluster, could be a Flink cluster,
could be Cloud Dataflow.
MapReduce happens on the cluster.
And it comes back with a result. But we
want to support more than just one or two types of distributed
pipelines.
So we're working with Apache Beam
to add an abstraction from the native layer of those pipelines
so that you can take the same code
and run it actually on different pipelines
without changing your code.
And that's what Apache Beam is.
There's different runners for different things
like Flink and Spark.
There's also a really nice one for development
called the direct runner, or local runner, that allows you
to run even just on a laptop.
But here's the vision for Beam.
There's a whole set of pipelines out there that are available.
They have strengths and weaknesses.
And in a lot of cases, you will already
have one that you've stood up and you
want to try to leverage that resource.
So by supporting all of them, you're
able to do this installation.
You don't have to spin up a completely different cluster
to do that.
You can leverage the ones or just expand the ones
that you already have.
So Beam allows you to do that, and also
with different languages.
Now in this case, we're only using Python.
But Beam as a vision, as an Apache project,
allows you to work with other languages
as well through different SDKs.
And now I'd like to introduce Charles Chen, who's
going to give us a demo of TFX running
on actually just a laptop system in the cloud here.
So you'll get to see some live code.
CHARLES CHEN: Thank you, Robert.
So now that we've gone into detail about TFX and TFX
components, I'd like to take the chance
to make this really concrete with a live demo
of a complete TFX pipeline.
So this demo uses the new experimental TFX notebook
integration.
And the goal of this integration is
to make it easy to interactively build up
TFX pipelines in a Jupyter or Google Colab notebook
environment.
So you can try your pipeline out before you export the code
and deploy it into production.
You can follow along and run this yourself
in the Google Colab notebook at this link here.
So for the interactive notebook, we introduced one new concept.
This is the interactive context.
In a production environment, like Robert said,
we would construct a complete graph of pipeline components
and orchestrate it on an engine like Airflow or Kubeflow.
By contrast, when you're experimenting in a notebook,
you want to interactively execute and see results
for individual components.
We construct an interactive context
which can do two things.
The first thing is it can run the component we define.
This is context.run.
And the second is it can show a visualization
of the component's output.
This is context.show.
Let's get started.
Here's the overview we've seen of a canonical TFX
pipeline where we go from data ingestion to data
validation, feature engineering, model training and model
validation to deployment.
We'll go through each of these in the notebook.
Here's the notebook here.
And the first thing we do is a bit of setup.
This is the pip install step.
And basically, we run pip install
to install the TFX Python package and all its dependencies.
And if you're following along, after doing the installation,
you need to restart the runtime so that the notebook picks up
some of the new versions of dependencies.
Next, we do some imports, set up some paths, download the data.
And finally, we create the interactive context.
Once we've done this, we get to our first component,
which is ExampleGen, which ingests data into the pipeline.
Again, this is just a couple of lines of code.
And we run this with context.run.
After we're done, we're ready to use
the data, which has been ingested and processed
into splits.
We take the output of ExampleGen and use
this in our next component, StatisticsGen,
which analyzes data and outputs detailed statistics.
This component can also be used standalone outside of a TFX
pipeline with the TensorFlow data validation package.
You can see that our input here is the output of ExampleGen
again.
And after we're done, we can visualize
this with context.show.
For each split, we get detailed summary statistics
and a visualization of our data, which
we can dig into to ensure data quality,
even before we train a model.
After that, we can use SchemaGen to infer a suggested
schema for your data.
We see that visualized here.
This includes the type and domain
for each feature in your data set.
For example, for categorical features,
the domain is inferred to be all the values we've seen so far.
This is just the starting point.
And you can then edit and curate the schema
based on your domain knowledge.
Once we have the schema, we can use the ExampleValidator
to perform anomaly detection.
That is, find items in your input data
that don't match your expected schema.
This is especially useful as your pipeline evolves over time
with new data sets coming in.
We visualize this.
And any unexpected values are highlighted.
If you see anomalies, you might want
to either update your schema or fix your data collection
process.
After our data is validated, we move onto data transformation
or feature engineering.
This is done in the Transform component.
The first thing we do is write a little bit of common code
and a preprocessing function with TensorFlow Transform.
You can look at this code in more detail on your own.
It defines the transformations we
do using TensorFlow Transform.
And this means for each feature of your data,
we define the individual transformations.
This comes together in the Transform component
where feature engineering is performed.
And we output the transform graph and the engineered
features.
This will take a bit of time to run.
And after that's done, we get to the heart
of the model with the trainer.
Here, we define a training function that
returns a TensorFlow estimator.
We build the estimator and return it from the function.
And this is just TensorFlow.
So once we have this, we run the Trainer component,
which is going to produce a trained model for evaluation
and serving.
You can watch it train here.
It'll give you the loss, evaluate, and then produce
a SavedModel for evaluation and serving.
After we've trained the model, we
have the Evaluator component.
This uses the standalone TensorFlow Model Analysis
library.
In addition to overall metrics over the entire data set,
we can define more granular feature column
slices for evaluation.
The Evaluator component then computes
metrics for each data slice, which
you can then visualize with interactive visualizations.
What makes TensorFlow Model Analysis really powerful
is that in addition to the overall metrics we see here,
we can analyze model performance on granular feature slices--
granular feature column slices, that is.
So here we see the metrics rendered
across one slice of our data.
And we can do even more granular things
with multiple columns of our data.
After we've evaluated our model, we come to the ModelValidator.
Based on a comparison of the performance
of your model against an existing baseline model,
this component checks whether or not
the model is ready to push to production.
Right now, since we don't have any existing models,
this check will by default return true.
You can also customize this check
by extending the ModelValidator executor.
The output of this check is then used in the next component,
the Pusher.
The Pusher pushes your model, again,
to a specific destination for production.
This can be TensorFlow Serving, a file system destination
or a cloud service like Google Cloud Platform.
Here, we configure the Pusher to write the model
to a file system directory.
Once we've done this, we've now architected a complete TFX
pipeline.
With minimal modifications, you can use your new pipeline
in production in something like Airflow or Kubeflow.
And essentially, this would be getting
rid of your usage of the interactive context
and creating a new pipeline object to run.
For convenience, we've included a pipeline export feature
in the notebook that tries to do this for you.
So here, well, first we do some housekeeping--
we mount Google Drive.
We select a runner type.
Let's say I want to run this on Airflow.
We set up some paths.
We specify the components of the exported pipeline.
We do the pipeline export.
And finally, we use this export cell
to get a zip archive of all you need
to run the pipeline on an engine like Airflow.
With that, I'll hand it back to Robert,
who will talk about how to extend TFX for your own needs.
ROBERT CROWE: Thanks, Charles.
All right, so that was the notebook.
Let me just advance here.
All right, custom components.
So again, these are the standard components
that come out of the box with TFX.
But you are not limited to those.
You can write your own custom components.
So let's talk about how to do that.
First of all, you can do a semi-custom component
by taking an existing component and working
with the same inputs, the same outputs, essentially
the same contract, but replacing the executor by just overriding
the existing executor.
And that executor, remember, is where the work is done.
So changing that executor is going
to change that component in a very fundamental way.
So if you're going to do that, you're
going to extend the base executor
and implement a Do() function.
And that's what the code looks like.
There's a custom config dictionary
that allows you to pass additional things
into your component.
So this is a fairly easy and powerful way
to create your own custom component.
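The shape of that override can be sketched in plain Python. The class names here are hypothetical, modeled on TFX's base executor with its Do() method, and the custom_config dictionary mirrors the one just mentioned:

```python
# Sketch of a semi-custom component: keep the component's contract,
# swap the executor. Class names are hypothetical, not the TFX API.
class BaseExecutor:
    def Do(self, input_dict, output_dict, exec_properties):
        raise NotImplementedError

class DefaultExecutor(BaseExecutor):
    def Do(self, input_dict, output_dict, exec_properties):
        output_dict["result"] = input_dict["data"]  # pass-through

class MyCustomExecutor(BaseExecutor):
    """Same inputs and outputs, completely different processing."""
    def Do(self, input_dict, output_dict, exec_properties):
        # Extra settings arrive via a custom_config dictionary.
        threshold = exec_properties.get("custom_config", {}).get("min", 0)
        output_dict["result"] = [
            x for x in input_dict["data"] if x >= threshold]

# The surrounding component stays the same; only the executor changes.
inputs, outputs = {"data": [1, 5, 10]}, {}
MyCustomExecutor().Do(inputs, outputs, {"custom_config": {"min": 5}})
print(outputs["result"])  # [5, 10]
```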
And this is how you would fit a custom component
into an existing pipeline-- it just fits in
like any other component.
You can also, though, do a fully custom component
where you have a different component
spec, a different contract, different inputs,
different outputs, that don't exist in an existing component.
And those are defined in the component spec that gives you
the parameters and the inputs and the outputs
to your component.
And then you are going to need an executor for that
as well, just like you did before.
But it takes you even further.
So your executor, your inputs, your outputs, that's
a fully custom component.
Now I've only got three minutes left.
I'm going to go through a quick example of really trying
to understand why model understanding and model
performance are very important.
First of all, I've talked about data
lifecycle a couple of times, trying to understand
how things change over time.
The ground truth may change.
Your data characteristics, the distribution
of each of your features may change.
Conditions in the world may change.
You may have different competitors.
You may expand into different markets, different geographies.
Styles may change.
The world changes.
Over the life of your deployment,
that becomes very important.
So in this example, and this is a hypothetical example,
this company is an online retailer who is selling shoes.
And they're trying to use click-through rates
to decide how much inventory they should order.
And they discover that suddenly--
they've been going along and now on a particular slice
of their data--
not their whole data set, just a slice of it--
things have really gone south.
So they've got a problem.
What do they do?
Well, first of all, it's important to understand
the realities around this.
Mispredictions do not have uniform cost.
Across your business or whatever service you're providing,
different parts of that will have different costs.
The data you have is never the data that you wish you had.
And the model, as in this case, the model objective
is often a proxy for what you really want to know.
So they're trying to use click-through rates as a proxy
for ordering inventory.
But the last one, the real world doesn't stand still,
is the key here.
You need to really understand that
when you go into a production environment.
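The point above that mispredictions do not have uniform cost can be made concrete with a sketch. All the numbers here are invented for illustration:

```python
# Sketch: weighting errors by a per-slice business cost can change
# which model looks better. Slice names and numbers are invented.
# errors per slice: {slice: (num_errors, cost_per_error_in_dollars)}
model_a = {"popular_shoe": (10, 50.0), "niche_shoe": (2, 5.0)}
model_b = {"popular_shoe": (4, 50.0), "niche_shoe": (20, 5.0)}

def total_errors(errors):
    return sum(n for n, _ in errors.values())

def total_cost(errors):
    return sum(n * cost for n, cost in errors.values())

# Model A makes fewer errors overall, but model B is cheaper,
# because its errors fall on the low-cost slice.
print(total_errors(model_a), total_errors(model_b))  # 12 24
print(total_cost(model_a), total_cost(model_b))      # 510.0 300.0
```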
So what can they do?
Well, their problems are not with the data set that they
use to train their model.
Their problems are with the current inference requests
that they're getting.
And there's a difference between those two.
So how do they deal with that?
Well, they're going to need labels.
Assuming they're doing supervised learning,
they're going to need to label those inference requests
somehow.
How can they do that?
If they're in the fortunate position
of being able to get direct feedback,
they can use their existing processes to label that data.
So for example, if they're trying
to predict the click-through rate
and they have click-through rate data that they're collecting,
they can use that directly.
That's great.
Many people are not in that situation.
So you see a lot of environments where
you're trying to use things like semi-supervision and humans
to label a subset of the data that you have,
so you can try
to understand how things have changed since you
trained your model.
Weak supervision is a very powerful tool as well.
But it's not that easy to use in a lot of cases.
You need to try to apply historical data or other types
of modeling, heuristics.
And in many cases, those are giving you a labeling signal
that is not 100% accurate.
But it gives you some direction and you can work.
There are modeling techniques to work
with that kind of a signal.
TensorFlow Model Analysis.
The fairness indicators that you may have seen today--
and that we're out on the show floor with--
those are great tools to try to understand this and identify
the slices and the problems that you have with your data.
But first things first, you need to check your data,
look for outliers, check your feature space coverage.
How well does your data cover the feature space
that you have?
And use the tools that we give you
in TensorFlow Data Validation and TensorFlow Model Analysis.
We also have the what-if tool, a very powerful tool
for doing exploration of your data and your model.
And in the end, you need to quantify the cost because you
are never going to get 100%.
So how much is that extra 5% worth?
In a business environment, you need to understand that.
And that's TFX.
TFX is, again, what Google built because we needed it.
And now, we want you to have it too.
So it's available as an open-source platform that we
encourage you all to build on.
And since it's open source, we hope
you're going to help us contribute and build
the platform to make it better in the future as well.
So on behalf of myself and my colleague
Charles, thanks very much.
[APPLAUSE]