[MUSIC PLAYING]
-
EWA MATEJSKA: Hi, everyone.
-
Thank you for joining us.
-
I'm Ewa Matejska, and I'm technical program manager
-
on the TensorFlow team.
-
ZHITAO LI: Hi, my name is Zhitao.
-
I'm a software engineer from Google's TensorFlow Extended
-
team, TFX.
-
EWA MATEJSKA: Today, we'll be talking
-
about a brand new feature: the addition of native Keras
-
model support through TFX pipelines.
-
So could you tell me what's TFX, what are TFX pipelines,
-
and what is native Keras model support?
-
ZHITAO LI: Happy to do that.
-
TFX is Google's production-ready machine learning platform.
-
TFX pipelines is something we released last year
-
to bring the pipeline experience to open source users,
-
as well as the Google Cloud users.
-
And the native Keras support is something
-
we started working on last October
-
to make sure our TensorFlow 2 users
-
can use the native Keras API inside TFX
-
to train their machine learning models.
-
EWA MATEJSKA: What can I do with TFX pipelines?
-
ZHITAO LI: So you can ingest data into TFX,
-
do data processing and data understanding,
-
do feature engineering on top of your data,
-
train the TensorFlow model, do model analysis,
-
and model validation on your model,
-
and then finally, when everything is ready,
-
push the model onto production-ready [INAUDIBLE]
-
solutions.
-
EWA MATEJSKA: Awesome.
-
I'm excited to see the native Keras support.
-
So what do I do?
-
ZHITAO LI: Let me show that in this notebook.
-
So this is a public notebook from the TFX team
-
to demonstrate how to use various components in TFX.
-
This notebook also trains with native Keras.
-
I'm going to show how to do it that way.
-
So to do that, we first
-
need to install TFX and various software packages,
-
including TensorFlow and TensorBoard.
-
We make sure all the packages are preloaded
-
and then make sure the software versions are correct.
-
After that, we set our pipeline path
-
to make sure we can correctly access all the data we need.
-
EWA MATEJSKA: OK, and what kind of model will you be using?
-
What kind of data?
-
ZHITAO LI: So the data set here is the
-
public taxi data set from the city of Chicago.
-
And the problem we're going to solve
-
is to predict whether the driver will receive
-
a tip of more than 20% of the fare, which we call
-
[INAUDIBLE].
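The labeling rule described here can be sketched in plain Python. This is only a minimal illustration: the field names `tips` and `fare`, the function name, and the handling of zero fares are assumptions, since the notebook's actual label code is not shown in this conversation.

```python
def is_big_tipper(tips: float, fare: float, threshold: float = 0.2) -> int:
    """Return 1 if the tip is more than `threshold` of the fare, else 0."""
    # Treating zero or negative fares as "not a big tip" is an
    # assumption of this sketch, not something stated in the talk.
    if fare <= 0:
        return 0
    return int(tips > threshold * fare)


# Made-up rides, for illustration only.
rides = [
    {"fare": 10.0, "tips": 2.5},  # 25% tip
    {"fare": 20.0, "tips": 2.0},  # 10% tip
    {"fare": 0.0, "tips": 0.0},   # zero fare
]
labels = [is_big_tipper(r["tips"], r["fare"]) for r in rides]
```

Each ride then carries a binary label, which is what makes this a classification problem.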
-
So we are going to download the example data to the path,
-
making sure the data here is loadable.
-
Check the first couple of lines.
-
Then we create the interactive context,
-
which helps us run each component of TFX pipelines
-
in the notebook.
-
EWA MATEJSKA: Is Interactive context a new API?
-
ZHITAO LI: Interactive context is an API from last October.
-
This can help us to run each component of the TFX pipeline
-
in a notebook.
-
So we first start with the ExampleGen.
-
This ingests the data into the pipeline
-
and transforms it into [INAUDIBLE] examples.
-
We can check the first couple of examples,
-
making sure they're correct.
-
Then we can use the StatisticsGen component
-
to generate some statistics for the data.
-
EWA MATEJSKA: Can you tell me a little more
-
about the statistics?
-
ZHITAO LI: Sure.
-
The statistics tell us, for each of the features in the data
-
set, what's the distribution?
-
How many [INAUDIBLE] records are there?
-
Minimum value, maximum value, median value, et cetera,
-
et cetera.
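To make the per-feature statistics concrete, here is a rough standard-library sketch of the kind of summary being described. It is not the StatisticsGen implementation; the function name, the use of `None` for a missing value, and the sample numbers are all assumptions made for illustration.

```python
import statistics
from typing import Dict, List, Optional


def summarize_feature(values: List[Optional[float]]) -> Dict[str, float]:
    """Summarize one numeric feature, skipping missing (None) entries."""
    # Assumes at least one value is present; min() would fail otherwise.
    present = [v for v in values if v is not None]
    return {
        "count": len(present),
        "missing": len(values) - len(present),
        "min": min(present),
        "max": max(present),
        "median": statistics.median(present),
        "mean": statistics.mean(present),
    }


# Made-up trip distances with one missing entry.
trip_miles = [1.0, 0.5, None, 3.0, 2.5]
summary = summarize_feature(trip_miles)
```

Running the same summary over every column gives a quick picture of each feature's range and how complete it is.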
-
EWA MATEJSKA: OK, cool.
-
ZHITAO LI: And we can also generate a schema out
-
of the data, which will tell us, in an aggregated view,
-
what the data looks like.
-
And we can list out all the schemas from here.
-
We can also use the example validator
-
to make sure the data is correct.
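The idea of inferring a schema from the data and then checking examples against it can be sketched as follows. This is a simplified plain-Python stand-in, not the TFX schema or example validator components; `infer_schema`, `find_anomalies`, and the sample records are hypothetical names and values chosen for illustration.

```python
from typing import Any, Dict, List


def infer_schema(records: List[Dict[str, Any]]) -> Dict[str, str]:
    """Map each feature name to the first Python type name seen for it."""
    schema: Dict[str, str] = {}
    for record in records:
        for name, value in record.items():
            if value is not None:
                schema.setdefault(name, type(value).__name__)
    return schema


def find_anomalies(record: Dict[str, Any], schema: Dict[str, str]) -> List[str]:
    """List the ways one record disagrees with the schema."""
    problems = []
    for name, expected in schema.items():
        if name not in record or record[name] is None:
            problems.append(f"{name}: missing")
        elif type(record[name]).__name__ != expected:
            problems.append(f"{name}: expected {expected}")
    return problems


# Made-up training records and one bad record to check.
training = [
    {"fare": 12.5, "payment_type": "Cash"},
    {"fare": 7.0, "payment_type": "Credit Card"},
]
schema = infer_schema(training)
anomalies = find_anomalies({"fare": "unknown"}, schema)
```

The design point is that the schema is derived once from trusted data, and later data is judged against it rather than against hand-written rules.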
-
Now, we can use transform to do feature engineering
-
on top of our existing data.
-
To do that, people simply write a preprocessing function,
-
which takes the original inputs and then
-
uses Python functions to define the transforms on them.
-
And we can easily capture all these transforms in the result.
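As a rough picture of what such a preprocessing function does, here is a standard-library sketch that scales a numeric column and maps a string column to vocabulary indices. It deliberately does not use the real Transform library, which works on tensors; the column names, the two chosen transforms, and the list-based inputs are assumptions for illustration only.

```python
import statistics
from typing import Dict, List


def preprocessing_fn(inputs: Dict[str, List]) -> Dict[str, List]:
    """Turn columns of raw features into columns of model-ready features."""
    outputs = {}

    # Scale a numeric column to zero mean and unit variance (z-score).
    # Assumes the column is not constant, so the deviation is non-zero.
    fares = inputs["fare"]
    mean = statistics.mean(fares)
    stdev = statistics.pstdev(fares)
    outputs["fare_scaled"] = [(f - mean) / stdev for f in fares]

    # Replace each string with its index in a sorted vocabulary.
    vocab = sorted(set(inputs["payment_type"]))
    index = {term: i for i, term in enumerate(vocab)}
    outputs["payment_type_id"] = [index[p] for p in inputs["payment_type"]]

    return outputs


# Made-up raw columns.
raw = {
    "fare": [5.0, 10.0, 15.0],
    "payment_type": ["Cash", "Credit Card", "Cash"],
}
features = preprocessing_fn(raw)
```

Both transforms need a full pass over the data (the mean, the deviation, the vocabulary), which is why they are defined once in a single function and applied the same way everywhere.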
-
Now, to support native Keras,
-
we ask users to write their TensorFlow training code as
-
if they're just writing Keras-based
-
code in a normal environment.
-
The model type we are using here
-
is a wide and deep model.
-
We simply ask people to write their training code.
-
This is a--
-
EWA MATEJSKA: Wide and deep model, you said?
-
ZHITAO LI: Yes.
-
Build a Keras model.
-
People can build a wide and deep classifier.
-
And once this classifier is defined using the native Keras
-
API, they can wrap that in the run function.
-
The run function will then be fed
-
into the TFX trainer executor.
-
And we expect the function to export a SavedModel.
-
After that, we kick off the training component.
-
And we can see the training happen.
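For readers unfamiliar with the term, a wide and deep classifier adds the output of a linear ("wide") part to the output of a small neural network ("deep") part before a single sigmoid. The sketch below shows only that forward computation in plain Python; it is not the notebook's Keras code, and every weight, input, and function name here is made up for illustration.

```python
import math
from typing import List


def dense_relu(x: List[float], weights: List[List[float]], bias: List[float]) -> List[float]:
    """One fully connected layer followed by ReLU."""
    return [
        max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(weights, bias)
    ]


def wide_and_deep(wide_x, deep_x, wide_w, hidden_w, hidden_b, out_w, bias) -> float:
    """Probability from a linear (wide) part plus a small network (deep) part."""
    wide_logit = sum(w * x for w, x in zip(wide_w, wide_x))
    hidden = dense_relu(deep_x, hidden_w, hidden_b)
    deep_logit = sum(w * h for w, h in zip(out_w, hidden))
    # Both parts feed one shared sigmoid output.
    return 1.0 / (1.0 + math.exp(-(wide_logit + deep_logit + bias)))


prob = wide_and_deep(
    wide_x=[1.0, 0.0],    # e.g. one-hot categorical features
    deep_x=[0.5, -1.0],   # e.g. scaled numeric features
    wide_w=[0.4, -0.2],
    hidden_w=[[1.0, 0.0], [0.0, 1.0]],
    hidden_b=[0.0, 0.0],
    out_w=[0.6, 0.3],
    bias=-0.7,
)
```

In a real model the weights are learned during training rather than written by hand, and the deep part usually has several layers.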
-
EWA MATEJSKA: OK, awesome.
-
ZHITAO LI: The training really happened
-
in the Jupyter Notebook.
-
We see these are the features we are using.
-
These are the advanced features.
-
These are the layers we used in the model.
-
And we trained the model for 10,000 steps.
-
And then we exported the model [INAUDIBLE].
-
EWA MATEJSKA: So this is a lot of meaty content.
-
How can I follow along at home?
-
ZHITAO LI: Sure.
-
So feel free to check out the TensorFlow.org/TFX page.
-
That is our home page.
-
We have all the tutorials, API docs,
-
as well as component guides available there.
-
And feel free to reach out to us on either GitHub or the TFX
-
Google Group.
-
EWA MATEJSKA: And I have one last question
-
for you, a high-level question.
-
How do I take this out to production?
-
ZHITAO LI: Oh, sure.
-
Happy to do that.
-
So to do that, you can simply use the pusher component
-
to push the model onto various types of production-ready
-
serving solutions, including TensorFlow Serving,
-
or mobile devices using TensorFlow Lite, or TensorFlow
-
Hub.
-
EWA MATEJSKA: Thank you so much for showing me
-
a little bit about the native Keras model support.
-
And thank you for joining us.
-
ZHITAO LI: Thank you.
-
[MUSIC PLAYING]