NICK: Hi, everyone.
My name is Nick.
I am an engineer on the TensorBoard team.
And I'm here today to talk about TensorBoard and Summaries.
So first off, just an outline of what I'll be talking about.
First, I'll give an overview of TensorBoard, what it is
and how it works, just mostly sort of as background.
Then I'll talk for a bit about the tf.summary APIs.
In particular, how they've evolved from TF 1.x to TF 2.0.
And then finally, I'll talk a little bit
about the summary data format, log directories, event files,
some best practices and tips.
So let's go ahead and get started.
So TensorBoard-- hopefully, most of you
have heard of TensorBoard.
If you haven't, it's the visualization toolkit
for TensorFlow.
That's a picture of the web UI on the right.
Typically, you run this from the command line as the tensorboard
command.
It prints out a URL.
You view it in your browser.
And from there on, you have a bunch of different controls
and visualizations.
And the sort of key selling point of TensorBoard
is that it provides cool visualizations out of the box,
without a lot of extra work.
You basically can just run it on your data
and get a bunch of different kinds of tools
and different sort of analyses you can do.
So let's dive into the parts of TensorBoard
from the user perspective a little bit.
First off, there are multiple dashboards.
So we have this sort of tabs setup
with dashboards across the top.
In the screenshot, it shows the scalars dashboard, which
is kind of the default one.
But there are also dashboards for images, histograms, graphs,
and a whole bunch more are being added almost every month.
And one thing that many of the dashboards have in common
is this ability to sort of slice and dice
your data by run and by tag.
And a run, you can think of that as a single
run of your TensorFlow program,
or your TensorFlow job.
And a tag corresponds to a specific named metric,
or a piece of summary data.
So here, for the runs, we have a train
and an eval run in the run selector in the lower left corner.
And then we have different tags, including the cross [INAUDIBLE]
tag, which is the one being visualized.
And one more thing I'll mention is that a lot
of TensorBoard emphasizes seeing how
your data changes over time.
So most of the data takes the form of a time series.
And in this case, with the scalars dashboard,
the time series is plotted with the step count along the x-axis.
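To make the run and tag idea concrete, here is a minimal sketch of how data organized that way could be laid out and sliced. It is not TensorBoard's actual code. The names `series`, `log_scalar`, and `select`, and the tag name "loss", are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical in-memory layout: run name -> tag name -> list of (step, value).
series = defaultdict(lambda: defaultdict(list))

def log_scalar(run, tag, step, value):
    """Record one scalar point for a given run and tag."""
    series[run][tag].append((step, value))

# Two runs ("train" and "eval") logging the same hypothetical tag "loss".
for step in range(3):
    log_scalar("train", "loss", step, 1.0 / (step + 1))
    log_scalar("eval", "loss", step, 1.5 / (step + 1))

def select(runs, tag):
    """Return the time series for one tag across the selected runs."""
    return {run: series[run][tag] for run in runs if tag in series[run]}

# Slice by run and tag, the way a run selector and tag filter would.
selected = select(["train"], "loss")
print(selected)
```

Each series is a list of (step, value) pairs, so the step is what ends up on the x-axis.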
So we might ask, what's going on behind the scenes
to make this all come together?
And so here is our architecture diagram for TensorBoard.
We'll start over on the left with your TensorFlow job.
It writes data to disk using the tf.summary API.
And we'll talk both about the summary API and the event file
format a little more later.
Then the center component is TensorBoard itself.
We have a background thread that loads event file data.
And because the event file data itself
isn't efficient for querying, we construct a subsample
of the data in memory that we can query more efficiently.
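One common way to keep a bounded subsample of a long stream is reservoir sampling. The talk only says "a subsample", so treat the technique and everything in this sketch as an assumption; the `Reservoir` class and its parameters are hypothetical names, not TensorBoard's implementation.

```python
import random

class Reservoir:
    """Keep at most `capacity` items from a stream, each with equal probability."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []
        self.seen = 0
        self._rng = random.Random(seed)

    def add(self, item):
        self.seen += 1
        if len(self.items) < self.capacity:
            # Still filling up: keep everything.
            self.items.append(item)
        elif self._rng.random() < self.capacity / self.seen:
            # Evict a random kept item, then append, so items stay in arrival order.
            self.items.pop(self._rng.randrange(len(self.items)))
            self.items.append(item)

reservoir = Reservoir(capacity=10)
for step in range(1000):
    reservoir.add((step, 1.0 / (step + 1)))  # (step, value) pairs

steps = [step for step, _ in reservoir.items]
print(len(steps), steps == sorted(steps))
```

Memory stays fixed no matter how long the job runs, and the kept points remain ordered by step, which is what a time series plot needs.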
And then the rest of TensorBoard is a web server that
has a plugin architecture.
So each dashboard on the frontend
has a specific plugin
backend. So for example, the scalars dashboard talks
to a scalars backend, images to an images backend.
And this allows the backends to do pre-processing or otherwise
structure the data in an appropriate way
for the frontend to display.
And then each plugin has a frontend dashboard component,
which are all compiled together by TensorBoard
and served as a single page, index.html.
And that page communicates back and forth with the backends
through standard HTTP requests.
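As a rough sketch of that plugin idea, each backend can expose its own routes and a small router can dispatch requests to them. This is only an illustration under assumptions: the `ScalarsPlugin` class, the `/plugin/<name>` prefix, and the `handle` function are hypothetical and do not reflect TensorBoard's real plugin interface or URL scheme.

```python
import json

class ScalarsPlugin:
    """Hypothetical backend for a scalars dashboard."""
    name = "scalars"

    def __init__(self, data):
        self._data = data  # run -> tag -> list of (step, value)

    def routes(self):
        # Map a route suffix to the handler that serves it.
        return {"/tags": self.tags, "/series": self.series}

    def tags(self, params):
        return {run: sorted(tags) for run, tags in self._data.items()}

    def series(self, params):
        return self._data[params["run"]][params["tag"]]

def build_router(plugins):
    """Collect every plugin's routes under a /plugin/<name> prefix."""
    router = {}
    for plugin in plugins:
        for suffix, handler in plugin.routes().items():
            router[f"/plugin/{plugin.name}{suffix}"] = handler
    return router

def handle(router, path, params):
    """Dispatch one request and return (status, JSON body)."""
    handler = router.get(path)
    if handler is None:
        return 404, json.dumps({"error": "unknown route"})
    return 200, json.dumps(handler(params))

data = {"train": {"loss": [(0, 1.0), (1, 0.5)]}}
router = build_router([ScalarsPlugin(data)])
status, body = handle(router, "/plugin/scalars/series",
                      {"run": "train", "tag": "loss"})
print(status, body)
```

The point of the design is that a backend decides how to shape its data, and the frontend only has to ask for it over HTTP.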
And then finally, hopefully, we have our happy user
on the other end seeing their data,
analyzing it, getting useful insights.
And I'll talk a little more about just some details
about the frontend.
The frontend is built on the Polymer web component
framework, where you define custom elements.
So the entirety of TensorBoard is one large custom element,
tf-tensorboard.
But that's just the top.
From there on, each plugin frontend--
each dashboard is its own frontend component.
For example, there's a tf-scalar-dashboard.
And then all the way down to shared components
for more basic UI elements.
So we can think of this as a button, or a selector,
or a card element, or a collapsible pane.
And these components are shared across many of the dashboards.
And that's one of the key ways in which TensorBoard
achieves what is hopefully a somewhat uniform look
and feel from dashboard to dashboard.
The actual logic for these components
is implemented in JavaScript.
Some of that's actually TypeScript
that we compile to JavaScript.
Especially the more complicated visualizations,
TypeScript helps build them up as libraries
without having to worry about some of the pitfalls
you might get writing them in pure JavaScript.
And then the actual visualizations
are a mix of different implementations.
Many of them use Plottable, which
is a wrapper library over D3, the standard JavaScript
visualization library.
Some of them use native D3.
And then for some of the more complex visualizations,
there are libraries that do some of the heavy lifting.
So the graph visualization, for example,
uses a directed graph library to do layout.
The projector uses a WebGL wrapper library
to do the 3D visualizations.
And the recently introduced What-If Tool plugin
uses the facets library from [INAUDIBLE] folks.
So bringing a whole bunch of different visualization
technologies together under one TensorBoard umbrella
is how you can think about the frontend.
So now that we have an overview of TensorBoard itself,
I'll talk about how your data actually gets to TensorBoard.
So how do you unlock all of this functionality?
And the spoiler answer to that is the tf.summary API.
So to summarize the summary API, you
can think of it as structured logging for your model.
The goal is really to make it easy to instrument your model
code.
So to allow you to log metrics, weights,
details about predictions, input data, performance metrics,
pretty much anything that you might want to instrument.
And you can log these all, save them
to disk for later analysis.
And you won't necessarily always be calling the summary API
directly.
Some frameworks call the summary API for you.
So for example, Estimator has the SummarySaverHook.
Keras has a TensorBoard callback,
which takes care of some of the nitty gritty.
But underlying that is still the summary API.
So most data gets to TensorBoard in this way.
There are some exceptions.
Some dashboards have different data flows.
The debugger is a good example of this.
The debugger dashboard integrates with tfdbg.
It has a separate back channel that it uses
to communicate information.
It doesn't use the summary API.
But many of the commonly used dashboards do.
And so the summary API actually has sort of--
there are several variations.
And when talking about the variations,
it's useful to think of the API as having two basic halves.
On one half we have the instrumentation surface.
So these are like logging
ops that you place in your model code.
They're pretty familiar to people
who have used the summary API, things like scalar, histogram,
image.
And then the other half of the summary API
is about writing that log data to disk.
And creating a specially formatted log
file which TensorBoard can read and extract the data from.
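The split into two halves can be sketched with a toy logger. This is not the tf.summary API and not the real event file format. The helpers `scalar` and `histogram`, the `LogWriter` class, and the JSON-lines file `events.jsonl` are stand-ins invented here to show the shape of the idea.

```python
import json
import os
import tempfile

# Half 1: instrumentation. These helpers only describe what to log.
def scalar(tag, value, step):
    return {"kind": "scalar", "tag": tag, "step": step, "value": float(value)}

def histogram(tag, values, step):
    return {"kind": "histogram", "tag": tag, "step": step, "values": list(values)}

# Half 2: writing. The writer owns the log file for one run.
class LogWriter:
    def __init__(self, logdir, run):
        run_dir = os.path.join(logdir, run)
        os.makedirs(run_dir, exist_ok=True)
        # JSON lines is a stand-in; real event files use a different format.
        self.path = os.path.join(run_dir, "events.jsonl")

    def write(self, record):
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")

logdir = tempfile.mkdtemp()
writer = LogWriter(logdir, run="train")
for step in range(3):
    writer.write(scalar("loss", 1.0 / (step + 1), step))
writer.write(histogram("weights", [0.1, -0.2, 0.3], step=2))

# A reader, playing the role TensorBoard plays, loads the records back.
with open(writer.path, encoding="utf-8") as f:
    records = [json.loads(line) for line in f]
print(len(records), records[0])
```

Keeping the two halves separate means the calls placed in model code do not need to know where or how the data is stored.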
And so, just to give a sense of how those relate
to the different versions, there are
four variations of the summary API from TF 1.x to 2.0.