  • QIUMIN XU: Hello, everyone.

  • I am Qiumin.

  • I am a software engineer at Google working

  • on TensorFlow Performance.

  • Today I'm very excited to introduce you

  • to our brand new TensorFlow 2 Performance Profiler.

  • We all like speed, and we want our models to run faster.

  • TensorFlow 2 Profiler can help you

  • improve your model performance like a professional player.

  • In this talk, we're going to first talk

  • about what's new in TF2 Profiler,

  • and then we'll show you a case study.

  • I'm a performance engineer, and this

  • is how I used to start my day.

  • In the morning, I ran a model and captured a trace of it.

  • I would gather the profiling results

  • in a spreadsheet to analyze the bottlenecks

  • and optimize the model.

  • We also have gigabytes of traces,

  • and to process all of them manually

  • is boring and time-consuming.

  • Then, after that, we run the model again

  • to check for performance.

  • If your performance is quite good,

  • hooray, we have done our job.

  • Go and grab coffee.

  • Otherwise, we will go back to step one--

  • recapture a profile, gather results,

  • and find out the reason, fix it, and try again.

  • Repeat this iteration n times until the performance is good.

  • This is a typical day of a performance engineer.

  • Can we make it more productive?

  • The most repeated work here is to gather the trace information

  • and analyze the result. We always want to work smarter.

  • At Google, we found a way to build

  • tools to automatically process all the traces,

  • analyze them, and provide automated performance guidance.

  • It does intensive trace analysis,

  • learns from how Google internal experts tune the performance

  • and automates it for non-expert users.

  • Here's the thing I'm very excited about.

  • We are releasing this most useful set of internal tools

  • today as a TF2 Profiler.

  • The same set of tools in TF2 Profiler

  • has been used extensively inside Google,

  • and we are making it available to the public.

  • Let me introduce you to the toolset.

  • Today, we will launch eight tools.

  • Four of them are common to CPU, GPU, and TPUs.

  • This enables consistent metrics and analysis

  • across different platforms.

  • The first tool is called Overview Page.

  • This tool provides an overview of the performance

  • of the workload running on the device.

  • The second tool is Input Pipeline Analyzer.

  • It is a very powerful tool to analyze the TensorFlow Input

  • Pipeline.

  • TensorFlow reads data from files in a pipelined manner.

  • And an inefficient input pipeline severely

  • slows down your application.

  • This tool presents an in-depth analysis of your model input

  • pipeline performance, based on various performance

  • data collected.

  • At a high level, this tool tells you

  • whether your program is input bound.

  • If that is the case, the tool can also

  • walk you through the device and the host-side analysis

  • to debug which stage of the pipeline is the bottleneck.

  • The third tool we released today is called TensorFlow Stats.

  • TensorFlow Stats presents TensorFlow op statistics

  • in charts and tables.

  • The fourth tool we released today is called Trace Viewer.

  • The Trace Viewer tool displays a detailed event timeline

  • for in-depth performance debugging.

  • We also provide four tools that are TPU or GPU specific.

  • They are all available today on TensorFlow.

  • Please check them out.

  • Now let's look at the case study.

  • Let's assume that we are running an unoptimized ResNet50

  • model on a V100 GPU.

  • TF2 Profiler provides a number of ways to capture a profile.

  • In this talk, we will focus on the Keras callback.

  • To check out other ways of profiling,

  • including sampling and programmatic profiling,

  • refer to TensorFlow docs for more details.
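
As a rough illustration of the programmatic route mentioned here, the sketch below wraps a few training steps between `tf.profiler.experimental.start` and `tf.profiler.experimental.stop`. The helper name `profile_steps`, the `train_step` callable, and the `logs/profile` directory are assumptions made for this example; they do not appear in the talk.

```python
def profile_steps(train_step, num_steps, log_dir="logs/profile"):
    """Run `train_step` `num_steps` times while the profiler is recording.

    `train_step` is a hypothetical zero-argument callable that runs one
    training step; `log_dir` is an assumed output directory.
    """
    # Imported lazily so this helper can be defined without TensorFlow loaded.
    import tensorflow as tf

    tf.profiler.experimental.start(log_dir)
    try:
        for _ in range(num_steps):
            train_step()
    finally:
        # Always stop the profiler so the trace is written out,
        # even if a training step raises.
        tf.profiler.experimental.stop()
```

The trace written to `log_dir` can then be opened in TensorBoard in the same way as a profile captured through the callback.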

  • Using the Keras TensorBoard callback,

  • we simply need to add one additional line specifying

  • the profiling range.

  • The argument profile_batch, set to 150 to 160

  • here, indicates that we profile from batch 150 to batch 160.

  • Run the model, launch TensorBoard, and go to the Profile plugin.
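
A minimal sketch of the callback setup described above might look like the following. The helper name `make_profiling_callback` and the `logs` directory are assumptions for illustration, and passing the range as a tuple assumes a TensorFlow release whose `profile_batch` argument accepts a pair of batch indices.

```python
def make_profiling_callback(log_dir="logs", first_batch=150, last_batch=160):
    """Build a TensorBoard callback that profiles a range of batches.

    The defaults mirror the batch range quoted in the talk; `log_dir`
    is an assumed location for the TensorBoard logs.
    """
    # Imported lazily so this helper can be defined without TensorFlow loaded.
    import tensorflow as tf

    return tf.keras.callbacks.TensorBoard(
        log_dir=log_dir,
        # Profile from `first_batch` through `last_batch`.
        profile_batch=(first_batch, last_batch),
    )
```

The callback would be passed to training as `model.fit(..., callbacks=[make_profiling_callback()])`, after which `tensorboard --logdir logs` serves the captured profile.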

  • Here's a Performance Overview.

  • Let's stay here and look at the Performance Overview page.

  • It contains three sections--

  • Performance Summary, Step-time Graph,

  • and the Recommendation for the Next Step.

  • Let's zoom into each of them.

  • First, let's look at the performance summary.

  • It shows the average step-time and breaks

  • it down into the time spent on compilation,

  • input, output, kernel launches, and communication time.

  • The next is a step-time graph.

  • We can see the step-time is broken down

  • into compilation time, kernel launch,

  • compute, and communication as well,

  • and you can see how this breakdown changes

  • over a number of steps.

  • In this example, there's a lot of redness in this chart,

  • which indicates that the model is severely input bound.

  • The next is what I feel most excited about.

  • This is the recommendation provided by our tool.

  • It says your program is highly input bound

  • because 81.4% of the total step-time sampled

  • is waiting for input.

  • Therefore, we should first focus on reducing the input time.

  • The Overview page also provides a recommendation on which tool

  • you should check out next.

  • In this example, Input Pipeline Analyzer and the Trace Viewer

  • are the next tools to see.

  • In addition, this tool also suggests related useful

  • resources to check out to improve the input pipeline.

  • Let's follow this recommendation and check out the Input

  • Pipeline Analyzer tool.

  • See, this is the host analysis breakdown

  • provided by the tool.

  • It automatically detects that most of the time

  • is spent on data preprocessing.

  • What should we do next?

  • Our tool actually tells you what can

  • be done next to reduce the data preprocessing.

  • This is what is recommended by our tool.

  • You may increase the number of parallel calls

  • in the dataset map or process the data offline.

  • If you follow the link on the dataset map,

  • you will see how to do that.

  • According to the guide, we change the sequential map

  • to use parallel calls.

  • We should also not forget to try the most convenient

  • autotune option, which will tune the value

  • dynamically at runtime.
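
The change from a sequential map to parallel calls can be sketched roughly as follows. The helper name `parallel_map` and its arguments are hypothetical; the only TensorFlow features it relies on are the `num_parallel_calls` argument of `Dataset.map` and the autotune constant, which is exposed as `tf.data.AUTOTUNE` in newer releases and as `tf.data.experimental.AUTOTUNE` in older TF 2.x releases.

```python
def parallel_map(dataset, preprocess_fn, num_parallel_calls=None):
    """Apply `preprocess_fn` to a tf.data.Dataset using parallel calls.

    With `num_parallel_calls=None` the autotune option is used, so the
    level of parallelism is chosen dynamically at runtime.
    """
    # Imported lazily so this helper can be defined without TensorFlow loaded.
    import tensorflow as tf

    if num_parallel_calls is None:
        if hasattr(tf.data, "AUTOTUNE"):
            num_parallel_calls = tf.data.AUTOTUNE
        else:
            num_parallel_calls = tf.data.experimental.AUTOTUNE

    # A sequential map would be dataset.map(preprocess_fn) with no extra argument.
    return dataset.map(preprocess_fn, num_parallel_calls=num_parallel_calls)
```

Passing an explicit integer instead of `None` fixes the number of parallel calls, which can be useful when comparing profiles.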

  • After this optimization, let's capture a new profile.

  • Now you can see the redness is all

  • gone in the step-time graph, and the model

  • is no longer input bound.

  • Checking the performance summary again, you now get a 5x speedup.
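
A back-of-the-envelope check suggests this figure is plausible. Assuming that the 81.4% of step time spent waiting for input is removed almost entirely and that the rest of the step is unchanged (a simplification, not a claim made in the talk), the expected speedup is 1 / (1 - 0.814), a little over 5x. The function name below is made up for this example.

```python
def speedup_if_removed(fraction_removed):
    """Speedup obtained when a fraction of the step time is eliminated.

    Assumes the remaining part of the step takes exactly as long as before.
    """
    if not 0.0 <= fraction_removed < 1.0:
        raise ValueError("fraction_removed must be in [0, 1)")
    return 1.0 / (1.0 - fraction_removed)


# 81.4% of the step time was reported as waiting for input.
estimate = speedup_if_removed(0.814)
print(f"estimated speedup: {estimate:.2f}x")
```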

  • The Overview page now gives a different recommendation.

  • It says your program is not input bound because only 0.1%

  • of the total step-time sampled is waiting for input.

  • Therefore, you should instead focus on reducing other time.

  • Here's another thing we can do.

  • If you look at the other recommendations,

  • the model is all using 32 bits.

  • If you replace all of them with 16 bits, you can get a 10x speedup.
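
The talk does not say how the switch to 16 bits is made. One possible route in Keras, shown here only as an assumption, is the mixed precision policy, which runs most computation in float16 while keeping variables in float32. The call below exists under this name in newer TensorFlow releases; older TF 2.x releases expose it under an experimental namespace. The helper name `enable_mixed_precision` is hypothetical.

```python
def enable_mixed_precision():
    """Switch Keras to the mixed_float16 policy for layers created afterwards.

    This is mixed precision rather than a literal replacement of every
    32-bit value: computation uses float16, variables stay in float32.
    """
    # Imported lazily so this helper can be defined without TensorFlow loaded.
    import tensorflow as tf

    tf.keras.mixed_precision.set_global_policy("mixed_float16")
```

The policy would need to be set before the model is built, and the actual speedup depends on the hardware and the model.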

  • This release is just the beginning,

  • and we have more features upcoming.

  • We are working on Keras-specific analysis

  • and multi-worker GPU analysis.

  • Stay tuned.

  • We also welcome your feedback, so please let us

  • know and contribute your ideas.

  • TensorFlow 2 Profiler is the tool

  • you need for investigating TF2 performance.

  • It works on CPU, GPU, and TPU.

  • Here are more things to read--

  • a tutorial, a guide, and the GitHub source code.

  • There are also two more related talks on performance

  • tuning this afternoon.

  • They are super exciting, so don't miss them.

  • Finally, I want to thank everyone

  • who worked on this project.

  • You are super amazing teammates.

Performance profiling in TF 2 (TF Dev Summit '20)

  • 林宜悉 posted on 2020/03/25