
  • [MUSIC PLAYING]

  • TIM DAVIS: Hi, everyone.

  • My name's Tim Davis and I'm a product manager in TensorFlow.

  • And I'm presenting today with TJ, a TensorFlow Lite engineer

  • who'll be speaking in a bit.

  • I'm super excited to be here tonight

  • to talk about all the improvements we've

  • made over the last few months for TensorFlow Lite.

  • So, first of all, what is TensorFlow Lite?

  • Hopefully many of you know this by now,

  • but we love to re-emphasize it and also provide

  • context for users who are new to TensorFlow Lite.

  • So TensorFlow Lite is our production-ready framework

  • for deploying ML models on mobile devices

  • and embedded systems.

  • TensorFlow Lite can be deployed on Android, iOS,

  • Linux, and other platforms used in edge computing.

  • Now, let's talk about the need for TensorFlow Lite

  • and why we built an on-device ML solution.

  • We are in the midst of a huge demand

  • for doing ML on the edge.

  • It's driven by the need for user experiences

  • that require low latency, that work in situations

  • with poor network connectivity, and that enable

  • privacy-preserving features.

  • All of these are the reasons why we built TF Lite back in 2017.

  • And just look at our journey since then.

  • As the world's leading mobile ML framework,

  • we have made a ton of improvements across the board.

  • Recently we've increased the ops we support, delivered

  • numerous performance improvements,

  • developed tools which allow you to optimize models

  • with techniques like quantization, increased

  • language support for our APIs, and there'll

  • be more on that in a bit.

  • And we're supporting more hardware, like GPUs and DSPs.

  • Now, you're probably wondering how many devices we're on now.

  • Boom.

  • TensorFlow Lite is now on more than 4 billion devices

  • around the world--

  • across many different apps.

  • Many of Google's own largest apps are using it,

  • as are a large number of apps from external companies.

  • This is a sampling of some of the apps

  • that use TensorFlow Lite--

  • Google Photos, Gboard, YouTube, and the Assistant,

  • along with really popular third-party apps like Uber,

  • Hike, and many more.

  • So what is TensorFlow Lite being used for, you might ask?

  • Developers are using TF Lite for use cases

  • around image, text, and speech, but we

  • are seeing a lot of new and emerging use cases around audio

  • and content generation.

  • For the rest of the talk, we're going

  • to focus on some of the latest updates and highlights.

  • First up, let's talk about how we're helping developers get

  • started quickly and easily with TensorFlow Lite.

  • At TF World, we announced the TF Lite Support Library,

  • and today we are announcing a series of extensions to it.

  • First, we are adding more APIs, such as our image API,

  • and introducing new, simpler language APIs,

  • all enabling developers to simplify their development.

  • We are also adding Android Studio integration,

  • which will be available in the Canary channel in a couple of weeks.

  • That will enable simple drag and drop into Android Studio

  • and then automatically generate Java classes for the TF Lite

  • model with just a few clicks.

  • This is powered by our new CodeGen capability.

  • CodeGen makes it easy for TensorFlow Lite developers

  • to use a TF Lite model by handling the various details

  • around inputs and outputs of the model

  • and saves you a heap of time.

  • Here's a small example to show you what I mean.

  • With the additions to the Support Library,

  • you can now load a model, set an input on it,

  • and then run the model, then easily get

  • access to the resulting classifications.
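The Support Library and generated wrapper described here are Java classes for Android. As a rough equivalent, here is a minimal sketch of the same load / set-input / run / read-output flow using the TF Lite Python interpreter; the model filename is a placeholder.

```python
# Minimal sketch of the load / set-input / run / read-output flow with the
# TF Lite Python interpreter (the Java wrapper generated by CodeGen wraps the
# same steps). "model.tflite" is a placeholder path.
import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Set the input tensor (here filled with zeros of the right shape and dtype).
dummy_input = np.zeros(input_details[0]["shape"], dtype=input_details[0]["dtype"])
interpreter.set_tensor(input_details[0]["index"], dummy_input)

# Run the model, then read back the resulting classification scores.
interpreter.invoke()
scores = interpreter.get_tensor(output_details[0]["index"])
```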

  • The CodeGen reads the model metadata

  • and automatically generates the Java wrapper

  • with the model-specific API and code snippet for you.

  • This makes it easy to consume

  • and develop with TF Lite models without any ML expertise.

  • And this is a sneak peek of how this

  • will look in Android Studio.

  • Here you can see all of the model metadata from the drag

  • and drop TF Lite model.

  • This will then CodeGen Java classes--

  • just for image models to begin with, but later

  • for many different types of TensorFlow Lite models.

  • How cool is that?

  • We're really committed to making mobile ML super, super easy.

  • Check this out in the next couple of weeks on the Android

  • Studio Canary channel.

  • In addition to CodeGen, all of this

  • is made possible through the new extended model metadata.

  • Model authors can provide a metadata spec

  • when creating and converting models,

  • making it easier for users of the model

  • to understand how it works

  • and how to use it in production.

  • Let's take a look at an example.

  • The metadata descriptor provides additional information

  • about what the model does, the expected format

  • of the model inputs, and the meaning of the model outputs.

  • All of this is encoded via a simple schema,

  • and our new release has tools to help

  • you generate the right metadata for your model.
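The talk doesn't show that tooling directly. As a hedged sketch, the metadata writers in the tflite-support package can attach this kind of descriptor to an image classifier; the module paths below come from a later tflite-support release and may differ, and the paths and normalization values are placeholders.

```python
# Hedged sketch: attach input/output metadata to an image classifier model
# using the tflite-support metadata writers. File paths and normalization
# values are placeholders; module names are from a later release.
from tflite_support.metadata_writers import image_classifier
from tflite_support.metadata_writers import writer_utils

writer = image_classifier.MetadataWriter.create_for_inference(
    writer_utils.load_file("model.tflite"),
    input_norm_mean=[127.5],            # expected input normalization
    input_norm_std=[127.5],
    label_file_paths=["labels.txt"])    # gives meaning to the output indices

writer_utils.save_file(writer.populate(), "model_with_metadata.tflite")
```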

  • We have made our pretrained model repository much richer

  • and added many more models; all of this

  • is available via TensorFlow Hub.

  • We've got new mobile-friendly flavors

  • of BERT, including MobileBERT and ALBERT, for on-device NLP

  • applications, in addition to EfficientNet-Lite.

  • Again, all of these are hosted on our central model repository

  • TensorFlow Hub.

  • So check out TF Hub for all the details.

  • Now, let's talk about transfer learning.

  • Having a repository of ready-to-use models

  • is great for getting started and trying them out,

  • but developers regularly want to customize these models

  • with their own data.

  • That's why we're releasing a set of APIs which

  • make it easy to customize these models using transfer learning,

  • and we're calling this TF Lite Model Maker.

  • It's just four lines of code.

  • You start by specifying your data set,

  • then choose which model spec you would like to use,

  • and, boom, it just works.

  • You can see some stats on how the model performs

  • and, lastly, export it as a TF Lite model.
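As a rough sketch of those four steps for image classification: the module paths below follow the tflite-model-maker pip package and may differ by release, and the dataset folder is a placeholder.

```python
# Hedged sketch of the four-step Model Maker flow for image classification.
# Module paths follow the tflite-model-maker package and may differ by
# release; "flower_photos/" is a placeholder dataset folder.
from tflite_model_maker import image_classifier
from tflite_model_maker.image_classifier import DataLoader

data = DataLoader.from_folder("flower_photos/")  # 1. specify your dataset
model = image_classifier.create(data)            # 2. pick a model spec and train
loss, accuracy = model.evaluate(data)            # 3. see how the model performs
model.export(export_dir=".")                     # 4. export a TF Lite model
```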

  • We've got text and image classification already

  • supported, and new use cases like object detection

  • and QA are coming soon.

  • Now, we've got an exciting development

  • in graph delegation.

  • There are multiple ways to delegate your model in TF Lite

  • through GPU, DSP, or through the NN API in Android P and up.
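For context, here is a minimal sketch of the delegate plug-in mechanism as exposed in the Python API; on Android or iOS you would use the corresponding Java or Swift delegate classes instead, and the shared-library name below is a placeholder for a delegate binary.

```python
# Hedged sketch: plugging a delegate into the TF Lite interpreter via the
# Python API. The library name is a placeholder; real delegates include the
# GPU, Hexagon, and NNAPI delegates on their respective platforms.
import tensorflow as tf

delegate = tf.lite.experimental.load_delegate("libexample_delegate.so")
interpreter = tf.lite.Interpreter(
    model_path="model.tflite",
    experimental_delegates=[delegate])  # supported ops run on the accelerator
interpreter.allocate_tensors()
```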

  • Recently, we've increased GPU performance and added DSP

  • delegation through Hexagon.

  • And we've increased the number of supported ops

  • through the NN--

  • Android NN API.

  • But you already knew all that, so what's new?

  • Well, we have a really big announcement

  • that I'm incredibly excited to share today.

  • I'm excited to announce the great news that, as of today,

  • we are launching a Core ML delegate for Apple devices

  • to accelerate floating point models in the latest iPhones

  • and iPads using TensorFlow Lite.

  • The delegate will run on iOS 11 and later.

  • But to get benefits over using TF Lite directly,

  • you want to use this delegate on devices with the Apple Neural

  • Engine.

  • The Neural Engine is dedicated hardware

  • for accelerating machine learning computations

  • on Apple's processors.

  • And it's available on devices with the A12 SoC

  • or later, such as the iPhone XS with iOS 12 or above.

  • With the Neural Engine acceleration,

  • you can get a 4x to 14x speedup

  • compared to CPU execution.

  • So that's our update on delegates.

  • We've heard from developers about the need for more

  • and better tutorials and examples,

  • so we're releasing several full example apps which

  • show not only how to use a model,

  • but also the full end-to-end code that a developer would need

  • to write to work with TF Lite.

  • They work on multiple platforms--

  • Android, iOS, Raspberry Pi, and even the Edge TPU.

  • Now, I'd like to hand over to TJ,

  • who's going to run through some more exciting improvements

  • to TF Lite and dive deeper into the engineering roadmap.

  • TJ: Awesome.

  • Thanks, Tim.

  • Hi, everyone.

  • I'm TJ, and I'm an engineer on the TensorFlow Lite team.

  • So let's talk about performance.

  • A key goal of TensorFlow Lite is to make your models run

  • as fast as possible on CPUs, GPUs, DSPs,

  • or other accelerators.

  • And we've made serious investments on all of these fronts.

  • Recently we've seen significant CPU improvements,

  • added OpenCL support for faster GPU acceleration,

  • and have full support for all Android Q NNAPI

  • ops and features.

  • Our previously announced Qualcomm DSP delegate

  • targeting low-end and mid-range devices

  • will be available for use in the coming weeks.

  • We've also made some improvements

  • in our benchmarking tooling to better assist model and app

  • developers in identifying optimal deployment

  • configurations.

  • We've even got a few new CPU performance improvements

  • since we last updated you at TensorFlow World,

  • more on that in a bit.

  • To highlight these improvements, let's take a look

  • at our performance about a year ago

  • at Google I/O and compare that with our performance today.

  • For this example, we're using MobileNet V1.

  • This is a huge reduction in latency.

  • It can be expected across a wide range of models and devices,

  • from low-end to high-end.

  • Just pull the latest version of TensorFlow Lite into your app,

  • and you'll benefit from these improvements

  • without any additional changes.

  • Digging a bit more into these numbers,

  • floating point CPU execution is the default path,

  • providing a great baseline.

  • Enabling quantization, now made easier

  • with our post-training quantization tooling,

  • provides nearly 3x faster inference.

  • Enabling GPU execution provides an even more dramatic speed up,

  • about 7x faster than our CPU baseline.

  • And for absolute peak performance,

  • we have the Pixel 4's neural core,

  • accessible via the NNAPI TensorFlow Lite delegate.

  • This kind of specialized accelerator,

  • available in more and more of the latest phones,

  • unlocks capabilities and use cases previously

  • unseen on mobile devices.

  • And we haven't stopped there.

  • Here's a quick preview of some additional CPU

  • optimizations coming soon.

  • In TensorFlow Lite 2.3 we're packing in even more

  • performance improvements.

  • Our model optimization toolkit has

  • allowed you to produce smaller quantized models for some time.

  • We've recently optimized the performance of these models

  • for our dynamic range quantization strategy.
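As a hedged sketch of what dynamic range quantization looks like with the post-training tooling; the SavedModel directory and output filename are placeholders.

```python
# Hedged sketch of post-training dynamic range quantization with the
# TF Lite converter. "saved_model/" is a placeholder SavedModel directory.
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_saved_model("saved_model/")
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # weights quantized to 8 bits
tflite_quantized_model = converter.convert()

with open("model_quant.tflite", "wb") as f:
    f.write(tflite_quantized_model)
```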

  • So check out our model optimization video

  • coming up later to learn more about quantizing

  • your TensorFlow Lite models.

  • On the floating point side, we have a new integration

  • with the XNNPACK library, available through the delegate

  • mechanism, but still running on CPU.

  • If you're adventurous, you can make

  • use of either of these new features

  • by building from source, but we're shipping them

  • in version 2.3 coming soon.

  • Last thing on the performance side--

  • profiling.

  • TensorFlow Lite now integrates with Perfetto,

  • the new standard profiler in Android 10.

  • You can look at overall TF Lite inference

  • as well as op-level events on the CPU or with GPU delegation.

  • Perfetto also supports profiling of heap allocation

  • for tracking memory issues.

  • So that's performance.

  • OK now let's talk model conversion.

  • Seamless and more robust model conversion

  • has been a major focus for the team.

  • So here's an update on our completely new TensorFlow

  • Lite conversion pipeline.

  • This new converter was built from the ground up

  • to provide more intuitive error messages when conversion fails,

  • support control flow operations, and it's

  • why we're able to deploy new NLP models like BERT,

  • and Deep Speech V2, or image segmentation models like Mask

  • R-CNN, and more.

  • The new converter is now available in beta

  • and will soon be generally available

  • as the default option.
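For reference, here is a minimal conversion sketch; the opt-in flag shown for the new conversion pipeline is an assumption based on the experimental_new_converter attribute available on the converter in TF 2.x releases of that period, and the paths are placeholders.

```python
# Hedged sketch: converting a SavedModel to TF Lite, opting into the new
# conversion pipeline while it was still in beta (it later became the default).
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_saved_model("saved_model/")
converter.experimental_new_converter = True  # assumed opt-in flag for the new converter
tflite_model = converter.convert()

with open("model.tflite", "wb") as f:
    f.write(tflite_model)
```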

  • We want to make it easy for any app developer

  • to use TensorFlow Lite.

  • To that end, we've released a number of new first-class

  • language bindings including Swift, Obj-C, C# for Unity,

  • and more.

  • Thanks to community efforts, we've

  • seen the creation of additional TensorFlow Lite language

  • bindings in Rust, Go, and Dart.

  • This is in addition to our existing C++, Java,

  • and Python bindings.

  • Our model optimization toolkit remains the one-stop shop

  • for compressing and optimizing your models.

  • And it's now easier than ever to use

  • with post-training quantization.

  • So check out the optimization talk coming up later

  • for more details.

  • Finally, I want to talk a bit about our efforts

  • in enabling ML not just for billions of phones,

  • but also for the hundreds of billions of embedded devices

  • and microcontrollers used in production globally.

  • TensorFlow Lite for Microcontrollers is that effort,

  • and it uses the same model format, converter pipeline, and kernel

  • library as TensorFlow Lite.

  • So what are these microcontrollers?

  • These are small, low-power, all-in-one computers that power

  • everyday devices all around us, like microwaves, smoke

  • detectors, toys, and sensors.

  • And with TensorFlow, it's now possible to use them

  • for machine learning.

  • You might not realize that embedded ML is already in use

  • in devices you use every day.

  • For example, hot word detection like OK Google

  • on many smartphones typically runs

  • on a small DSP, which then wakes up the rest of the phone.

  • You can now use TensorFlow Lite to run models on these devices

  • with the same tooling and frameworks.
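One practical detail worth sketching: microcontroller targets have no filesystem, so the converted .tflite flatbuffer is typically compiled into the firmware as a C array (commonly done with `xxd -i`). Below is a pure-Python equivalent of that step; file names are placeholders.

```python
# Hedged sketch: embed a converted .tflite model as a C array for a
# microcontroller build (equivalent to `xxd -i model.tflite`).
with open("model.tflite", "rb") as f:
    model_bytes = f.read()

with open("model_data.cc", "w") as f:
    f.write("const unsigned char g_model[] = {\n  ")
    f.write(", ".join(str(b) for b in model_bytes))
    f.write("\n};\n")
    f.write(f"const unsigned int g_model_len = {len(model_bytes)};\n")
```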

  • We're partnering with a number of industry leaders

  • in this area. In particular, Arm, an industry

  • leader in the embedded market, has adopted

  • TensorFlow as the official solution for AI

  • on Arm microcontrollers.

  • And together we've made optimizations

  • that significantly improve performance

  • on embedded Arm hardware.

  • Another recent update-- we've partnered with Arduino and just

  • launched the official Arduino TensorFlow library.

  • This makes it possible to start doing

  • speech detection on Arduino hardware in about five minutes.

  • You create your machine learning models using TensorFlow Lite

  • and upload them to your board using the Arduino IDE.

  • If you're curious about trying this out,

  • this library is available for download today.

  • So that's a look at where we are today.

  • Going forward, we're continuing to expand

  • the set of supported models, make

  • further improvements to performance,

  • and add more advanced features

  • like on-device training and personalization.

  • So please check out our roadmap on TensorFlow.org

  • and give us feedback.

  • We'd love to hear from you.

  • [MUSIC PLAYING]
