
  • [MUSIC PLAYING]

  • JACQUES PIENAAR: Good afternoon, everybody.

  • I am Jacques, and I'll be filling in for Tatiana today,

  • presenting on MLIR, accelerating TensorFlow with compilers.

  • Now, I don't think I need to tell anybody in this room

  • that machine learning is everywhere.

  • There's a wide range of deployments happening

  • in the industry today--

  • inference and training happening in the cloud and at the edge.

  • We also have models getting larger and larger,

  • and the computational requirements for training

  • these models ever increasing.

  • We see near-exponential growth in the complexity and size

  • and the computational requirements

  • for training these models.

  • Now, if you combine the growth in different deployment

  • strategies with the growth in models, velocity is a must.

  • We need a faster, more scalable way

  • to build infra to keep up with these bigger, more complex models

  • and deployment scenarios.

  • So we need to build these ML systems faster.

  • We want to unify efforts for extensibility and reusability,

  • while allowing customization as needed.

  • So we want to be able to standardize representation

  • of some basic concepts such as operations and types.

  • What defines an operation?

  • How do you define an operation or a type?

  • We want to create a common framework of reusable passes

  • that you can combine to create your own solutions.

  • And also, we want to make it such

  • that it's fully customizable and extensible.

  • The deployment scenarios and models of five years ago

  • differ greatly from what we have today,

  • and so we want a system that's able to scale and adapt

  • for all the future needs.

  • With that, enter MLIR. We designed

  • MLIR, which stands for multi-level intermediate

  • representation.

  • It's an intermediate representation and compiler

  • framework for TensorFlow and beyond as part

  • of the LLVM project.

  • So what is MLIR, and why do we believe

  • it's a compiler infrastructure for machine learning?

  • Well, for one, MLIR is state-of-the-art compiler technology.

  • It's not just a serialization format,

  • and there's nothing like it.

  • MLIR is modular and extensible.

  • You can build different solutions using MLIR--

  • building blocks that suit your solution.

  • Importantly, MLIR is not opinionated.

  • MLIR does not try and force you into a box.

  • It allows you to create a solution for your problem

  • space.

  • MLIR is also fully customizable.

  • These different deployment scenarios

  • need different ways of integrating the components,

  • and with MLIR, we want to make it

  • easy for all of these different deployment scenarios to work.

  • Finally, MLIR is part of the LLVM project.

  • It follows LLVM's governance and is effectively

  • on the desk of many compiler developers all around the world

  • already.

  • And the industry agrees.

  • MLIR is strongly supported by our partners.

  • Some of our partners include the largest hardware vendors

  • in the world, consisting of 95% of the data center

  • hardware, four billion mobile phones,

  • and countless IoT devices.

  • Importantly, MLIR is an open community

  • of academia and industry all working together

  • to solve this problem of compiling machine learning

  • models.

  • So that's MLIR-- what if we want to use it for TensorFlow?

  • Well, we want to use it to build a better TensorFlow.

  • We want to build better user experience, as well as better

  • pluggable hardware support.

  • Now, if you're a user, we want to make it easier

  • for you to debug your model.

  • We want to make optimizations transparent

  • and see what's going on.

  • If you have an error

  • message in your optimized model, we

  • want to be able to track it back to your original location,

  • and MLIR's location tracking enables this.

  • And, of course, we want faster performance.

  • So going from writing a model to actually

  • being able to get good performance on your hardware

  • is essential.

  • And speaking of hardware, for our hardware partners,

  • we know it's an awesome time.

  • There are so many new generations of accelerators coming up,

  • and we want

  • to make it simpler and easier to integrate with TensorFlow.

  • Because while new accelerators are great,

  • they're only really interesting when they're usable for our users.

  • And, of course, for researchers, we

  • want to provide the standard infrastructure for research.

  • So going from being able to represent

  • the different optimization passes

  • and running them in an end-to-end workflow on some production

  • models, we want to make it easy for these researchers

  • to try new approaches, see the effects, and if it works well,

  • of course, contribute it.

  • So let's take a closer

  • look at MLIR, the progressive lowering,

  • and the infrastructure around MLIR.

  • Now, you've seen this before in the TensorFlow architecture.

  • And if we zoom in a little bit, we

  • can expand the different components.

  • But let's focus on the parts where MLIR will be used.

  • So MLIR shows up in a lot of these: as I mentioned before,

  • as the graph representation and optimization format

  • for these TensorFlow models, but also particularly

  • in the compilation.

  • So for optimization and conversion

  • passes between different computing frameworks,

  • to compilation of modules, as well as actually

  • for writing AOT kernels, or generating AOT kernels,

  • or exploiting these handwritten kernels,

  • MLIR will be involved in all of these different parts.

  • So as the previous slide showed, we

  • can and will be using MLIR to do many tasks in TensorFlow

  • from graph optimizations, operation rewrites

  • and lowerings, graph transformations,

  • creating frameworks and components, to code generation.

  • So you can think of MLIR as a common graph representation

  • and legalization framework.

  • It's a common set of optimizations and conversion

  • passes, as well as a full code generation pipeline.

  • But importantly, as I mentioned, MLIR is modular,

  • so you can tailor it for your use case.

  • You can use what you need to solve your problems.

  • So for example, you can reconfigure MLIR

  • just for graph rewriting--

  • and that's, for example, how we use it for the new TensorFlow

  • to TensorFlow Lite converter--

  • just using the parts we actually need to get the final product

  • that we want.

  • So let's talk a little bit

  • about progressive lowering.

  • The ML in MLIR stands for multi-level.

  • MLIR enables you to represent multiple different levels

  • of operations, all in the same IR.

  • So from a TensorFlow operation to XLA HLO to LLVM IR

  • all can be represented in MLIR.

  • You can lower progressively from one form to another,

  • and all of these can coexist together.

  • So for example, you can have a function that

  • actually has TensorFlow ops and HLO ops and LLVM IR.

  • This ability to mix and match these different levels

  • of abstractions and dialects gives great power

  • in actually modeling the problems

  • to suit what your hardware specialization needs.
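
As a rough Python-side illustration of those levels, recent TensorFlow releases expose an experimental hook for dumping the IR that a jit-compiled function lowers to. This is a minimal sketch only; the exact API (experimental_get_compiler_ir) and the stage names ("hlo", "optimized_hlo") are assumptions that may differ across TensorFlow versions.

```python
import tensorflow as tf

# A small function that XLA can compile end to end.
@tf.function(jit_compile=True)
def scale_and_add(x, y):
    return 2.0 * x + y

x = tf.constant([1.0, 2.0, 3.0])
y = tf.constant([4.0, 5.0, 6.0])

# Experimental API (availability and stage names vary by TF version):
# returns the IR at different lowering stages for these example inputs.
get_ir = scale_and_add.experimental_get_compiler_ir(x, y)
print(get_ir(stage="hlo"))            # HLO close to the original TF ops
print(get_ir(stage="optimized_hlo"))  # HLO after XLA's optimization passes
```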

  • But what about XLA?

  • So we're using what we learned from XLA to build MLIR.

  • XLA is a great acceleration tool for models with stable tensor

  • shapes.

  • And so for example, the tf.function API in TensorFlow 2.2

  • enables great performance improvements, exploiting XLA,

  • and we've made sure that they work really well together.

  • And we are working on ensuring that there's

  • full interoperability between MLIR and XLA.
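
As a minimal sketch of that interplay from the user's side, a function with stable shapes can be opted into XLA compilation directly through tf.function. In recent TensorFlow releases the flag is jit_compile=True; older 2.x releases (around 2.2) spelled it experimental_compile=True.

```python
import tensorflow as tf

# Stable tensor shapes make this a good candidate for XLA.
@tf.function(jit_compile=True)  # older TF 2.x releases used experimental_compile=True
def dense_relu(x, w, b):
    return tf.nn.relu(tf.matmul(x, w) + b)

x = tf.random.normal([8, 128])
w = tf.random.normal([128, 64])
b = tf.zeros([64])

print(dense_relu(x, w, b).shape)  # (8, 64), compiled and executed via XLA
```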

  • And speaking of full interoperability,

  • we are working very hard to make MLIR

  • and all existing TensorFlow components all

  • interact very well.

  • So whether you want to import or export

  • from a graph, XLA HLO proto, or TF Lite Flatbuffer,

  • all of these are possible.

  • So you can mix and match your workflows with XLA.

  • Importantly, MLIR allows for open integration

  • at any level of the stack.

  • So you can start with a TensorFlow graph,

  • import it into MLIR, lower it to HLO, optimize the HLOs,

  • or go further and lower it to LLVM IR

  • and then code gen. MLIR allows you to hook

  • into any part of this compilation pipeline,

  • and in particular, MLIR does not require that you only use one,

  • so if for your problem you need a combination

  • of these ops, that's possible.

  • So this makes it very easy to incrementally enable

  • MLIR in conjunction with your existing tools.
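
For instance, here is a minimal sketch of importing a TensorFlow graph into MLIR's TF dialect and printing the textual module, using the experimental tf.mlir.experimental.convert_graph_def hook; its exact signature and availability vary between TensorFlow releases.

```python
import tensorflow as tf

@tf.function
def add_one(x):
    return x + 1.0

# Trace a concrete graph for a fixed signature and grab its GraphDef.
concrete = add_one.get_concrete_function(tf.TensorSpec([4], tf.float32))
graph_def = concrete.graph.as_graph_def()

# Experimental hook: imports the GraphDef into MLIR and returns the module
# (TF dialect) as text. Signature/availability may differ between releases.
mlir_module = tf.mlir.experimental.convert_graph_def(graph_def)
print(mlir_module)
```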

  • Now, let's look at MLIR in action.

  • So we'll just take a look at the new TF Lite converter,

  • as well as the new features provided by MLIR there.

  • Now, the new TF to TF Lite converter

  • launched just in February this year.

  • Very excited about it.

  • So starting from a TensorFlow graph model,

  • importing it to MLIR, doing all the optimizations,

  • legalizations, and then finally exporting

  • to TF Lite Flatbuffer for the TensorFlow

  • Lite runtime to execute.

  • All of these with better error messages--

  • so being able to find out what went wrong during conversions

  • and give more actionable feedback.

  • With support for TensorFlow control flow,

  • you can finally deploy some of these models

  • with control flow on the edge, and also

  • with a new unified quantization workflow.
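
A minimal converter sketch, assuming a SavedModel exported to ./my_model (a hypothetical path): the experimental_new_converter flag was the explicit opt-in for the MLIR-based converter while it was rolling out, and tf.lite.Optimize.DEFAULT opts into the standard quantization workflow.

```python
import tensorflow as tf

# Convert a SavedModel (assumed to live at ./my_model) into a TF Lite FlatBuffer.
converter = tf.lite.TFLiteConverter.from_saved_model("./my_model")

# The MLIR-based converter is the default in recent releases; this flag was
# the explicit opt-in while it was being rolled out.
converter.experimental_new_converter = True

# Opt into the unified quantization workflow.
converter.optimizations = [tf.lite.Optimize.DEFAULT]

tflite_model = converter.convert()
with open("model.tflite", "wb") as f:
    f.write(tflite_model)
```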

  • Now, looking ahead beyond the converter,

  • you'll see MLIR in action in a lot of different places

  • in TensorFlow.

  • In particular, I mentioned MLIR as being the graph

  • representation and optimization framework in TensorFlow,

  • so we'll be unifying the different graph optimization

  • infrastructure that we have, as well

  • as all the different converters using MLIR.

  • Another part that's very important for us

  • is partner integration and support for new hardware.

  • As I mentioned, new hardware is coming up every day.

  • We want to make it very easy for folks

  • to integrate with TensorFlow.

  • So especially if you're a partner,

  • please reach out to the TensorFlow team if you want to get

  • involved in this discussion.

  • And also, for code generation, we're enhancing MLIR.

  • We're looking at more advanced code generation, particularly

  • code generation with dynamic shapes.

  • And MLIR is also integrating very tightly

  • with optimization and code gen with the new TensorFlow

  • runtime.

  • So there's many different ways of getting involved.

  • Like I mentioned, MLIR is an open community.

  • We have open design meetings where everybody

  • can sign in and ask questions.

  • There's talks from the team, from other teams.

  • We have the TensorFlow MLIR special interest group.

  • And of course we have code on GitHub,

  • the LLVM repo, as well as the TensorFlow repo.

  • So feel free to send some PRs and fix some bugs

  • and add new features and get involved.

  • And with that, thank you.

  • [MUSIC PLAYING]
