[MUSIC PLAYING]

EMILY GLANZ: Hi, everyone. Thanks for joining us today. I'm Emily, a software engineer on Google's federated learning team.

DANIEL RAMAGE: And I'm Dan. I'm a research scientist and the team lead. We'll be talking today about federated learning-- machine learning on decentralized data. The goal of federated learning is to enable edge devices to do state-of-the-art machine learning without centralizing data and with privacy by default. And with privacy, what we mean is the aspiration that app developers, centralized servers, and the models themselves learn only common patterns. That's really what we mean by privacy.

In today's talk, we'll talk about decentralized data and what it means to work with decentralized data in a centralized fashion. That's what we call federated computation. We'll talk a bit about learning on decentralized data. And then we'll give you an introduction to TensorFlow Federated, which is a way that you can experiment with federated computations in simulation today. Along the way, we'll introduce a few privacy principles, like ephemeral reports, and privacy technologies, like federated model averaging, that embody those principles.

All right, let's start with decentralized data. A lot of data is born at the edge, with billions of phones and IoT devices generating data. That data can enable better products and smarter models. You saw in yesterday's keynote a lot of ways that data can be used locally at the edge, with on-device inference, such as the automatic captioning and the next-generation Assistant. On-device inference offers improvements to latency, lets things work offline, often has battery-life advantages, and can also have substantial privacy advantages, because a server doesn't need to be in the loop for every interaction you have with that locally generated data. But if you don't have a server in the loop, how do you answer analytics questions? How do you continue to improve models based on the data that those edge devices have? That's really what we'll be looking at in the context of federated learning.

The app we'll be focusing on today is Gboard, Google's mobile keyboard. People don't think much about their keyboards, but they spend hours on them each day. And typing on a mobile keyboard is 40% slower than on a physical one. It is easier to share cute stickers, though. Gboard uses machine-learned models for almost every aspect of the typing experience. Tap typing and gesture typing both depend on models, because fingers are a little bit wider than the key targets, and you can't just rely on people hitting exactly the right keystrokes. Similarly, auto-corrections and predictions are powered by learned models, as are voice-to-text and other aspects of the experience. All of these models run on device, of course, because your keyboard needs to work offline and quickly.

For the last few years, our team has been working with the Gboard team to experiment with decentralized data. Gboard aims to be the best and most privacy-forward keyboard available. And one of the ways we're aiming to do that is by making use of an on-device cache of local interactions. This would be things like touch points, typed text, context, and more. This data is used exclusively for federated learning and computation.

EMILY GLANZ: Cool. Let's jump into federated computation. Federated computation is basically a MapReduce for decentralized data with privacy-preserving aggregation built in.
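[To make that MapReduce analogy concrete, here is a minimal, purely illustrative Python sketch of one round. The names (Client, run_round, and so on) are hypothetical and nothing here reflects the production protocol: each device "maps" a broadcast query over its own local data, and the server "reduces" the per-device reports into a single aggregate, keeping only that aggregate.]

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Client:
    """A device holding its own on-device data set; raw data never leaves it."""
    local_data: List[float]

    def map(self, query: Callable[[List[float]], float]) -> float:
        # The "map" step: run the broadcast query against local data only,
        # returning a single report rather than the raw data.
        return query(self.local_data)

def run_round(clients: List[Client], query: Callable[[List[float]], float]) -> float:
    # The "reduce" step: combine per-device reports and keep only the aggregate.
    reports = [c.map(query) for c in clients]  # ephemeral reports
    return sum(reports) / len(reports)         # only the aggregate persists

# Example: the average of each device's maximum local reading.
clients = [Client([61.0, 68.0]), Client([70.0, 75.5]), Client([66.0, 72.0])]
print(run_round(clients, max))  # -> 71.83...
```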
Let's introduce some of the key concepts of federated computation using a simpler example than Gboard. So here we have our clients. This is a set of devices-- things like cell phones, sensors, et cetera. Each device has its own data. In this case, let's imagine it's the maximum temperature that device saw that day, which gets us to our first privacy technology: on-device data sets. Each device keeps its raw data local, and this comes with some obligations. Each device is responsible for local data asset management, with things like expiring old data and ensuring that the data is encrypted when it's not in use.

So how do we get the average maximum temperature experienced by our devices? Let's imagine we had a way to communicate only the average of all client data items to the server. Conceptually, we'd like to compute an aggregate over the distributed data in a secure and private way, which we'll build up to throughout this talk.

So now let's walk through an example where the engineer wants to answer a specific question of the decentralized data, like what fraction of users saw a daily high over 70 degrees Fahrenheit. The first step would be for the engineer to input this threshold to the server. Next, this threshold would be broadcast to the subset of available devices the server has chosen to participate in this round of federated computation. This threshold is then compared to the local temperature data to compute a value. This is going to be a 1 or a 0, depending on whether the temperature was greater than that threshold.

Cool. So these values would then be aggregated using an aggregation operator. In this case, it's a federated mean, which encodes a protocol for computing the average value over the participating devices. The server is responsible for collating device reports throughout the round and emitting this aggregate, which contains the answer to the engineer's question. This demonstrates our second privacy technology, federated aggregation: the server combines reports from multiple devices and persists only the aggregate. That leads into our first privacy principle, only in aggregate. Performing that federated aggregation makes only the final aggregate data, those sums and averages over the device reports, available to the engineer, without giving them access to any individual report. This ties into our second privacy principle, ephemeral reports: we don't need to keep those per-device messages after they've been aggregated, so what we collect only stays around for as long as we need it and can then be immediately discarded.

In practice, what we've just shown is one round of computation. The server will repeat this process multiple times to get a better estimate of the answer to the engineer's question. It repeats because some devices may not be available at the time of computation, or some devices may have dropped out during the round.
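[As a rough sketch of how a round like this can be written down declaratively, the snippet below uses the style of TensorFlow Federated's Federated Core (federated_broadcast, federated_map, federated_mean), which the talk introduces later. The exact decorators and type constructors vary across TFF versions, and the function names here (exceeds_threshold, get_fraction_over_threshold) are hypothetical, so treat this as an approximation rather than the production pipeline.]

```python
import tensorflow as tf
import tensorflow_federated as tff

# Runs on each participating device: compare the local reading to the
# broadcast threshold and report 1.0 or 0.0.
@tff.tf_computation(tf.float32, tf.float32)
def exceeds_threshold(reading, threshold):
  return tf.cast(reading > threshold, tf.float32)

# The round from the server's point of view: broadcast the threshold,
# map the local comparison across the clients, and release only the
# mean of the per-device reports.
@tff.federated_computation(
    tff.FederatedType(tf.float32, tff.CLIENTS),
    tff.FederatedType(tf.float32, tff.SERVER))
def get_fraction_over_threshold(readings, threshold):
  client_reports = tff.federated_map(
      exceeds_threshold, [readings, tff.federated_broadcast(threshold)])
  return tff.federated_mean(client_reports)
```

[In TFF's simulation runtime, invoking a computation like this with plain Python values, say get_fraction_over_threshold([68.0, 73.0, 75.0], 70.0), would return roughly 0.67, since two of the three simulated devices exceed the threshold.]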
DANIEL RAMAGE: So what's different between federated computation and distributed computation in the data center, with things like MapReduce? Federated computation has challenges that go beyond what we usually experience in distributed computation. Edge devices like phones tend to have limited communication bandwidth, even when they're connected to a home Wi-Fi network. They're also intermittently available, because devices will generally participate only if they are idle, charging, and on an unmetered network. And because each compute node keeps the only copy of its data, the data itself has intermittent availability. Finally, devices participate only with the user's permission, depending on an app's policies.

Another difference is that the federated setting is much more distributed than a traditional data center computation. To give you a sense of the orders of magnitude, usually in a data center, you might be looking