AMIT SHARMA: Hi, all. Welcome to this session on Causality and Machine Learning as part of Frontiers in Machine Learning. I'm Amit Sharma from Microsoft Research, and your host. Now of course I presume you would all agree that distinguishing correlation from causation is important. Even at Microsoft, for example, when we're deciding which product feature to ship or when we're making business decisions about marketing, causality is important. But in recent years, what we're also finding is that causality is important for building predictive machine learning models as well. So especially if you're interested in out-of-domain generalization and keeping your models from being brittle, you need causal reasoning to make them robust. And in fact there are interesting results about adversarial robustness and privacy where causality may play a role. This is an interesting time at the intersection of causality and machine learning. And we now have a group at Microsoft as well that is looking at these connections; I'll post a link in the chat. But for now, today I thought we can all ask this question: what are the big ideas that will drive this conversation between causality and ML forward? And I'm glad that today we have three really exciting talks. Our first talk is from Susan Athey, Economics of Technology Professor at Stanford. She'll talk about the challenges and solutions for decision-making in high-dimensional settings and how generative data modeling can help. And in fact, when I started in causality, Susan's work was one of the first I saw that was making connections between causality and machine learning, so I'm looking forward to her talk. And next we'll have Elias Bareinboim, who will be talking about the three kinds of questions we typically want to ask about data, and how two of them turn out to be causal and much harder. He'll also talk about an interesting emerging new field, causal reinforcement learning. 
And then finally we'll have Cheng Zhang from Microsoft Research Cambridge. She'll essentially give a recipe for how to build models, neural networks, that are robust to adversarial attacks. And as you've probably guessed by now, she'll use causal reasoning. At the end we'll have 20 minutes for open discussion, and all the speakers will be live for your questions. Before we start, let me tell you one quick secret: all these talks are prerecorded. So if you have any questions during a talk, feel free to ask them on the hub chat itself; our speakers are available to engage with you there even while the talk is going on. With that, I'd like to hand it over to Susan. SUSAN ATHEY: Thanks so much for having me here today in this really interesting session on machine learning and causal inference. Today I'm going to talk about the application of machine learning to the problem of consumer choice. I'm going to talk about some results from a couple of papers I've been working on that analyze how firms can use machine learning to do counterfactual inference for questions like how should I change prices, or how should I target coupons. And I'll also talk a little bit about the value of different types of data for solving that problem. Doing counterfactual inference is substantially harder than prediction. There can be many data situations where it's actually impossible to estimate counterfactual quantities. It's essential to have experimental or quasi-experimental variation in the data to separate correlation from causal effects. That is, the treatment we're studying needs to vary for reasons that are unrelated to other unobservables in the model; we need the treatment assignment to be as good as random after adjusting for other observables. We also need to customize machine learning optimization for estimating the causal effects and counterfactuals of interest, instead of for prediction. 
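The point about treatment needing to vary for reasons unrelated to unobservables can be illustrated with a small simulation (an editorial sketch, not from the talk; all numbers are made up). An unobserved demand shock raises both prices and sales, so a naive regression of sales on observed prices gets the price effect badly wrong, while randomized prices recover it:

```python
# Sketch: confounding by a demand shock in observational price data.
# (Illustrative only; parameters are arbitrary assumptions.)
import numpy as np

rng = np.random.default_rng(0)
n = 50_000
demand_shock = rng.normal(0.0, 1.0, n)   # unobserved confounder
true_price_effect = -1.0                 # causal effect of price on sales

# Observational regime: the firm raises prices when demand is high.
price_obs = 2.0 + 0.5 * demand_shock + rng.normal(0.0, 0.2, n)
sales_obs = (10.0 + true_price_effect * price_obs
             + 1.5 * demand_shock + rng.normal(0.0, 0.5, n))

# Experimental regime: prices randomized, independent of demand shocks.
price_exp = rng.uniform(1.0, 3.0, n)
sales_exp = (10.0 + true_price_effect * price_exp
             + 1.5 * demand_shock + rng.normal(0.0, 0.5, n))

naive_slope = np.polyfit(price_obs, sales_obs, 1)[0]  # biased, wrong sign
exp_slope = np.polyfit(price_exp, sales_exp, 1)[0]    # close to -1.0

print(f"naive observational estimate: {naive_slope:+.2f}")
print(f"experimental estimate:        {exp_slope:+.2f}")
```

The naive estimate here even comes out positive, suggesting higher prices increase sales, precisely because prices only rose when demand was already high; the randomized variation is what identifies the causal effect.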
And indeed, model selection and regularization need to be quite different if the goal is to get valid causal estimates. That's been a focus of research, including a lot of research I've done. A second big problem in estimating causal effects is statistical power. In general, historical observational data may not be informative about causal effects. If we're trying to understand the impact of changing prices, and prices always changed in the past in response to demand shocks, then we're not going to be able to learn what would happen if I changed the price at a time when there wasn't a demand shock; I won't have data on that from the past. I'll need to run an experiment, or I'm going to need to focus on just a few price changes, or use statistical techniques that focus my estimation on a small part of the variation in the data. Any of those things is going to lead to a situation where I don't have as much statistical power as I would like. Another problem is that effect sizes are often small. Firms are usually already optimizing pretty well. It will be surprising if making