Let's review a little bit of everything we've learned so far, and hopefully it'll make everything fit together a little bit better. Then we'll do a bunch of calculations with real numbers, and I think that'll really drive the point home.

So, first of all, let me make some columns. We could call one the concept, and then the others say whether we're dealing with a population or a sample.

The first statistical concept we came up with was the notion of the mean, and we learned that it was one way to measure the average or central tendency of a data set. The other ways were the median and the mode. But the mean tends to show up a lot more, especially when we start talking about variances and, as we'll do in this video, the standard deviation.

The mean of a population, for which we use the Greek letter mu, is equal to the sum of each of the data points in the population. That's an i. Let me make sure it looks like an i. So you're going to sum up each of those data points. You're going to start with the first one and you're going to go to the Nth one. We're assuming that there are N data points in the population. And then you divide by the total number that you have. This is like the average that you were used to taking before you learned any of the statistics stuff. You add up all the data points and you divide by the number there are.

The sample is the same thing. We just use slightly different notation. The mean of a sample, and I'll do it in a different color, we just write as x with a line on top. And that's equal to the sum of all the data points in the sample. So each of the x sub i in the sample. But we're saying the sample is something less than the population. So you still start with the first one. And then you go to lowercase n, where we assume that lowercase n is less than the big N.
If these were the same, then we'd actually be taking the mean of the entire population. And then you divide by the number of data points you added, which is n.

Then we said, OK, this gives us the central tendency. It's one measure of the central tendency. But what if we wanted to know how good an indicator this is for the population or for the sample? Or, on average, how far are the data points from this mean? And that's where we came up with the concept of variance. And I'll arbitrarily switch colors again. Variance.

In a population, the notation for variance is sigma squared. This means variance. And here's what it's equal to. You take each of the data points, you find the difference between that and the mean that you calculated up there, and you square it, so you get the squared difference. And then you essentially take the average of all of these squared distances. So you take the sum from i equals 1 to N and you divide it by N. That's the variance.

And then the variance of a sample, and this was a little bit more interesting, and we talked a little bit about it in the last video. You actually want to estimate the variance of the population when you're taking the variance of a sample. And in order to provide an unbiased estimate you do something very similar to here, but you end up dividing by n minus 1. So let me write that down. The variance of a sample, or sample variance, or unbiased sample variance if we're going to divide by n minus 1, is denoted by s squared. What you do is you take the difference between each of the data points in the sample and the sample mean. We assume that we don't know the population mean. Maybe we did. If we knew the population mean, we actually wouldn't have to do the unbiased thing that we're going to do here in the denominator.
But when you have a sample, the only way to figure out the population mean is to estimate it with the sample mean. So we assume that we only have the sample mean. And you're going to square those differences, and then you're going to sum them up from i equals 1 to i equals n, because you have n data points. And if you want an unbiased estimator, you divide by n minus 1. We talked a little bit before about why you want this to be n minus 1 instead of n. And in a couple of videos I'll actually prove this to you. First I'll show it experimentally using Excel, which wouldn't be a proof, it'll just give you a little bit of intuition, and then I'll prove it a little bit more formally later on. But you don't have to worry about it right now.

The next thing we'll learn is something that you've probably heard a lot about. Teachers in class sometimes talk about the standard deviation of a test. It's actually probably one of the most used words in statistics. I think a lot of people unfortunately use it without fully appreciating everything that it involves. But hopefully we'll appreciate all that it involves soon.

Once you know variance, the standard deviation is actually quite straightforward. It's the square root of the variance. So the standard deviation of a population is written as sigma, which is equal to the square root of the variance. And now I think you understand why variance is written as sigma squared. And that is equal to just the square root of all that. I'll probably run out of space. So it's the square root of the sum, and I won't write the limits at the top or the bottom, that makes it messy, of x sub i minus mu squared, everything over N.
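For reference, the formulas described so far can be written out as below. This is only a sketch of the notation: N is the population size and n the sample size, as in the discussion above, and x_i stands for the i-th data point.

```latex
\begin{align*}
\mu &= \frac{\sum_{i=1}^{N} x_i}{N}
  & \bar{x} &= \frac{\sum_{i=1}^{n} x_i}{n} \\
\sigma^2 &= \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}
  & s^2 &= \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} \\
\sigma &= \sqrt{\sigma^2} = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}
\end{align*}
```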
And then if you wanted the standard deviation of a sample, it actually gets a little bit interesting. The standard deviation of a sample, s, is equal to the square root of the variance of a sample. It turns out that s is not an unbiased estimator of the population standard deviation, sigma, and I don't want to get too technical right now. The sample variance, s squared, is actually a very good estimate of the population variance. The expected value of s squared is sigma squared. And I'll go into more depth on expected values in the future. But it turns out that the expected value of s is not quite sigma. But you don't have to worry about that for now.

So why even talk about the standard deviation? Well, one, the units work out a little better. Let's say all of our data points were measured in meters, right? If we were taking a bunch of measurements of length, then the units of the variance would be meters squared, right? Because we're taking meters minus meters. That would be in meters. Then you're squaring, so you're getting meters squared. And that's kind of a strange concept, to say that the average dispersion from the center is in square meters. But when you take the square root of it, you get something that's again in meters. So you're kind of saying, oh well, the standard deviation is x or y meters. And then we'll learn a little bit later that if you can actually model your data as a bell curve, or if you assume that your data has a bell curve distribution, then the standard deviation tells you some interesting things about the probability of finding someone within one or two standard deviations of the mean.

But anyway, I don't want to get too technical right now. Let's just calculate a bunch. Let's see, say I had the numbers 1, 2, 3, 8, and 7. And let's say that this is a population. So what would its mean be? So I have 1 plus 2 plus 3. So that's 3 plus 3 is 6. 6 plus 8 is 14.
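The same calculation can be sketched in Python, assuming the five numbers are the whole population as stated. The variable names here are illustrative and not from the video. The running totals mirror the step-by-step addition, and the rest applies the population formulas for mean, variance, and standard deviation.

```python
from itertools import accumulate
from math import sqrt

# The five numbers from the example, treated as an entire population.
population = [1, 2, 3, 8, 7]

# Running totals, mirroring the step-by-step addition: 1, 3, 6, 14, 21.
running_totals = list(accumulate(population))

N = len(population)
mu = running_totals[-1] / N  # population mean: total divided by N

# Population variance: average squared distance from the mean (divide by N).
squared_diffs = [(x - mu) ** 2 for x in population]
pop_variance = sum(squared_diffs) / N

# Population standard deviation: square root of the variance,
# which puts the result back in the same units as the data.
pop_std = sqrt(pop_variance)

# If these numbers were only a sample, the sample mean would be the same
# value, but the unbiased sample variance would divide by n - 1 instead.
sample_variance = sum(squared_diffs) / (N - 1)

print("running totals:", running_totals)
print("mean:", mu)
print("population variance:", pop_variance)
print("population standard deviation:", pop_std)
print("sample variance (n - 1):", sample_variance)
```

Python's standard `statistics` module offers `pvariance`, `pstdev`, and `variance` for the same quantities. The hand-written version is used here only because it follows the formulas term by term.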