  • Let's review a little bit of everything we learned so far

  • and hopefully it'll make everything fit together

  • a little bit better.

  • Then we'll do a bunch of calculations with real numbers

  • and I think it'll really hit the point home.

  • So, first of all if we're dealing with a-- let me

  • actually write down, let me make some columns.

  • So if we're dealing with-- let's see, we could call it the

  • concept and then we'll call it whether we're dealing with

  • a population or a sample.

  • So the first statistical concept we came up with was the

  • notion of the mean or the central tendency and we learned

  • that it was one way to measure the average or central

  • tendency of a data set.

  • The other ways were the median and the mode.

  • But the mean tends to show up a lot more, especially when we

  • start talking about variances and, as we'll do in this video,

  • the standard deviation.

  • But the mean of a population, we learned-- we use the Greek

  • letter mu-- is equal to the sum of each of the data points

  • in the population.

  • That's an i.

  • Let me make sure it looks like an I.

  • So you're going to sum up each of those data points.

  • You're going to start with the first one and you're going

  • to go to the Nth one.

  • We're assuming that there are N data points in the population.

  • And then you divide by the total number that you have.

  • And this is like the average that you're used to taking

  • before you learned any of the statistics stuff.

  • You add up all the data points and you divide by

  • the number there are.

  • The sample is the same thing.

  • We just use a slightly different terminology.

  • The mean of a sample-- and I'll do it in a different

  • color-- we just write it as x with a line on top.

  • And that's equal to the sum of all the data

  • points in the sample.

  • So each of the xi in the sample.

  • But we're assuming the sample is something

  • less than a population.

  • So you start with the first one still.

  • And then you go to the lowercase n where we assume that

  • lowercase n is less than the big N.

  • If this was the same thing then we're actually taking the

  • average or we're taking the mean of the entire population.

  • And then you divide by the number of data

  • points you added.

  • You get to n.
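
  • For reference, the two means described so far can be written out as below. This is only a sketch in LaTeX notation; using capital N for the population size and lowercase n for the sample size is an assumption based on the "big N" remark.

```latex
\mu = \frac{1}{N}\sum_{i=1}^{N} x_i
\qquad
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \quad n < N
```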

  • Then we said OK, how far-- this gives us the central tendency.

  • It's one measure of the central tendency.

  • But what if we wanted to know how good of an indicator this

  • is for the population or for the sample?

  • Or, on average, how far are the data points from this mean?

  • And that's where we came up with the concept of variance.

  • And I'll arbitrarily switch colors again.

  • Variance.

  • And in a population the variable or the notation for

  • variance is the sigma squared.

  • This means variance.

  • And that is equal to-- you take each of the data points.

  • You find the difference between that and the mean that

  • you calculate up there.

  • You square it so you get the squared difference.

  • And then you essentially take the average of all of these.

  • You take the average of all of these squared distances.

  • So that's-- so you take the sum from i is equal to 1 to

  • N and you divide it by N.

  • That's the variance.
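
  • Written out, the population variance just described looks roughly like this. The capital N for the number of data points in the population is an assumed notation, chosen to match the "big N" mentioned earlier.

```latex
\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2
```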

  • And then the variance of a sample-- and this was a

  • little bit more interesting and we talked a little bit

  • about it in the last video.

  • You actually want to provide a-- you want to estimate the

  • variance of the population when you're taking the

  • variance of a sample.

  • And in order to provide an unbiased estimate you do

  • something very similar to here but you end up

  • dividing by n minus 1.

  • So let me write that down.

  • So the variance of a population-- I'm sorry, the

  • variance of a sample, or sample variance, or unbiased sample

  • variance, since we're going to divide by n minus 1.

  • That's denoted by s squared.

  • What you do is you take the difference between each of the

  • data points in the sample minus the sample mean.

  • We assume that we don't know the population mean.

  • Maybe we did.

  • If we knew the population mean we actually wouldn't have to do

  • the unbiased thing that we're going to do here in

  • the denominator.

  • But when you have a sample the only way to kind of figure out

  • the population mean is to estimate it with the sample mean.

  • So we assume that we only have the sample mean.

  • And you're going to square those and then you're going to

  • sum them up from i is equal to 1 to i is equal to n because

  • you have n data points.

  • And if you want an unbiased estimator you divide

  • by n minus 1.
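
  • As a sketch of the formula being described, the unbiased sample variance can be written as below, with x-bar standing for the sample mean and lowercase n for the number of data points in the sample.

```latex
s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2
```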

  • And we talked a little bit before about why you want this to be

  • an n minus 1 instead of an n.

  • And actually in a couple of videos I'll actually

  • prove this to you.

  • One, I'll prove it maybe experimentally using Excel and

  • then I'll-- which wouldn't be a proof, it'll just give you a

  • little bit of intuition-- and then I'll actually prove

  • it a little bit more formally later on.
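
  • To get a feel for the n minus 1 idea before any proof, here is a minimal simulation sketch, written in Python rather than the Excel demonstration mentioned above. Everything in it is an assumption made for illustration: the five-number population (the same numbers used in the worked example later in this video), the sample size of 5, sampling with replacement, the 20,000 trials, and the helper names `sum_sq_dev` and `average_estimates`. It is not a proof, just an experiment.

```python
import random


def sum_sq_dev(sample):
    """Sum of squared deviations from the sample's own mean."""
    mean = sum(sample) / len(sample)
    return sum((x - mean) ** 2 for x in sample)


def average_estimates(population, n, trials, seed=0):
    """Average the divide-by-n and divide-by-(n-1) variance estimates
    over many random samples drawn with replacement."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total_biased = 0.0
    total_unbiased = 0.0
    for _ in range(trials):
        sample = [rng.choice(population) for _ in range(n)]
        ss = sum_sq_dev(sample)
        total_biased += ss / n          # divide by n
        total_unbiased += ss / (n - 1)  # divide by n - 1
    return total_biased / trials, total_unbiased / trials


# Illustrative population; its true variance (dividing by 5) is 7.76.
population = [1, 2, 3, 8, 7]
biased_avg, unbiased_avg = average_estimates(population, n=5, trials=20000)
print(biased_avg, unbiased_avg)
```

  • In a run like this, the divide-by-(n-1) average should land near the true variance of 7.76, while the divide-by-n average should come out noticeably lower.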

  • But you don't have to worry about it right now.

  • The next thing we'll learn is something that you've probably

  • heard a lot of, especially sometimes in class, teachers

  • talk about the standard deviation of a test or-- it's

  • actually probably one of the most used words in statistics.

  • I think a lot of people unfortunately maybe use it

  • without fully appreciating everything

  • that it involves.

  • But the goal is that we'll eventually, hopefully, appreciate

  • all that it involves soon.

  • But the standard deviation-- and once you know variance it's

  • actually quite straightforward.

  • It's the square root of the variance.

  • So the standard deviation of a population is written as sigma

  • which is equal to the square root of the variance.

  • And now I think you understand why a variance is written

  • as sigma squared.

  • And that is equal to just the square root of all that.

  • It's equal to the square root-- I'll probably run out of

  • space-- of all of that.

  • So the sum-- I won't write at the top or the bottom, that

  • makes it messy-- of xi minus mu squared, everything over N.

  • And then if you wanted the standard deviation of a

  • sample-- and it actually gets a little bit interesting because

  • the standard deviation of a sample, s, which is equal to the

  • square root of the variance of a sample-- it actually turns

  • out that s is not an unbiased estimator for sigma--

  • and I don't want to get too technical about it right now--

  • even though s squared is actually a very good estimate of sigma squared.

  • The expected value of s squared is going to be sigma squared.

  • And I'll go into more depth on expected values in the future.

  • But it turns out that the expected value of s is not quite

  • the same as sigma.

  • But you don't have to worry about it for now.
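
  • Putting the two standard deviations side by side, as a sketch in LaTeX: capital N for the population size is an assumed notation, and s here is simply the square root of the n minus 1 sample variance.

```latex
\sigma = \sqrt{\sigma^2} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2}
\qquad
s = \sqrt{s^2} = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2}
```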

  • So why even talk about the standard deviation?

  • Well, one, the units work out a little better.

  • Let's say all of our data points were measured

  • in meters, right?

  • If we were taking a bunch of measurements of length then

  • the units of the variance would be meters squared,

  • right?

  • Because we're taking meters minus meters.

  • This would be a meter.

  • Then you're squaring.

  • You're getting meters squared.

  • And that's kind of a strange concept if you say, you know, the

  • average dispersion from the center is in meters squared.

  • Well, when you take the square root of it you get

  • something that's again in meters.

  • So you're kind of saying, oh well the standard deviation

  • is x or y meters.

  • And then we'll learn in a little bit that if you can actually

  • model your data as a bell curve, or if you assume that your data

  • has the distribution of a bell curve, then this tells you some

  • interesting things about the probability of

  • finding someone within one or two standard deviations

  • of the mean.

  • But anyway, I don't want to get too technical right now.

  • Let's just calculate a bunch.

  • Let's calculate.

  • Let's see, if I had numbers 1, 2, 3, 8, and 7.

  • And let's say that this is a population.

  • So what would its mean be?

  • So I have 1 plus 2 plus 3.

  • So it's 3 plus 3 is 6.

  • 6 plus 8 is 14.

  • 14 plus 7 is 21.

  • So the mean of this population-- you sum up

  • all the data points.

  • You get 21 divided by the total number of data

  • points, 1, 2, 3, 4, 5.

  • 21 divided by 5 which is equal to what?

  • 4.2.
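
  • The same arithmetic can be checked with a couple of lines of Python. This is just a sketch; the variable names `data`, `total`, and `mean` are made up for illustration.

```python
data = [1, 2, 3, 8, 7]      # the five numbers, treated as a population

total = sum(data)           # 1 + 2 + 3 + 8 + 7
mean = total / len(data)    # divide by the number of data points
print(total, mean)          # prints: 21 4.2
```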

  • Fair enough.

  • Now we want to figure out the variance.

  • And we're assuming that this is the entire population.

  • So the variance of this population is going to be equal

  • to the sum of the squared differences of each of

  • these numbers from 4.2.

  • I'm going to have to get my calculator out.

  • So it's going to be 1 minus 4.2 squared plus 2 minus 4.2

  • squared plus 3 minus 4.2 squared plus 8 minus

  • 4.2 squared plus 7 minus 4.2 squared.

  • And it's going to be all of that-- I know it looks a little

  • bit funny-- divided by the number of data points we

  • have-- divided by 5.

  • So let me take the calculator out.

  • All right.

  • Here we go.

  • Actually maybe I should have used the graphing

  • calculator that I have.

  • Let me see if I can get this thing-- if I could get this.

  • There you go.

  • Yeah, I think the graphing one will be better because

  • I can see everything that I'm writing.

  • OK, so let me clear this.

  • So I want to take 1 minus 4.2 squared plus 2 minus 4.2

  • squared plus 3 minus 4.2 squared plus 8 minus 4.2

  • squared, where I'm just taking the sum of the squared

  • distances from the mean, one more, plus

  • 7 minus 4.2 squared.

  • So that's the sum.

  • The sum is 38.8.

  • So the numerator is going to be equal to 38.8 divided by 5.

  • So this is the sum of the squared distances, right?

  • Each of these-- just so you can relate to the formula-- each

  • of that is xi minus the mean squared.

  • And so if we take the sum of all of them-- this numerator is

  • the sum of each of the xi minus the mean squared from

  • i equals 1 to n.

  • And that ended up to be 38.8.

  • And I just calculated it like that.

  • I just took each of the data points minus the mean,

  • squared, added them all up, and I got 38.8.

  • And I went and divided by n which is 5.

  • So this n up here is actually also 5.

  • Right?

  • And so 38.8 divided by 5 is 7.76.

  • So the variance-- let me scroll down a little bit-- the

  • variance is equal to 7.76.
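
  • For anyone following along without a calculator, here is a small Python sketch of the same steps: subtract the mean from each data point, square, add up, and divide by 5. The variable names are hypothetical, and the values are rounded only for display, since floating-point arithmetic can leave tiny errors in the last digits.

```python
data = [1, 2, 3, 8, 7]                    # treated as the whole population
mean = sum(data) / len(data)              # 4.2

# Squared distance of each data point from the mean.
squared_diffs = [(x - mean) ** 2 for x in data]

sum_sq = sum(squared_diffs)               # the numerator, about 38.8
population_variance = sum_sq / len(data)  # divide by the 5 data points

print(round(sum_sq, 2), round(population_variance, 2))
```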

  • Now if this was a sample of a larger distribution, if this

  • was a sample-- if the 1, 2, 3, 8, and 7, weren't the

  • population-- if it was a sample from a larger population,

  • instead of dividing by 5 we would have divided by 4.

  • And we would have gotten the variance as 38.8 divided by n

  • minus 1, which is divided by 4.

  • So then we would have gotten the variance-- we would have

  • gotten the sample variance of 9.7 if we divided by n

  • minus 1 instead of n.

  • But anyway, don't worry about that right now.

  • That's just a change of n.
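
  • The only difference between the two variances is the denominator, which a short Python sketch makes easy to see. The hand-rolled variables are hypothetical names; `statistics.pvariance` and `statistics.variance` are the standard-library functions for the divide-by-n and divide-by-(n-1) versions, used here as a cross-check.

```python
import statistics

data = [1, 2, 3, 8, 7]
n = len(data)
mean = sum(data) / n
sum_sq = sum((x - mean) ** 2 for x in data)  # about 38.8

population_variance = sum_sq / n        # divide by n: about 7.76
sample_variance = sum_sq / (n - 1)      # divide by n - 1: about 9.7

# Cross-check against the standard library's two conventions.
lib_population = statistics.pvariance(data)
lib_sample = statistics.variance(data)

print(round(population_variance, 2), round(sample_variance, 2))
```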

  • But once you have the variance, it's very easy to figure

  • out the standard deviation.

  • You just take the square root of it.

  • The square root of 7.76 is about 2.786.

  • Let's say 2.79 is the standard deviation.

  • So this gives us some measure of, on average, how far

  • the numbers are away from the mean which was 4.2.

  • And it gives it in kind of the units of the

  • original measurement.
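
  • As a quick check of that square root, here is a Python sketch using the standard library. It assumes the five numbers are the whole population, so it uses `statistics.pvariance`, which divides by n.

```python
import math
import statistics

data = [1, 2, 3, 8, 7]

population_variance = statistics.pvariance(data)  # divides by n
population_sd = math.sqrt(population_variance)    # square root of the variance

print(round(population_sd, 2))
```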

  • Anyway, I'm all out of time.

  • I'll see you in the next video.

  • Or actually, let's figure out-- we said if this was a sample,

  • if those numbers were a sample and not the population, then

  • we figured out that the sample variance was 9.7.

  • And so then the sample standard deviation is just going to

  • be the square root of that.
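
  • That last square root can be sketched in Python as well. This assumes the five numbers are a sample, so it uses `statistics.variance`, which divides by n minus 1. The result should come out at roughly 3.11.

```python
import math
import statistics

data = [1, 2, 3, 8, 7]

sample_variance = statistics.variance(data)  # divides by n - 1, about 9.7
sample_sd = math.sqrt(sample_variance)       # square root of the sample variance

print(round(sample_sd, 2))
```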