Name: The Normal Distribution: Crash Course Statistics #19
Uploaded: 2020-03-30T04:33:42.000Z
Duration: 11 min 27 s

Hi, I'm Adriene Hill, and welcome back to Crash Course Statistics.

This is the episode you've been waiting for. The episode we designed this shelf for. The episode that

you have heard a lot about. (NORMAL DIST MONTAGE)

Well, today, we'll get to see why we talk SO MUCH about the normal distribution.

Things like height, IQ, standardized test scores, and a lot of mechanically generated

things like the weight of cereal boxes are normally distributed, but many other interesting

things from blood pressure, to debt, to fuel efficiency just aren't.

One reason we talk so much about normal distributions is because distributions of means are normally distributed.

The normal distribution is symmetric, which means its mean, median and mode are all the

same value. And its most popular values are in the middle, with skinny tails to either side.

In general, when we ask scientific questions, we're not comparing individual scores or

values like the weight of one blue jay, or the number of kills from one League of Legends game,

we're comparing groups--or samples--of them. So we're often concerned with the

distributions of the means, not the population.

In order to meaningfully compare whether two means are different, we need to know something

about their distribution: the sampling distribution of sample means. Also called the sampling

And before we go any further, I want to say that the distribution of sample means is not

something we create, we don't actually draw an infinite number of samples to plot and

observe their means. This distribution, like most distributions, is a description of a process.

Take income. Income is skewed... so we might think the distribution of all possible mean

incomes would also be skewed. But they're actually normally distributed.

In the real population there are people that make a huge amount of money. Think Oprah,

Jeff Bezos, and Bill Gates. But when we take the mean of a group of three randomly selected

people, it becomes much less likely to see extreme mean incomes because in order to have

an income that's as high as Oprah's, you'd need to randomly select 3 people with pretty high incomes.

Since scientific questions usually ask us to compare groups rather than individuals,

this makes our lives a lot easier, because instead of an infinite amount of different

distributions to keep track of, we can just keep track of one: the normal distribution.

The reason that sampling distributions are almost always normal is laid out in the Central

Limit Theorem. The Central Limit Theorem states that the distribution of sample means for

an independent random variable will get closer and closer to a normal distribution

as the size of the sample gets bigger and bigger, even if the original population distribution isn't normal.

As we get further into inferential statistics and making models to describe our data, this

will become more useful. Many inferential techniques in statistics rely on the assumption

that the distribution of sample means is normal, and the Central Limit Theorem allows us to

Let's look at a simulation of the Central Limit Theorem in action.

For our first example, imagine a discrete, uniform distribution. Like dice rolls. The

distribution of values for a single die roll looks like this:

With a sample size of 1--the regular distribution of dice values--there's one way to get a

1, one way to get a 2, one way to get a 3... and so on.

But we want to look at the mean of, say, 2 dice rolls, meaning our sample size is 2.

With two dice, let's first look at all the sums of the dice rolls we can get:

There's only one way each to get 2 or 12, either two 1's or two 6's, but there are 6 ways

to get 7: [1,6], [2,5], [3,4], [6,1], [5,2], and [4,3]... which lends significance to the

number 7, which is the sum you'll roll most often.
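
If you want to check that count yourself, here's a minimal sketch in Python that lists every pair of rolls and tallies the sums. The names rolls and ways_to_sum are just illustrative, and it assumes two fair six-sided dice.

```python
from collections import Counter
from itertools import product

# All 36 equally likely (first die, second die) outcomes.
rolls = list(product(range(1, 7), repeat=2))

# Count how many of those outcomes produce each sum from 2 to 12.
ways_to_sum = Counter(a + b for a, b in rolls)

# Print a rough text histogram: one '#' per way of rolling that sum.
for total in sorted(ways_to_sum):
    print(f"{total:>2}: {'#' * ways_to_sum[total]} ({ways_to_sum[total]})")
```

The row for 7 is the longest, and the rows for 2 and 12 each have a single mark.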

But back to means, we have the possible sums, but we want the mean, so we'll divide each

total value by two, giving us this distribution:

Even though our population distribution is uniform, the distribution of sample means

is looking more normal, even with a sample size of 2. As our sample size gets bigger

and bigger, the middle values get more common, and the tail values are less and less common.
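
To put rough numbers on that, here's a sketch that works out the exact distribution of the mean for a few small sample sizes by brute-force enumeration. The cutoffs for "middle" (a mean between 3 and 4) and "tails" (a mean of 1.5 or less, or 5.5 or more) are assumptions chosen for illustration, as are the function names.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def mean_distribution(n_dice):
    """Exact probability of each possible mean when rolling n_dice fair dice."""
    outcomes = product(range(1, 7), repeat=n_dice)
    counts = Counter(Fraction(sum(o), n_dice) for o in outcomes)
    total = 6 ** n_dice
    return {m: Fraction(c, total) for m, c in sorted(counts.items())}

def middle_and_tails(n_dice):
    """Probability of a mean in the (assumed) middle band and in the tails."""
    dist = mean_distribution(n_dice)
    middle = sum(p for m, p in dist.items() if 3 <= m <= 4)
    tails = sum(p for m, p in dist.items()
                if m <= Fraction(3, 2) or m >= Fraction(11, 2))
    return middle, tails

for n in (1, 2, 4):
    middle, tails = middle_and_tails(n)
    print(f"sample size {n}: middle {float(middle):.3f}, tails {float(tails):.3f}")
```

Enumeration only scales to small sample sizes, since the number of outcomes is 6 to the power n, but it's enough to see the middle filling in while the tails thin out.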

We can use the multiplication rule from probability to see why that happens.

If you roll a die one time, the probability of getting a 1--the lowest value--is ⅙.

When you increase the number of rolls to two, the probability of getting a mean of 1 is

now 1/36, or ⅙ times ⅙, since you have to get two 1's to have a mean of 1.

Getting a mean value of 2 is a little bit easier since you can have a mean roll of 2

both by rolling two 2's, but also by rolling a 3 and a 1, or a 1 and a 3. So the probability is 3/36, or 1/12.

If we had the patience to roll a die 20 times, the probability of getting a mean roll value

of 1 would be (⅙)^20 since the only way to get a mean of 1 on 20 dice rolls is to

roll a one. Every. Single. Time. So you can see that even with a sample size of only 20,

the means of our dice rolls will look pretty close to normal.
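
Here's the multiplication rule from that argument as a small sketch, using exact fractions so nothing gets rounded away. The function name prob_all_ones is made up for this example, and it assumes the rolls are independent and the die is fair.

```python
from fractions import Fraction

def prob_all_ones(n_rolls):
    """Probability that n_rolls independent fair-die rolls all come up 1."""
    # Multiplication rule: multiply 1/6 by itself once per roll.
    return Fraction(1, 6) ** n_rolls

for n in (1, 2, 20):
    p = prob_all_ones(n)
    print(f"n = {n:>2}: P(mean = 1) = {p} (about {float(p):.3g})")
```

The same calculation applies to a mean of 6, since that also requires one specific face on every roll.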

The mean of the distribution of sample means is 3.5, the same as the mean of our original

uniform distribution of dice rolls, and this is always true about sampling distributions:

Their mean is always the same as the mean of the population they're derived from. So with large samples,

the sample means will be a pretty good estimate of the true population mean.
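
One way to see this for yourself is a quick simulation. This is only a sketch: the seed, the 10,000 repetitions, and the helper name sample_mean are arbitrary choices, and simulated results will wobble a little around the true value.

```python
import random
from statistics import mean

random.seed(19)  # arbitrary seed so the run is repeatable

def sample_mean(sample_size):
    """Mean of sample_size simulated fair-die rolls."""
    return mean(random.randint(1, 6) for _ in range(sample_size))

population_mean = mean(range(1, 7))  # 3.5 for a fair six-sided die

# Draw many samples of size 20 and average their means.
sample_means = [sample_mean(20) for _ in range(10_000)]
center = mean(sample_means)

print(f"population mean: {population_mean}")
print(f"mean of 10,000 sample means (n = 20): {center:.3f}")
```

The two printed numbers should land very close together, which is the point: the sampling distribution is centered on the population mean.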

There are two separate distributions we're talking about.

There is the original population distribution that's generating each individual die roll,

and there is a distribution of sample means that tells you the frequency of all the possible

sample means you could get by drawing a sample of a certain size--here 20--from that original

population distribution. Again, population distribution. And sampling distribution of sample means.

But while the mean of the distribution of sample

means is the same as the population's, its standard deviation is not, which might be

intuitive since we saw how larger sample sizes render extreme values--like a mean roll value

of 1 or 6--very unlikely, while making values close to the mean more and more likely.

And it doesn't just work for uniform population distributions. Normal population distributions

also give normal distributions of sample means, as do skewed distributions, and this weird looking guy:

In fact, with a large sample, any distribution with finite variance will have a distribution

of sample means that is approximately normal.
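
Here's a rough sketch of that idea with a skewed population. It uses an exponential distribution as a stand-in for something right-skewed (that choice, the seed, and the sample sizes are all assumptions for illustration) and measures how lopsided the sample means are using skewness, which is near 0 for a symmetric shape like the normal.

```python
import random

random.seed(19)  # arbitrary seed so the run is repeatable

def mean(values):
    return sum(values) / len(values)

def skewness(values):
    """Near 0 for symmetric data, positive for a long right tail."""
    m = mean(values)
    sd = mean([(v - m) ** 2 for v in values]) ** 0.5
    return mean([((v - m) / sd) ** 3 for v in values])

def sample_means(sample_size, n_samples=4000):
    """Means of n_samples samples from a right-skewed exponential population."""
    return [mean([random.expovariate(1.0) for _ in range(sample_size)])
            for _ in range(n_samples)]

skew_by_size = {n: skewness(sample_means(n)) for n in (1, 5, 30, 100)}
for n, skew in skew_by_size.items():
    print(f"sample size {n:>3}: skewness of sample means = {skew:.2f}")
```

With a sample size of 1 you're just looking at the skewed population itself. As the sample size grows, the skewness of the sample means should shrink toward 0.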

This is incredibly useful. We can use the nice, symmetric and mathematically pleasant

normal distribution to calculate things like percentiles, as well as how weird or rare

a difference between two sample means actually is.

The standard deviation of a distribution of sample means is still related to the original

standard deviation. But as we saw, the bigger the sample size, the closer your sample means

are to the true population mean, so we need to adjust the original population standard

deviation somehow to reflect this. The way we do it mathematically is to divide

by the square root of n--our sample size.

Since we divide by the square root of n, as n gets big, the standard deviation--or sigma--gets

smaller, which we can see in these simulations of sampling distributions of size 20, 50,

and 100. The larger the sample size, the skinnier the distribution of sample means.
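
As a sketch of that relationship, this compares sigma divided by the square root of n against the spread of simulated sample means, using dice rolls as the population. The seed, the 3,000 repetitions, and the helper name simulated_sd_of_means are arbitrary choices for the example.

```python
import math
import random

random.seed(19)  # arbitrary seed so the run is repeatable

faces = range(1, 7)
pop_mean = sum(faces) / 6
# Population standard deviation of one fair die roll, about 1.71.
pop_sd = math.sqrt(sum((f - pop_mean) ** 2 for f in faces) / 6)

def simulated_sd_of_means(sample_size, n_samples=3000):
    """Standard deviation of n_samples simulated sample means."""
    means = [sum(random.randint(1, 6) for _ in range(sample_size)) / sample_size
             for _ in range(n_samples)]
    center = sum(means) / n_samples
    return math.sqrt(sum((m - center) ** 2 for m in means) / n_samples)

results = {}
for n in (20, 50, 100):
    predicted = pop_sd / math.sqrt(n)  # sigma divided by the square root of n
    results[n] = (predicted, simulated_sd_of_means(n))
    print(f"n = {n:>3}: predicted {predicted:.3f}, simulated {results[n][1]:.3f}")
```

The predicted and simulated values should agree closely, and both shrink as the sample size goes from 20 to 50 to 100.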

For example, say you grab 5 boxes of strawberries at your local grocery store--you're making

the pies for a pie eating contest--and weigh them when you get home. The mean weight of

a box of strawberries from your grocery store is 15 oz.

But that means that you don't have quite enough strawberries. You thought that the