Name: Chi-squared Test
Uploaded: 2013-03-25T05:10:15.000Z
Duration: 11 min 53 s

Hi. It's Mr. Andersen and welcome to my podcast on the Chi-squared test. Chi-squared

test if you look at the equation lots of students get scared right away. It's really simple

once you figure it out. So don't be scared away, but Chi-squared test especially in AP

biology, especially in science is very important. And it's a way to compare when you collect

data, is the variation in your data just due to chance or is it due to one of the variables

that you're actually testing. And so the first thing you should figure out is what are the,

So the first one, this right here stands for Chi-squared. And so this was developed way

in the early part of the 1900s by Karl Pearson. Pearson's Chi-squared test. So, what is this

then? That is going to be a sum. So we're going to add up a number of values in a Chi-squared

test. What does the O stand for? Well that's going to be for the data you actually collect.

And so we call that observed data. And then the E values are going to be the expected

values. And so if you're ever doing an experiment, you can actually figure out your expected

values before you start. And then you just simply compare them to your observed values.
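
Written out, the equation being described here (add up, for each outcome, observed minus expected, squared, divided by expected) looks roughly like this. The index i over the outcome categories is notation added here for clarity and is not spoken in the video:

```latex
\chi^2 = \sum_{i} \frac{(O_i - E_i)^2}{E_i}
```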

Let me give you an example of that with these coins over here.

Let's say I flip a coin 100 times. And I get

62 heads and I get 38 tails. Well is that due to just chance? Or is there something

wrong with the coin? Or the way that I'm flipping the coin? And so the Chi-squared test allows

us to actually answer that. And so what I'm thinking in my head is something called a

Null Hypothesis. And so if we're flipping a coin 100 times. And I think I said 62 heads

and 38 tails. Well that would be the observed value that we get in an experiment. But there'd

also be expected values because you know it should be 50 heads and 50 tails. And so you

use something called a null hypothesis in this case, where you're saying there's no

statistically significant difference between the observed values and the expected values,

between what we expect to get and what we actually find.

And so it's cool, Chi-squared, because we

can actually measure our data, or look at our data and see is there a statistical difference

between those two. The best way to get good at Chi-squared is actually to do some problems.

Before we get to that there's two terms that I have to define. One is degrees of freedom

and then one is critical values. And so the whole point of a Chi-squared test is either

to accept or reject our null hypothesis. And so you either exceed or don't exceed

your critical value. But first of all we have to figure out where that number is in this

First thing is something called degrees of freedom. So since we're comparing outcomes,

you have to have at least two outcomes in your experiment. So in this case if we have

heads and tails, we have two outcomes that we could get, so we'll say that's 2. And then

we simply subtract the number 1 from that to get the degrees of freedom. And so in this

case we have two outcomes minus 1 and so we would have 1 degree of freedom. Now you might

think to yourself why isn't there a zero on this chart? Well, if you just have one outcome

you have nothing to compare it to. So that's an easy way to think about that. So we figured

out that there is one degree of freedom in this case. The next thing you're looking at

is for a critical value. And the critical value that we'll always use in the class is

the 0.05 value. And so that's going to be this column right here. So the first thing

you do is find the 0.05 value and you don't worry about all of the other numbers. So that's

3.841 is something I just know because it means that I'm in the right chart or I'm in

A way that I explain this to kids is that you can think of that as being 95% sure that

you're either accepting or rejecting your null hypothesis. And you can see that our

critical values get higher over here. So you can think as we move this way, if we really

want to be sure we'd have to exceed a higher critical value. So what's our null hypothesis

again? The null hypothesis is that there's no statistical difference between observed and expected, and so we either

accept or reject that hypothesis. So in this case our critical value would be 3.841. And so

when you calculate Chi-squared, if you get a number that is higher than 3.841 then you

reject that null hypothesis. And so there actually is something aside from just chance

that is causing you to get more heads than tails. And if you don't exceed the critical

value then you accept that null hypothesis. And this is usually what ends up happening,

unless you have a variable that's impacting your results. Let's apply this in a couple of problems.
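
The two definitions above (degrees of freedom and the critical value) can be sketched in a few lines of Python. This is only an illustration: the function names `degrees_of_freedom` and `decide` are made up here, and the only critical value included is the 3.841 quoted in the video for 1 degree of freedom at the 0.05 level.

```python
# Critical value quoted in the video: 1 degree of freedom, 0.05 column.
CRITICAL_VALUE_DF1_P05 = 3.841


def degrees_of_freedom(num_outcomes: int) -> int:
    """Number of possible outcomes minus 1 (hypothetical helper)."""
    if num_outcomes < 2:
        # With a single outcome there is nothing to compare against.
        raise ValueError("need at least two outcomes")
    return num_outcomes - 1


def decide(chi_squared_value: float, critical_value: float) -> str:
    """Apply the rule from the video: exceed the critical value -> reject."""
    if chi_squared_value > critical_value:
        return "reject the null hypothesis"
    return "accept the null hypothesis"


# Heads and tails are two outcomes, so there is 1 degree of freedom.
print(degrees_of_freedom(2))
# A made-up Chi-squared value of 4.0 exceeds 3.841.
print(decide(4.0, CRITICAL_VALUE_DF1_P05))
```

The wording "accept" follows the video; many statistics texts would say "fail to reject" instead.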

So this is my wife here. I asked her to flip a coin and so I asked the statistics teacher

how much data do you have to get before you can actually apply the Chi-squared test? And

Mr. Humberger said something magic about 30. And so I want to exceed that number in each

of these experiments and so this is my wife down here. This is her hand. And what she's

going to do is she's going to, let me get a value you can see, she's going to flip 50

coins. You can see she's really fast so she's flipping 50 coins and then she's sorting them

out. And so if we look at that, the first thing, even before you collect the data is

we could look at the expected values. And so we've got heads or tails. And so if you

flip 50 coins how many do we expect to come up as heads? The right answer would be 25.

And how many would we expect to come up as tails? 25 as well. Now let's say your data

is not as even as that. If you're looking at fruit flies it might be 134 or 133. Well

let's say I flip 51 coins for example instead of 50 then my expected values would be 25.5

and 25.5. So expected values since they're just due to probability don't have to be a whole number.
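
As a small sketch of how those expected values come about, assuming a fair coin (probability 0.5 for each side): multiply the total number of flips by the probability of each outcome. The helper name `expected_counts` is hypothetical.

```python
def expected_counts(total: int, probabilities: list[float]) -> list[float]:
    """Expected count per outcome = total trials * probability of that outcome."""
    return [total * p for p in probabilities]


# Fair coin assumed: heads and tails each have probability 0.5.
print(expected_counts(50, [0.5, 0.5]))  # [25.0, 25.0]
print(expected_counts(51, [0.5, 0.5]))  # [25.5, 25.5]
```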

If we look at our observed values, well let's look down here. How many heads did we get?

28 heads. And how many tails did we get? So that would just be 22. Okay. So now we're

going to apply Chi-squared and come up with a Chi-squared value. And so, what does that mean?

Well let me get this out of the way. So we're going to take our equation which is O minus

E squared over E, and we're going to do that for the heads column and then we're going

to do it for the tails column. So we've also got O minus E squared over E for the tails

column. And so our observed value is going to be 28. So it's 28 minus 25, which is expected,

squared over 25. Now this sum means that we're going to add these two values together so

I'm going to put a plus sign right here. Now we're going to do the tails side. So what's

our observed? It's 22 minus 25 squared over 25. So you can do this in your head. 28 minus

25 is 3, square that is 9. 9 over 25 plus 22 minus 25 is negative 3 squared. It's 9

over 25. And so our answer is 18 over 25 which equals 0.72.
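
The same arithmetic can be checked with a short Python sketch. The function name `chi_squared` is just an illustrative choice; the numbers are the 28 heads and 22 tails observed, against 25 and 25 expected.

```python
def chi_squared(observed: list[float], expected: list[float]) -> float:
    """Sum of (O - E)^2 / E over every outcome."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Heads: (28 - 25)^2 / 25 = 9/25. Tails: (22 - 25)^2 / 25 = 9/25.
value = chi_squared([28, 22], [25, 25])
print(round(value, 2))  # 0.72
```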

Okay. So that's our Chi-squared value for

this data that we just collected. Now let's go over here to our critical values. Well

we said that we had 1 degree of freedom, because there's two outcomes. 2 minus 1 is 1. So we're

in this right here, this row right here. And then here is our magical 0.05 column and so

our critical value is 3.841. And so if we get a number higher than that we reject our

null hypothesis. We didn't, so we got a value that is lower than that, 0.72 so that means

we have to accept our null hypothesis. That means that my wife did a great job. There's

nothing wrong with the coins. There's not way more heads than there should be and so

we have to accept the null hypothesis that there's no statistical difference between

what we observe and what we expect to see.
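
For comparison, here is a hedged sketch that runs the same steps on the earlier 100-flip example (62 heads, 38 tails, 50 and 50 expected). The video does not work that example through in this section, so the result below comes only from applying the same formula and the same 3.841 critical value; `chi_squared_test` is a hypothetical helper name.

```python
def chi_squared_test(observed, expected, critical_value):
    """Return the Chi-squared value and whether it exceeds the critical value."""
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, statistic > critical_value


# 1 degree of freedom (heads/tails), 0.05 column: critical value 3.841.
statistic, reject = chi_squared_test([62, 38], [50, 50], 3.841)
print(round(statistic, 2), reject)  # 5.76 True
```

Under the rule described in the video, a value above 3.841 would mean rejecting the null hypothesis for that coin, unlike the 28 and 22 result above.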

So now let's try a little more complex problem. Now we've got dice. So we've got 36 dice.

So let me get this out here. So our expected values, well there are six things you could

get. So we could get a 1, 2, 3, 4, 5 or 6. And so let's play this out. So expected values,

since I have 36 dice here, we would expect to get 6 of each of those numbers coming up.

So I'm just taking 36 total dice divided by 6 so I got 6. But let's see what we get for

observed values. Oh, it looks like we're getting a lot of sixes. So if we look at the observed