  • Hi. It's Mr. Andersen and welcome to my podcast on the Chi-squared test. Chi-squared

  • test if you look at the equation lots of students get scared right away. It's really simple

  • once you figure it out. So don't be scared away, but Chi-squared test especially in AP

  • biology, especially in science is very important. And it's a way to compare when you collect

  • data, is the variation in your data just due to chance or is it due to one of the variables

  • that you're actually testing. And so the first thing you should figure out is what are the,

  • what do all these variables mean?

  • So the first one, this right here stands for Chi-squared. And so this was developed way

  • in the early part of the 1900s by Karl Pearson. Pearson's Chi-squared test. So, what is this

  • then? That is going to be a sum. So we're going to add up a number of values in a Chi-squared

  • test. What does the O stand for? Well that's going to be for the data you actually collect.

  • And so we call that observed data. And then the E values are going to be the expected

  • values. And so if you're ever doing an experiment, you can actually figure out your expected

  • values before you start. And then you just simply compare them to your observed values.

  • Let me give you an example of that with these coins over here.
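
Written out, the equation described above can be sketched as follows. The index i is an assumed label (the video does not name it) that runs over the possible outcomes, with O_i the observed count and E_i the expected count for outcome i:

```latex
\chi^2 = \sum_{i} \frac{(O_i - E_i)^2}{E_i}
```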

  • Let's say I flip a coin 100 times. And I get

  • 62 heads and I get 38 tails. Well is that due to just chance? Or is there something

  • wrong with the coin? Or the way that I'm flipping the coin? And so the Chi-squared test allows

  • us to actually answer that. And so what I'm thinking in my head is something called a

  • Null Hypothesis. And so if we're flipping a coin 100 times. And I think I said 62 heads

  • and 38 tails. Well that would be the observed value that we get in an experiment. But there'd

  • also be expected values because you know it should be 50 heads and 50 tails. And so you

  • use something called a null hypothesis in this case, where you're saying there's no

  • statistically significant difference between the observed values and the expected values,

  • between what we expect to get and what we actually find.

  • And so it's cool, Chi-squared, because we

  • can actually measure our data, or look at our data and see is there a statistical difference

  • between those two. The best way to get good at Chi-squared is actually to do some problems.

  • Before we get to that there's two terms that I have to define. One is degrees of freedom

  • and then one is critical values. And so the whole point of a Chi-squared test is either

  • to accept or reject our null hypothesis. And so you either exceed or don't exceed

  • your critical value. But first of all we have to figure out where that number is in this

  • big chart right here.

  • First thing is something called degrees of freedom. So since we're comparing outcomes,

  • you have to have at least two outcomes in your experiment. So in this case if we have

  • heads and tails, we have two outcomes that we could get, so we'll say that's 2. And then

  • we simply subtract the number 1 from that to get the degrees of freedom. And so in this

  • case we have two outcomes minus 1 and so we would have 1 degree of freedom. Now you might

  • think to yourself why isn't there a zero on this chart? Well, if you just have one outcome

  • you have nothing to compare it to. So that's an easy way to think about that. So we figured

  • out that there is one degree of freedom in this case. The next thing you're looking at

  • is for a critical value. And the probability level that we'll always use in the class is

  • the 0.05 value. And so that's going to be this column right here. So the first thing

  • you do is find the 0.05 column and you don't worry about all of the other numbers. And

  • 3.841 is a number I just know, because seeing it means that I'm in the right chart or I'm in

  • the right column.

  • A way that I explain this to kids is that you can think of that as being 95% sure that

  • you're either accepting or rejecting your null hypothesis. And you can see that our

  • critical values get higher over here. So you can think as we move this way, if we really

  • want to be sure we'd have to exceed a higher critical value. So what's our null hypothesis

  • again? The null hypothesis is that there's no statistical difference between observed and expected, and so we either

  • accept or reject that hypothesis. So in this case our critical value would be 3.841. And so

  • when you calculate Chi-squared, if you get a number that is higher than 3.841 then you

  • reject that null hypothesis. And so there actually is something aside from just chance

  • that is causing you to get more heads than tails. And if you don't exceed the critical

  • value then you accept that null hypothesis. And this is usually what ends up happening,

  • unless you have a variable that's impacting your results. Let's apply this in a couple

  • of different cases.
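
The decision rule just described can be sketched in Python. This is a minimal illustration, not part of the video: the function and table names are hypothetical, and the table holds only the two 0.05-column entries that the transcript actually reads off the chart (3.841 for 1 degree of freedom, 11.070 for 5).

```python
# Critical values from the 0.05 column of the chart, keyed by degrees of freedom.
# Only the two rows quoted in the transcript are included (assumption: a full
# chart would have one entry per row).
CRITICAL_VALUES_005 = {1: 3.841, 5: 11.070}


def degrees_of_freedom(num_outcomes):
    """Number of possible outcomes minus 1."""
    if num_outcomes < 2:
        raise ValueError("need at least two outcomes to compare")
    return num_outcomes - 1


def reject_null(chi_squared_value, num_outcomes):
    """Return True when the Chi-squared value exceeds the critical value."""
    df = degrees_of_freedom(num_outcomes)
    return chi_squared_value > CRITICAL_VALUES_005[df]


# Heads or tails: two outcomes, so 1 degree of freedom.
print(degrees_of_freedom(2))   # 1
print(reject_null(4.0, 2))     # True: 4.0 exceeds 3.841
print(reject_null(2.0, 2))     # False: 2.0 does not exceed 3.841
```

The values 4.0 and 2.0 are made-up inputs chosen only to show both branches of the rule.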

  • So this is my wife here. I asked her to flip a coin and so I asked the statistics teacher

  • how much data do you have to get before you can actually apply the Chi-squared test? And

  • Mr. Humberger said something magic about 30. And so I want to exceed that number in each

  • of these experiments and so this is my wife down here. This is her hand. And what she's

  • going to do is she's going to, let me get a value you can see, she's going to flip 50

  • coins. You can see she's really fast so she's flipping 50 coins and then she's sorting them

  • out. And so if we look at that, the first thing, even before you collect the data is

  • we could look at the expected values. And so we've got heads or tails. And so if you

  • flip 50 coins how many do we expect to come up as heads? The right answer would be 25.

  • And how many would we expect to come up as tails? 25 as well. Now let's say your data

  • is not as even as that. If you're looking at fruit flies it might be 134 or 133. Well

  • let's say I flip 51 coins for example instead of 50 then my expected values would be 25.5

  • and 25.5. So expected values since they're just due to probability don't have to be a

  • whole number.
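
As a small sketch of the expected-value step, assuming every outcome is equally likely (true for a fair coin, and an assumption of this example rather than a general rule), the expected counts are just the total divided by the number of outcomes. The function name is hypothetical.

```python
def expected_counts(total, num_outcomes):
    """Expected count per outcome, assuming all outcomes are equally likely."""
    return [total / num_outcomes] * num_outcomes


print(expected_counts(50, 2))  # [25.0, 25.0]
print(expected_counts(51, 2))  # [25.5, 25.5]
```

The second call shows the point made above: expected values do not have to be whole numbers.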

  • If we look at our observed values, well let's look down here. How many heads did we get?

  • 28 heads. And how many tails did we get? So that would just be 22. Okay. So now we're

  • going to apply Chi-squared and come up with a Chi-squared value. And so, what does that mean?

  • Well let me get this out of the way. So we're going to take our equation which is O minus

  • E squared over E, and we're going to do that for the heads column and then we're going

  • to do it for the tails column. So we've also got O minus E squared over E for the tails

  • column. And so our observed value is going to be 28. So it's 28 minus 25, which is expected,

  • squared over 25. Now this sum means that we're going to add these two values together so

  • I'm going to put a plus sign right here. Now we're going to do the tails side. So what's

  • our observed? It's 22 minus 25 squared over 25. So you can do this in your head. 28 minus

  • 25 is 3, square that is 9. 9 over 25 plus 22 minus 25 is negative 3 squared. It's 9

  • over 25. And so our answer is 18 over 25 which equals 0.72.
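
The arithmetic for the coin flips can be checked with a few lines of Python. This is a minimal sketch; the function name `chi_squared` is chosen for this example and is not from the video.

```python
def chi_squared(observed, expected):
    """Sum of (O - E)^2 / E over all outcomes."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# 50 coin flips: 28 heads and 22 tails observed, 25 of each expected.
coin_value = chi_squared([28, 22], [25, 25])
print(round(coin_value, 2))  # 0.72
```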

  • Okay. So that's our Chi-squared value for

  • this data that we just collected. Now let's go over here to our critical values. Well

  • we said that we had 1 degree of freedom, because there's two outcomes. 2 minus 1 is 1. So we're

  • in this right here, this row right here. And then here is our magical 0.05 column and so

  • our critical value is 3.841. And so if we get a number higher than that we reject our

  • null hypothesis. We didn't, so we got a value that is lower than that, 0.72 so that means

  • we have to accept our null hypothesis. That means that my wife did a great job. There's

  • nothing wrong with the coins. There's not way more heads than there should be and so

  • we have to accept the null hypothesis that there's no statistical difference between

  • what we observe and what we expect to see.

  • So now let's try a little more complex problem. Now we've got dice. So we've got 36 dice.

  • So let me get this out here. So our expected values, well there are six things you could

  • get. So we could get a 1, 2, 3, 4, 5 or 6. And so let's play this out. So expected values,

  • since I have 36 dice here, we would expect to get 6 of each of those numbers coming up.

  • So I'm just taking 36 total dice divided by 6 so I got 6. But let's see what we get for

  • observed values. Oh, it looks like we're getting a lot of sixes. So if we look at the observed

  • values for one here we get 2 ones. We look at the twos, we get 4 of those. For the threes

  • it looks like 8 threes. For the fours we get 9. For the fives we just get 3. And then for

  • the sixes, look at all the sixes, so we get 10 right here. Okay. Now we have to figure

  • out a Chi-squared value. So let me get this out of the way.

  • And I'm going to stop talking and do the math

  • and speed up the video a little bit. And so hopefully I don't screw up any of this. So

  • that is 58 over 6, which is about 9.6. So that is our Chi-squared value. It's 9.6 in this case,

  • since we added all these up. So now we've got to go over here to our chart. And so first

  • of all we have to figure out how many degrees of freedom do we have. Well, since there are

  • 6 different outcomes and we take 6 minus 1, so we've got 5. We're in this column of the

  • 0.05 right here so if I read across our critical value is 11.070. And so if we look at that,

  • did our value go higher than that, no it's only 9.6, it's lower than that, so in this

  • case since it's 9.6, even though we had all of those sixes we still need to accept our

  • null hypothesis that there's no statistically significant difference between

  • what we observed and then what we expected.
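
The sped-up dice math can be reproduced as a sketch in Python. Exact fractions are used here (a choice made for this example) so that the "58 over 6" from the video shows up directly. The variable names are hypothetical; the counts and the critical value 11.070 are the ones quoted above.

```python
from fractions import Fraction

# Observed counts for faces 1 through 6, and 36 / 6 = 6 expected of each.
observed = [2, 4, 8, 9, 3, 10]
expected = [6] * 6

# Sum of (O - E)^2 / E, kept as an exact fraction.
dice_value = sum(Fraction((o - e) ** 2, e) for o, e in zip(observed, expected))

critical_value = 11.070  # 5 degrees of freedom, 0.05 column

print(dice_value == Fraction(58, 6))   # True
print(round(float(dice_value), 2))     # 9.67
print(dice_value > critical_value)     # False, so the null hypothesis is not rejected
```

Rounded to two places the value is 9.67, which the video quotes as 9.6; either way it stays below 11.070.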

  • So now let's leave you with this question. So in the animal behavior podcast as I talk

  • about that, we're looking at pill bugs and if they spend more time in the wet or if they

  • spend more time in the dry. And so if you look at the values right here, this is recording

  • how much time they spend in the wet and how much time they spend in the dry. So what I've

  • done is we would expect since there are 10 pill bugs we'd have 5 on each side. But since

  • it looks like they're spending more time on the wet, you can even see them in the video

  • here spending more time in the wet, I take the average of the wet and the average of

  • the dry column. And that gives me my wet and my dry and so now I'm not going to show you

  • how to do this one, but try to apply Chi-squared to figure out if there's a statistical difference

  • between the values we expect and the values we observed. And you can put your

  • answer down in the comments. And so I hope that's helpful.
