## Subtitles section

• Hi, I'm Adriene Hill, and welcome back to Crash Course Statistics.

• When you're using the patterns of probability to predict events in your life, it can be

• nice to have some mathematical shortcuts.

• That way, when you're reading People Magazine, you won't have to spend so much time sidetracked,

• calculating how likely it is that Harry and Meghan will have 3 boys and 2 girls if they

• have 5 kids.

• And you won't have to pause Willy Wonka and the Chocolate Factory, right as Grandpa

• Joe starts singing, to figure out how likely it is that you could have found one Golden

• Ticket by opening 4 chocolate bars.

• Today, we'll introduce you to some of these shortcuts so you can get back to the movie.

• Before we start today, heads up -- these shortcuts are equations.

• And we try in this series not to spend too much time plugging numbers into equations,

• but today we felt like we kinda needed to.

• Because understanding these things is important.

• So to sweeten the deal, we added some zombies.

• INTRO

• You're in your kitchen, making a piece of toast with your old toaster--which has seen

• better days.

• You've figured out that each time you make a piece of toast you have a 20% chance of

• being shocked--not lethally, but painfully.

• Alright so maybe it's time to get another toaster.

• But you haven't had a chance.

• Anyway you eat toast every weekday morning (you eat pancakes on the weekends), and you're

• wondering how many shocks you'll get this week.

• It's a stressful week and you've decided that toast is only worth the risk

• if you'll probably get shocked only once.

• You know the multiplication rule of probability, so you can calculate this.

• There are five different ways you can receive only one shock this week: you either get shocked

• once on Monday, Tuesday, Wednesday, Thursday, or Friday and then remain shock free on the

• 4 other days of the week.

• If we represent a shock with an X and a non-shock day with an O, the possibilities for your

• week look something like this.

• Now we need to calculate the probability of getting one shock and four non-shocks using

• the multiplication rule.

• Let's look at the probability of only getting shocked on Monday.

• The probability of getting shocked is 20%, so the probability of not getting shocked

• on Tuesday is 80% and similarly on Wednesday through Friday there's an 80% chance each

• day of not getting shocked.

• So the probability of getting shocked on Monday and not on Tuesday-Friday is

• 0.2 x 0.8 x 0.8 x 0.8 x 0.8 = ~0.082.

• That's about an 8.2% chance that you'll get shocked on Monday and not for the rest

• of the week, so now we've got to calculate the rest of the one-shock options.

• The probability of getting shocked only on Tuesday is the same, about 8.2%, since order

• doesn't matter in multiplication.

• The probability of the Monday-only option or the Tuesday-only option or any of the remaining

• 3 options can be calculated using the addition rule.

• You have 8.2% + 8.2%+ 8.2%+ 8.2%+ 8.2% also known as 5 x 8.2% chance of just one shock.

• That's a 41% chance of getting shocked only once this work week, so you decide to risk it.
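The long-hand calculation above can be sketched in a few lines of Python. This is only an illustration: the names `P_SHOCK`, `week_probability`, and `one_shock_weeks` are made up here, and the X/O strings stand in for the on-screen graphic, which is not part of the subtitles.

```python
# Sketch: enumerate the five one-shock work weeks and add up their probabilities.
P_SHOCK = 0.2  # chance of a shock on any toast day (the 20% from the video)
DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri"]


def week_probability(pattern: str, p_shock: float = P_SHOCK) -> float:
    """Multiplication rule: multiply the per-day probabilities of an X/O pattern."""
    prob = 1.0
    for day in pattern:
        prob *= p_shock if day == "X" else 1.0 - p_shock
    return prob


# One pattern per possible shock day: XOOOO, OXOOO, OOXOO, OOOXO, OOOOX
one_shock_weeks = [
    "".join("X" if i == shock_day else "O" for i in range(len(DAYS)))
    for shock_day in range(len(DAYS))
]

# Addition rule: the patterns are mutually exclusive, so their probabilities add.
total = sum(week_probability(week) for week in one_shock_weeks)

for week in one_shock_weeks:
    print(week, round(week_probability(week), 5))
print("P(exactly one shock) =", round(total, 4))
```

Every pattern has the same probability because multiplication does not care about order, which is why the sum collapses to five times a single term.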

• That was a lot of work just to figure out whether it's a good idea to have toast,

• and thankfully there's a more compact formula.

• The formula, called the Binomial Distribution formula, takes the math we just did and simplifies it.

• In our toast example, we first figured out the probability of only getting shocked once.

• To do this we multiplied each day's probability together.

• Let's use exponents to make this formula a little bit shorter.

• We can combine all the 0.2s and all the 0.8s to give us this.
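The on-screen equation is not captured in the subtitles. Going by the numbers used earlier (one 0.2 for the shock day and four 0.8s for the shock-free days), it presumably reads:

$$0.2 \times 0.8 \times 0.8 \times 0.8 \times 0.8 = 0.2^{1} \times 0.8^{4}$$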

• Notice that the two exponents add up to 5--the total number of days.

• This formula works for finding out the chance of getting shocked only once, but we can also

• use it to find out the chance of getting shocked another number of times.

• In general the formula looks like this:
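The general form shown on screen is missing from the subtitles. A likely reconstruction is below. The symbols are assumptions chosen to match the n-Choose-k notation used later: $p$ is the probability of a shock on one day, $k$ is the number of shock days, and $n$ is the total number of days.

$$p^{k} \, (1 - p)^{n - k}$$

This is the probability of one specific arrangement of $k$ shocks across $n$ days. It does not yet count how many such arrangements exist.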

• For example, the probability of getting shocked only on Tuesday and Wednesday would be this.

• 0.2^2 x 0.8^3 = ~0.02 or 2%

• We also need to account for the number of ways that getting shocked once or twice in

• a week can happen.

• To do this, we'll use a very useful formula--the Binomial Coefficient Formula--from a field

• of math called Combinatorics.

• The Binomial Coefficient formula makes it easy for us to find out how many ways a certain

• ratio of successes--not getting shocked--to failures--getting shocked, can occur.

• In a general form, it looks like this: (n-Choose-k), but you can also read it as “we have n things,

• count the number of different ways we can choose k of them”.

• For our toast example, 5-Choose-1 is 5.

• That is there are 5 different ways we can only receive one shock from our toaster this week.

• The math behind this snazzy formula uses a lot of factorials which are the product of

• an integer and all the integers less than it, and looks like this:
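The formula itself does not appear in the subtitles. The standard definition of the binomial coefficient, written with the same $n$ and $k$, is:

$$\binom{n}{k} = \frac{n!}{k! \, (n - k)!}$$

For the toast example this gives:

$$\binom{5}{1} = \frac{5!}{1! \, 4!} = \frac{120}{1 \times 24} = 5$$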

• You remember factorials because they're the ones that look like they're always shouting.

• See those exclamation marks?!

• in the description.

• Now we have all the pieces of our binomial distribution formula, let's put them together.

• First the Binomial Coefficient which tells us how many ways we can have one shock and

• four non-shocks, and then our shorthand multiplication of probabilities.

• We put it together and we have a full blown formula for calculating the probability of

• getting shocked on one out of five days this week, about 41%.

• It took us a while to get here, but we now have a general formula for calculating any

• similar probability.

• If p is the probability of our event happening then here's our formula:
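The on-screen formula is not in the subtitles. The standard binomial probability formula is given below. Here $n$ is the number of trials and $k$ is the number of times the event happens; the label $X$ for the count is an assumption added for notation.

$$P(X = k) = \binom{n}{k} \, p^{k} \, (1 - p)^{n - k}$$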

• For example, supposing that there is an equal chance of having a boy or girl, if we want

• to find out the probability of a couple having 3 girls out of 5 children, we can simply plug

• our numbers into the equation.

• We do that and we see there's about a 31% chance of having 3 of their 5 children be

• girls.
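As a rough sketch, the plug-in step might look like this in Python. The function name `binomial_pmf` is made up for illustration, and `math.comb` (Python 3.8+) supplies the n-Choose-k count.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k occurrences in n independent trials,
    each with probability p: (n choose k) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# 3 girls out of 5 children, assuming an equal chance of a boy or a girl
girls = binomial_pmf(3, 5, 0.5)

# Exactly one shock in a five-day toast week with a 20% shock chance
toast = binomial_pmf(1, 5, 0.2)

print(f"P(3 girls out of 5) = {girls:.4f}")
print(f"P(1 shock out of 5) = {toast:.4f}")
```

The same function covers both examples. Only `k`, `n`, and `p` change.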

• And now we get to the zombies: it's the beginning of the zombie apocalypse and you, thankfully, still

• But there's a bunch of people between you and the nearest shelter, and you want to know

• how likely it is that none of them have been bitten and infected with the zombie virus.

• Since you can't always tell if a person is infected right away, you decide to use

• It's early stages, so right now there's only about a 5% infection rate in the population.

• That means the probability of someone being a zombie is about 5%, and the probability

• of not being infected is still about 95%.

• Peeking out at the crowd that stands between you and the shelter you count 20 people.

• Plugging these numbers into our formula, we can see that there's about a 36% chance

• none of the people you'd encounter on your run to safety will try to eat your brains.

• Those are pretty good odds, and you're fast.

• You know that you could probably safely reach the shelter even if there were one or two

• zombies, so let's calculate those probabilities as well.

• There's about a 36% chance of meeting no zombies, about a 38% chance of having only

• one--easily dodgeable--zombie, and about a 19% chance of having to outmaneuver only

• two. Cumulatively, that's about a 92% chance that you'll be able to

• sprint your way to safety.
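A minimal sketch of the zombie calculation, assuming the 20 people are infected independently of one another at the stated 5% rate. The names `binomial_pmf`, `N_PEOPLE`, and `P_INFECTED` are illustrative only.

```python
from math import comb

N_PEOPLE = 20      # people between you and the shelter
P_INFECTED = 0.05  # early-stage infection rate from the video


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k infected people among n."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Probabilities of meeting exactly 0, 1, or 2 zombies
probs = [binomial_pmf(k, N_PEOPLE, P_INFECTED) for k in range(3)]

# "Two or fewer" is the sum of those mutually exclusive cases
at_most_two = sum(probs)

for k, prob in enumerate(probs):
    print(f"P({k} zombies) = {prob:.1%}")
print(f"P(2 or fewer zombies) = {at_most_two:.1%}")
```

Summing the unrounded probabilities can give a total that differs by a point from adding the already-rounded percentages.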

• We could calculate the probability for every possible number of zombies from 0 to 20.

• If we did that, we would get the discrete distribution for this specific type of problem

• where there are 20 events, and the probability of the event of interest--here, the zombie

• infection rate--is 5%.

• But in a more general sense, I also want to know how many zombies I would expect to confront

• on average - just so I could prepare.

• I could try to guess the mean by looking at the graph of the binomial distribution for

• all of the possible cases.

• Just by eyeballing it, I'd say the mean of this distribution is around 1, and it is!

• The actual formula to find the mean of a binomial distribution is n--the number of events we're

• looking at--times p--the probability of what we're interested in, the probability of

• being a zombie.

• Since the probability of being a zombie is 5%, it makes sense that on average, about

• 5% of any population will be zombies.

• Since we have a group of 20, we expect about 5% of 20--or 1--zombie infection on average.
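One way to check the n times p shortcut is to build the whole distribution and take its weighted average. The sketch below does that for the 20-person crowd. The variable names are illustrative, and independence between people is assumed.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k infected people among n."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 20, 0.05

# Full discrete distribution: one probability for each count from 0 to 20
distribution = [binomial_pmf(k, n, p) for k in range(n + 1)]

# Mean as a weighted average of the possible counts
mean_from_distribution = sum(k * prob for k, prob in enumerate(distribution))

# Shortcut formula for the mean of a binomial distribution
mean_from_formula = n * p

print(f"sum of probabilities   = {sum(distribution):.6f}")
print(f"mean from distribution = {mean_from_distribution:.6f}")
print(f"mean from n * p        = {mean_from_formula:.6f}")
```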

• While zombies are probably not on your day to day list of concerns, this kind of calculation

• might be.

• Important public health issues such as the spread of pandemic-level viruses can be

• modeled using a similar approach.

• We could just as easily calculate the probability that of the 40 people you shook hands with

• at that Zombie Apocalypse meet-up, 2 or fewer had the cold that has been going around.

• Or we could calculate the mean number of cold infected people you could expect to shake

• hands with at any given meeting.

• If the probability of having a cold at that given moment in the population is 20%, the

• probability of 2 or fewer people having a cold is only about 0.8%.

• From the binomial distribution with 40 people and a 20% chance of having a cold we can see

• that it's MUCH more likely that more than 2 people will have the sniffles, in fact you'd

• expect 20% of 40 --or 8 people--to have a cold at any similar meeting with 40 people.
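The handshake example follows the same pattern. A sketch, again assuming each person has a cold independently of the others and using illustrative names:

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k people with a cold among n."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n_handshakes = 40
p_cold = 0.20

# P(2 or fewer) = P(0) + P(1) + P(2)
p_two_or_fewer = sum(binomial_pmf(k, n_handshakes, p_cold) for k in range(3))

# More than 2 is the complement of 2 or fewer
p_more_than_two = 1 - p_two_or_fewer

# Expected number of people with a cold: n * p
expected_colds = n_handshakes * p_cold

print(f"P(2 or fewer colds) = {p_two_or_fewer:.2%}")
print(f"P(more than 2)      = {p_more_than_two:.2%}")
print(f"expected colds      = {expected_colds:.1f}")
```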

• A special case of the Binomial Distribution with only one trial--or just one event, a

• single coin flip, for example--is called the Bernoulli Distribution.

• It represents the probability of getting either a success or a failure.

• Our outcome--x--can either be a 0 (failure) or a 1 (success).

• The general formula for the Bernoulli distribution where p is the probability of success is this.
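The formula is shown on screen rather than spoken. The standard Bernoulli formula, with $x$ as the outcome and $p$ as the probability of success, is:

$$P(X = x) = p^{x} \, (1 - p)^{1 - x}, \qquad x \in \{0, 1\}$$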

• And by plugging in 1 or 0, we can see that the probability that our outcome is a success

• is p, and the probability of a failure is (1-p).

• For example, if the probability of rolling an odd prime number (3 or 5) on a single

• die roll is about 33.33%, then the Bernoulli distribution for rolling an odd prime number would

• be the probability of rolling an odd prime to the x power multiplied by the probability

• of not rolling an odd prime to the one minus x power.

• When x is 0--a failure-- then the probability is 1 minus one third or around 66.66%, and

• when x is 1--a success--the probability is around 33.33%.

• Though relatively simple, the Bernoulli formula is a useful building block.

• For instance, you can think of the binomial probabilities that we did as many Bernoulli

• trials one after another.

• And in reality, that's what they are!
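To illustrate that claim, the sketch below builds a binomial probability directly out of Bernoulli trials. It lists every possible sequence of five 0/1 outcomes, multiplies the Bernoulli probabilities within each sequence, and adds up the sequences with the right number of successes. The function names are made up for this example. Brute-force enumeration is practical only for small n.

```python
from itertools import product
from math import comb


def bernoulli_pmf(x: int, p: float) -> float:
    """P(X = x) for a single trial: p^x * (1 - p)^(1 - x), with x in {0, 1}."""
    return p**x * (1 - p) ** (1 - x)


def binomial_from_bernoulli(k: int, n: int, p: float) -> float:
    """Sum the probabilities of all n-trial sequences with exactly k successes."""
    total = 0.0
    for outcomes in product((0, 1), repeat=n):
        if sum(outcomes) == k:
            sequence_prob = 1.0
            for x in outcomes:
                sequence_prob *= bernoulli_pmf(x, p)
            total += sequence_prob
    return total


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Closed-form binomial probability, for comparison."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Single die roll: success means rolling an odd prime (3 or 5), so p = 2/6
p_odd_prime = 2 / 6
print("P(success) =", round(bernoulli_pmf(1, p_odd_prime), 4))
print("P(failure) =", round(bernoulli_pmf(0, p_odd_prime), 4))

# Toaster week: exactly one shock in five Bernoulli trials with p = 0.2
print("enumerated  =", round(binomial_from_bernoulli(1, 5, 0.2), 4))
print("closed form =", round(binomial_pmf(1, 5, 0.2), 4))
```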

• Alright let's get back to the zombies.

• Things have gotten much much worse.

• Zombies are everywhere - you're injured really badly.

• You need a blood infusion and need so much blood that you'll need some from all 3 of

• the people around you, luckily you're type AB positive so you can get blood from any

• blood type, but you have something else to worry about: the latent zombie virus.

• All three of your buddies seem okay, but you heard on the radio the population now has

• a 30% infection rate.

• So there's a 30% chance they're infected with the zombie virus and are just asymptomatic.