WEBVTT mathematics/statistics/son
00:00:00.000 --> 00:00:01.600
Hi and welcome to www.educator.com.
00:00:01.600 --> 00:00:04.100
We are going to be talking about independent events today.
00:00:04.100 --> 00:00:12.700
We just covered conditional probability and independent events have a lot to do with conditional probability.
00:00:12.700 --> 00:00:15.100
We are going to look at how they relate to each other.
00:00:15.100 --> 00:00:19.000
We are going to actually define what an independent event is mathematically.
00:00:19.000 --> 00:00:28.200
Then we are going to modify the multiplication rule for conditional probability so that it fits independent events.
00:00:28.200 --> 00:00:30.800
First things first.
00:00:30.800 --> 00:00:37.100
Independent events and how that fits together with conditional probability.
00:00:37.100 --> 00:00:47.500
You can think about non-independent events: non-independent means that knowledge of the outcome of one of the events affects
00:00:47.500 --> 00:00:51.100
the probability that the other event occurs.
00:00:51.100 --> 00:00:58.000
If I know that they are male, does it affect my estimate of whether they own a lot or not?
00:00:58.000 --> 00:01:05.600
If I know that this person is obese, does it affect my estimate of whether they have heart disease?
00:01:05.600 --> 00:01:13.600
Those are what we call non-independent events: knowing one thing changes your estimate of the second event.
00:01:13.600 --> 00:01:32.100
Here what you could think about is conditional probabilities: knowing that this person is obese will change your estimate of heart disease versus
00:01:32.100 --> 00:01:44.400
if you know that this person is not obese; these 2 estimates of heart disease will be different.
00:01:44.400 --> 00:01:51.100
They will not be equal to each other.
00:01:51.100 --> 00:02:02.500
That is what we mean when we say the conditional probabilities are different: if you know one of the conditions, that will change your estimate for the other event.
00:02:02.500 --> 00:02:04.500
What are independent events?
00:02:04.500 --> 00:02:13.500
This means that knowledge of one event does not change or affect the probability that the other event occurs.
00:02:13.500 --> 00:02:17.200
Here the conditional probabilities are the same.
00:02:17.200 --> 00:02:29.500
Take the probability of heart disease given that the color of the car you drive is red.
00:02:29.500 --> 00:02:51.700
Compare the probability of heart disease if you drive a red car versus the probability of heart disease in general: these probabilities should be the same because car color has nothing to do with heart disease.
00:02:51.700 --> 00:03:11.200
You might say these are independent: car color does not have any bearing on my estimate of the probability of you having heart disease.
00:03:11.200 --> 00:03:16.100
Let us talk about this a little bit more mathematically.
00:03:16.100 --> 00:03:20.800
Let us use this example of obesity and gender.
00:03:20.800 --> 00:03:25.700
Here is obesity on this side and gender on this side.
00:03:25.700 --> 00:03:40.700
Does being male or female change the probability that you might be obese?
00:03:40.700 --> 00:03:48.900
Does knowing whether somebody is male or female, does knowing their gender affect the probability that they might be obese?
00:03:48.900 --> 00:04:00.800
It turns out that the probability of being obese given male is 20 out of 100; that is the conditional probability.
00:04:00.800 --> 00:04:07.000
We are only looking at the male group.
00:04:07.000 --> 00:04:17.100
The probability of obese given female is also 20 out of 100.
00:04:17.100 --> 00:04:23.600
Here you could see that these probabilities equal each other.
00:04:23.600 --> 00:04:32.100
The probabilities of being obese for male and for female are the same.
00:04:32.100 --> 00:04:35.400
How about education?
00:04:35.400 --> 00:04:48.700
Is the probability of being obese given that they have a post-high school education going to be different from the probability of being obese given
00:04:48.700 --> 00:04:53.300
that they only have a high school education?
00:04:53.300 --> 00:05:04.300
The probability of obesity given post high school is 20 over 100.
00:05:04.300 --> 00:05:07.400
This is my universe, so it is 20/100.
00:05:07.400 --> 00:05:19.100
That probability of being obese given high school only is 30 out of 100.
00:05:19.100 --> 00:05:29.300
It is higher for people who have not had post-high school education than for people who have had post-high school education.
00:05:29.300 --> 00:05:46.400
These conditional probabilities do not mean that this causes this; it is just that knowing one fact about this person helps you estimate their obesity probability differently.
00:05:46.400 --> 00:05:54.500
Here you can see these are not equal to each other.
00:05:54.500 --> 00:05:57.800
Let us define what an independent event is.
00:05:57.800 --> 00:06:00.700
We have already talked about this.
00:06:00.700 --> 00:06:12.100
One way to define it is that the probability of A given B is equal to the probability of A given not B.
00:06:12.100 --> 00:06:14.000
Those 2 equal each other.
00:06:14.000 --> 00:06:16.900
It does not matter whether B occurs or not.
00:06:16.900 --> 00:06:21.600
That is what I have written down here.
00:06:21.600 --> 00:06:24.300
There is another way that you could think about this.
00:06:24.300 --> 00:06:32.500
The probability of A given B is equal to just the probability of A in any circumstances.
00:06:32.500 --> 00:06:39.200
This is another way of defining independent events.
00:06:39.200 --> 00:06:41.200
Let us look at that with this data set.
00:06:41.200 --> 00:06:53.800
What is the probability of obesity given male, and is that the same as the probability of obesity in general?
00:06:53.800 --> 00:06:55.300
Let us calculate that.
00:06:55.300 --> 00:07:08.300
The probability of obesity given male is 20 out of 100, but the probability of obesity overall is here.
00:07:08.300 --> 00:07:12.600
It is the obese people out of all the people in this sample.
00:07:12.600 --> 00:07:15.400
It is 40 out of 200.
00:07:15.400 --> 00:07:18.800
That is exactly the same proportion, 20%.
00:07:18.800 --> 00:07:23.000
Here we see this.
00:07:23.000 --> 00:07:31.500
Being male and being obese are independent events in this example.
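The check we just did by hand can be sketched in Python. This is only a minimal sketch, assuming the counts read off in this lesson (20 obese out of 100 males, 20 out of 100 females, 20 out of 100 with post-high school education, 30 out of 100 with high school only); the function name `is_independent` and its parameter names are made up for illustration.

```python
from fractions import Fraction

def is_independent(a_and_b, total_b, a_and_not_b, total_not_b):
    """Compare P(A | B), P(A | not B) and P(A) using exact fractions."""
    p_a_given_b = Fraction(a_and_b, total_b)
    p_a_given_not_b = Fraction(a_and_not_b, total_not_b)
    p_a = Fraction(a_and_b + a_and_not_b, total_b + total_not_b)
    return p_a_given_b == p_a_given_not_b == p_a

# Obesity and gender: 20/100 males, 20/100 females, 40/200 overall.
print(is_independent(20, 100, 20, 100))  # True

# Obesity and education: 20/100 post-high school, 30/100 high school only.
print(is_independent(20, 100, 30, 100))  # False
```

Exact fractions are used here so that the equality test is not disturbed by floating-point rounding.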
00:07:31.500 --> 00:07:42.100
Now that we know how to define independent events mathematically let us talk about the multiplication rule for conditional probability.
00:07:42.100 --> 00:08:04.300
Remember those trees: what we found was that if you wanted the probability of A and B, that is equal to the probability of A given B multiplied by the probability of B.
00:08:04.300 --> 00:08:06.100
Think about these spaces.
00:08:06.100 --> 00:08:13.800
Take B, and ask what proportion of that is A given B; if you multiply these together you will get the probability of A and B.
00:08:13.800 --> 00:08:18.500
This is what we call the multiplication rule for conditional probability.
00:08:18.500 --> 00:08:35.900
Out of this you could also get the definition of conditional probability where probability of A given B equals the probability of A and B over the probability of B.
00:08:35.900 --> 00:08:40.300
We already know the multiplication rule, and this is just one step of rearranging it.
00:08:40.300 --> 00:08:56.300
I should also say here that obviously you could have p(B given A) × p(A), because you always want to multiply by the probability of that entire world that you are living in.
00:08:56.300 --> 00:08:58.600
The condition that you are living in.
00:08:58.600 --> 00:09:10.000
For independent events we now have a slight change to this because the probability of A given B equals the probability of A; look at this rule again.
00:09:10.000 --> 00:09:30.400
All we have to do now is this: in order to find the probability of A and B, since these are equal, we can now just do p(A) × p(B).
00:09:30.400 --> 00:09:35.100
This p(A) is exactly equal to p(A given B).
00:09:35.100 --> 00:09:38.600
For independent events we can simplify this.
00:09:38.600 --> 00:09:42.400
This all goes back to the multiplication rule.
00:09:42.400 --> 00:09:52.900
For independent events, now you could just write p(A) in place of p(A given B) and still be able to calculate p(A and B).
00:09:52.900 --> 00:10:11.500
The other way that you could think about this is you could change it into figuring out different relationships among these things but you can also generalize it to more than just 2 events.
00:10:11.500 --> 00:10:24.500
We could put 3 events: p(A and B and C) = p(A) × p(B) × p(C).
00:10:24.500 --> 00:10:33.100
You could do 4 events or 5 independent events, because you can keep doing this for any number of independent events.
00:10:33.100 --> 00:10:42.700
The way I like to think about this is going back to the sample spaces, thinking about these independent events as slots that you could fill.
00:10:42.700 --> 00:10:46.400
Let us think about flipping a coin, those are independent events.
00:10:46.400 --> 00:10:54.100
Knowing that your first flip is a head does not do anything for my next flip of the coin.
00:10:54.100 --> 00:10:56.900
It is still a 50-50 chance of getting heads.
00:10:56.900 --> 00:11:05.400
Here you could think about this as the probability of A, probability of getting heads is 50%, 50%, and 50%.
00:11:05.400 --> 00:11:08.100
You could see that it will go on and on and on.
00:11:08.100 --> 00:11:13.500
Coin flips are classic examples of independent events.
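The slots idea can be sketched in Python. This is a minimal sketch, assuming a fair coin and 3 flips (the number 3 is chosen only for illustration); it counts outcomes in the sample space and compares the result with the multiplication rule for independent events.

```python
from fractions import Fraction
from itertools import product

# Sample space for 3 fair coin flips: each flip is a slot with 2 equally likely outcomes.
outcomes = list(product("HT", repeat=3))

# Probability of three heads by counting outcomes in the sample space.
matches = sum(1 for o in outcomes if o == ("H", "H", "H"))
p_hhh_counted = Fraction(matches, len(outcomes))

# Probability of three heads by the multiplication rule for independent events.
p_heads = Fraction(1, 2)
p_hhh_product = p_heads * p_heads * p_heads

print(p_hhh_counted, p_hhh_product)  # 1/8 1/8
```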
00:11:13.500 --> 00:11:15.500
Let us move on to some examples.
00:11:15.500 --> 00:11:22.600
Here is example 1: suppose you draw a card at random from a deck of cards. Which of these pairs of events are independent?
00:11:22.600 --> 00:11:29.500
You are just drawing 1 card and just because they say events it does not mean you are drawing 2 cards.
00:11:29.500 --> 00:11:34.700
It just means that it is 2 different aspects of cards like heart and jack.
00:11:34.700 --> 00:11:40.900
Here it asks: is getting a heart independent of getting a jack?
00:11:40.900 --> 00:11:44.500
Does having one affect the probability of getting the other?
00:11:44.500 --> 00:11:51.700
We could lay out the rule for independence of events.
00:11:51.700 --> 00:11:59.700
Probability of heart given jack should equal the probability of hearts.
00:11:59.700 --> 00:12:04.200
Is that true? Let us think about this.
00:12:04.200 --> 00:12:10.000
There are only 4 jacks, and that is my whole universe; within it, the probability of getting a heart is ¼.
00:12:10.000 --> 00:12:14.000
That is also the probability of getting a heart overall.
00:12:14.000 --> 00:12:16.400
I would say these are independent.
00:12:16.400 --> 00:12:24.200
I chose probability of heart given jack but you could have also done it the other way around.
00:12:24.200 --> 00:12:32.500
Probability of jack given heart: is that equal to the probability that you will just draw a jack?
00:12:32.500 --> 00:12:42.100
The heart world is 13 cards, so out of 13 there is only 1 jack; that is 1/13.
00:12:42.100 --> 00:12:50.600
The probability of drawing a jack is 4 out of 52 which is 1 out of 13.
00:12:50.600 --> 00:13:00.700
Either way we get 1 out of 13, and we see that it does not matter which event you pick as your condition: they are independent.
00:13:00.700 --> 00:13:05.400
Are these 2 events independent?
00:13:05.400 --> 00:13:08.000
Getting a heart and getting a red card.
00:13:08.000 --> 00:13:17.900
We could set that up again heart versus red card, heart given red card, is that the same as the probability of getting hearts overall?
00:13:17.900 --> 00:13:21.900
We already know this one, it is ¼ same as here.
00:13:21.900 --> 00:13:27.300
The probability of getting a heart given that you already have a red card is going to be different.
00:13:27.300 --> 00:13:33.300
Half of the cards in the deck are red, hearts and diamonds.
00:13:33.300 --> 00:13:37.100
That is 26 cards.
00:13:37.100 --> 00:13:47.800
Out of these 26 cards half of those are hearts, 13 out of 26 are hearts.
00:13:47.800 --> 00:13:53.800
That is, half of those cards are hearts if you know that it is a red card.
00:13:53.800 --> 00:13:56.600
½ is not equal to ¼.
00:13:56.600 --> 00:13:58.900
I would say these are not independent.
00:13:58.900 --> 00:14:06.700
Here let us say independent and here not independent.
00:14:06.700 --> 00:14:10.200
You could always test it the other way as well.
00:14:10.200 --> 00:14:16.500
Probability of red given heart: is that equal to the probability of getting a red card?
00:14:16.500 --> 00:14:29.300
What about this last one: the probability of getting a 7 given heart, is that equal to the probability of just getting a 7?
00:14:29.300 --> 00:14:30.800
Let us see.
00:14:30.800 --> 00:14:40.000
For the probability of getting a 7, there are four 7’s, one for each suit, out of the 52 cards.
00:14:40.000 --> 00:14:53.200
That is 4 out of 52, and that is going to reduce to 1 out of 13 because for every suit there is only one 7.
00:14:53.200 --> 00:14:58.800
What about probability of getting a 7 given that it is a heart?
00:14:58.800 --> 00:15:05.300
If it is a heart, that is only 13 cards, and the probability of getting a 7 among them is 1 out of 13.
00:15:05.300 --> 00:15:07.300
These are equal.
00:15:07.300 --> 00:15:13.100
Let us say independent.
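All three pairs from this example can be checked by listing the whole deck. This is a minimal sketch, assuming a standard 52-card deck; the helper names `prob` and `independent` are made up for illustration.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))  # 52 (rank, suit) pairs

def prob(event, cards):
    """Fraction of `cards` for which `event` is true."""
    return Fraction(sum(1 for c in cards if event(c)), len(cards))

def independent(event_a, event_b):
    """Check whether P(A | B) equals P(A) for one card drawn from the deck."""
    world_b = [c for c in deck if event_b(c)]  # restrict the universe to B
    return prob(event_a, world_b) == prob(event_a, deck)

def is_heart(card):
    return card[1] == "hearts"

def is_jack(card):
    return card[0] == "J"

def is_red(card):
    return card[1] in ("hearts", "diamonds")

def is_seven(card):
    return card[0] == "7"

print(independent(is_heart, is_jack))   # True:  1/4 == 1/4
print(independent(is_heart, is_red))    # False: 1/2 != 1/4
print(independent(is_seven, is_heart))  # True:  1/13 == 1/13
```

Swapping the two arguments gives the same answers, which matches the point that it does not matter which event you pick as your condition.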
00:15:13.100 --> 00:15:22.600
Here is example 2: the US Department of Health and Human Services found that 30% of young Americans 18 to 24 years old do not have health insurance.
00:15:22.600 --> 00:15:30.000
If you sampled 2 young Americans at random, what is the probability that the first has insurance and the second does not?
00:15:30.000 --> 00:15:34.700
At first you might think this is sampling without replacement.
00:15:34.700 --> 00:15:44.000
You might think that this is conditional, but if you are sampling from the entire US, because it is just 2 young Americans
00:15:44.000 --> 00:15:57.600
at random, it changes the probabilities by such a tiny decimal amount that it does not matter.
00:15:57.600 --> 00:16:01.400
We could treat these as almost independent events.
00:16:01.400 --> 00:16:17.900
Frequently independence is used this way, for almost independent events where one outcome might affect the other only slightly.
00:16:17.900 --> 00:16:26.700
Think about drawing one young American: what is the probability that any 1 young American would not have health insurance?
00:16:26.700 --> 00:16:30.000
That is 30%.
00:16:30.000 --> 00:16:35.400
What is the probability that 1 randomly drawn young American has health insurance?
00:16:35.400 --> 00:16:41.000
Here is the first guy, has health insurance.
00:16:41.000 --> 00:16:43.200
That will be 70%.
00:16:43.200 --> 00:16:53.500
You can multiply that by the probability of the second guy not having health insurance.
00:16:53.500 --> 00:16:57.000
That is .3 or 30%.
00:16:57.000 --> 00:17:18.700
If you multiply those together, you get a 21% chance that you will get the combination where the first guy has insurance and the second has no insurance.
00:17:18.700 --> 00:17:34.700
Remember, we know this because of the revised multiplication rule, where we can just look at this as being equal to the probability of the first guy having insurance times
00:17:34.700 --> 00:17:44.000
the probability that the second guy has no insurance.
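Here is a small Python sketch of why treating the two draws as independent is safe. The 30% figure is from the example; the population size of 30,000,000 is a hypothetical number chosen only for illustration and is not from the lesson.

```python
from fractions import Fraction

p_uninsured = Fraction(30, 100)
p_insured = 1 - p_uninsured

# Treating the two draws as independent: P(first insured) * P(second uninsured).
p_independent = p_insured * p_uninsured
print(float(p_independent))  # 0.21

# Hypothetical population size, for illustration only (not from the lesson).
population = 30_000_000
uninsured = population * 30 // 100
insured = population - uninsured

# Exact sampling without replacement: the second draw comes from population - 1 people.
p_without_replacement = Fraction(insured, population) * Fraction(uninsured, population - 1)

# The gap between the two answers is far below anything that would change 21%.
print(float(p_without_replacement - p_independent))
```

With a population that large, the two answers differ only many decimal places out, which is the sense in which the draws are almost independent.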
00:17:44.000 --> 00:17:55.400
Example 3: a state school gets 1725 applications. Are being admitted and going to private school independent events?
00:17:55.400 --> 00:18:00.300
We could apply our definition of independence.
00:18:00.300 --> 00:18:12.000
Take the probability of being admitted given private school.
00:18:12.000 --> 00:18:15.800
Is that equal to the probability of just being admitted?
00:18:15.800 --> 00:18:17.800
Let us check.
00:18:17.800 --> 00:18:25.100
Here is the probability of being admitted given private school; that is this universe right here.
00:18:25.100 --> 00:18:35.700
That is going to be 220/483; that is my probability of being admitted given that it is a private school.
00:18:35.700 --> 00:18:38.100
This is the probability of being admitted at all.
00:18:38.100 --> 00:18:53.800
This is 870 over everybody, which is 1725, and we want to know: are these equal to each other?
00:18:53.800 --> 00:19:00.500
I’m just going to use Excel as my little calculator.
00:19:00.500 --> 00:19:16.200
220 ÷ 483 gives us about a 46% chance of getting in if you go to private school.
00:19:16.200 --> 00:19:34.700
870 / 1725 is about 50%, a slightly higher chance of getting in.
00:19:34.700 --> 00:19:40.100
The equality is probably not true: you have a slightly lower chance of getting in if you go to private school.
00:19:40.100 --> 00:19:45.200
That is, 46% is not equal to 50%.
00:19:45.200 --> 00:19:56.000
I would say that it is small, but there is a difference, and being admitted and going to private school are not independent because
00:19:56.000 --> 00:20:05.900
there is a slight difference in the conditional probability versus the overall probability.
00:20:05.900 --> 00:20:15.900
Example 4: about 11% of college freshmen have to take a remedial course in reading. Suppose you take a random sample of 12 college freshmen from around the US;
00:20:15.900 --> 00:20:22.100
what is the probability that none of the 12 have to take remedial reading?
00:20:22.100 --> 00:20:28.100
What is the probability that at least 1 has to take a course in remedial reading?
00:20:28.100 --> 00:20:45.400
Here we could use the multiplication rule because we could assume almost independence in picking 12 people, it is almost like sampling with replacement.
00:20:45.400 --> 00:20:47.900
It is not going to affect the probability that much.
00:20:47.900 --> 00:21:14.200
Take the probability that the first guy does not take remedial reading, and multiply that by the probability that the second kid does not take remedial reading,
00:21:14.200 --> 00:21:22.200
all the way up to the probability that the 12th kid does not have to take remedial reading.
00:21:22.200 --> 00:21:33.200
It is not 11%; if you draw a person at random, there is an 11% chance that this college freshman has to take remedial reading.
00:21:33.200 --> 00:21:38.600
The flip side of that, not having to take it, is 89%.
00:21:38.600 --> 00:21:52.900
That would be .89 × .89 × .89 and so on, 12 times: .89^12.
00:21:52.900 --> 00:22:09.900
That would be .89^12, about 24.7%.
00:22:09.900 --> 00:22:25.400
That means that if we took groups of 12 people, about 25% of the time all 12 would not have to take remedial reading.
00:22:25.400 --> 00:22:32.200
Now for the probability that at least 1 has to take a remedial course.
00:22:32.200 --> 00:22:43.700
We should not apply this rule because we do not know which one of these guys takes the remedial course.
00:22:43.700 --> 00:22:45.000
We do not care which one.
00:22:45.000 --> 00:22:57.600
We do not care if it is the first or second, or the first and third, or the first, second, third, or all of them.
00:22:57.600 --> 00:23:03.800
Or all of them except for the last guy, who does not have to take it.
00:23:03.800 --> 00:23:11.900
We just want to know, what is the probability that at least 1 will have to take remedial course?
00:23:11.900 --> 00:23:13.600
That is every combination.
00:23:13.600 --> 00:23:18.600
1, 2, 3, all the way up to all 12 of them.
00:23:18.600 --> 00:23:27.200
The only case that you want to leave out is when all 12 do not have to take a remedial course.
00:23:27.200 --> 00:23:44.100
What we could do is 1 – the probability that all are exempt from remedial reading.
00:23:44.100 --> 00:23:51.000
We already know that probability is .247, so it is 1 - .247.
00:23:51.000 --> 00:23:58.400
That should give us .753.
00:23:58.400 --> 00:24:06.400
That means that in about 75% of samples of 12,
00:24:06.400 --> 00:24:11.900
at least 1 person has to take a remedial reading course.
00:24:11.900 --> 00:24:22.100
That is our shortcut.
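The shortcut can be sketched in a few lines of Python. This is a minimal sketch, assuming the 11% figure from the example and treating the 12 freshmen as independent draws; the variable names are made up for illustration.

```python
p_remedial = 0.11          # chance one freshman needs remedial reading (from the example)
p_exempt = 1 - p_remedial  # 0.89
n = 12

# All 12 exempt: multiply 0.89 by itself 12 times (independence assumed).
p_none = p_exempt ** n

# At least one needs it: the complement of "all 12 exempt".
p_at_least_one = 1 - p_none

print(round(p_none, 3), round(p_at_least_one, 3))  # 0.247 0.753
```

Taking the complement avoids adding up every combination of 1, 2, 3 or more students who need the course.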
00:24:22.100 --> 00:24:27.000
That is it for independent events, thanks for using www.educator.com.