### Bayes' Rule

• Intro 0:00
• When to Use Bayes' Rule 0:08
• When to Use Bayes' Rule: Disjoint Union of Events
• Bayes' Rule for Two Choices 2:50
• Bayes' Rule for Two Choices
• Bayes' Rule for Multiple Choices 5:03
• Bayes' Rule for Multiple Choices
• Example I: What is the Chance that She is Diabetic? 6:55
• Example I: Setting up the Events
• Example I: Solution
• Example II: What is the chance that It Belongs to a Woman? 19:28
• Example II: Setting up the Events
• Example II: Solution
• Example III: What is the Probability that She is a Democrat? 27:31
• Example III: Setting up the Events
• Example III: Solution
• Example IV: What is the chance that the Fruit is an Apple? 39:11
• Example IV: Setting up the Events
• Example IV: Solution
• Example V: What is the Probability that the Oldest Child is a Girl? 51:16
• Example V: Setting up the Events
• Example V: Solution

### Transcription: Bayes' Rule

Hi, these are the probability lectures here on www.educator.com.0000

My name is Will Murray and today, we are going to talk about Bayes’ rule.0004

Bayes’ rule is one of the most interesting rules in probability but it is also the easiest to make mistakes with.0009

You get these very counterintuitive results with Bayes’ rule0016

and you can get some very tricky problems that can really mess with your head.0021

It is worth being very careful when you are dealing with Bayes’ rule.0027

And it is also worth following the formula very carefully.0031

If you follow the formula, you would not go wrong but you can also get some very strange stuff, if you are not careful.0035

With that said, let me show you when you use Bayes’ rule.0042

What kind of problem you apply it for.0047

The idea is that your sample space, the set of all possible outcomes must be a disjoint union of events.0050

Let me show you what that would look like.0057

All the possible things that can happen have to be divided up into a disjoint union.0062

You have to be covering all possibilities and they cannot overlap at all.0073

Something like this, where you have an event B1, B2, and so on, up to Bn.0078

You have a disjoint union. A very common example of Bayes’ rule is when you have a group of people0086

and some of them are women and some of them are men.0092

That is a disjoint union, something like that.0096

It could be a common example where you are going to see Bayes’ rule coming into play.0098

After you have a disjoint union, you got one more event that overlaps all of them.0104

You got one more event and I’m going to call it A, like this.0110

This sort of overlaps into these different categories, this is A.0117

The way Bayes’ rule is phrased, or the kind of question that you are going to get0122

for which Bayes’ rule gives you an answer, is that you know that A occurred.0128

You are given that A is true, and then what is the probability that you are in one or the other of these B boxes?0134

That is the kind of problem that you will see Bayes’ rule being applied to.0143

I think after you work through some examples with me, you get the hang of it and0147

you will get a feel for that this is the kind of problem that is asking me for Bayes’ rule.0151

It always says, given that something happened, what is the probability that we are in one of these boxes?0155

Let me give you the actual formula that you can use to solve Bayes’ rule problems and0162

then we will see how it plays out in some examples.0166

Let me start out with Bayes’ rule for two choices and then I will move on to the general rule for N choices.0171

Even the two choices, it looks a little daunting.0177

The idea here is that we got our sample space divided into 2 disjoint events, which I will call B1 and B2.0180

For example, it might be the men and the women in a group of people, B1 and B2.0191

We got this overlapping event that kind of laps over both of them.0198

In one of the problems we are going to study later on, we are going to have tennis players at a college.0202

The setup there is that everybody at the college is either a man or a woman and0208

some of the men play tennis and some of the women play tennis.0212

We will have this event that overlaps into both categories.0217

And then the question is, if you know that that event is true.0221

In the problem later on, if you know that a person plays tennis,0226

what is the probability that you are in one or the other of the categories?0230

I just solved it for B1 here.0234

The probability of B1 given A.0237

You have this fairly complicated formula to calculate it.0241

What it does is it reverses the roles of the A and B1.0245

We see we got B1 given A here and then you reverse that to have A given B1 × the probability of B1.0249

In the bottom, we have the same thing again, probability of A given B1 × the probability of B1.0257

Then you add the same thing with B2, the probability of A given B2 × the probability of B2.0263

That is a fairly complicated formula.0270
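
Written out in symbols, the formula just described reads as follows. This is a sketch in standard notation, where $P(X \mid Y)$ means the probability of $X$ given $Y$, and it assumes that $B_1$ and $B_2$ are disjoint and together cover the whole sample space:

$$
P(B_1 \mid A) = \frac{P(A \mid B_1)\,P(B_1)}{P(A \mid B_1)\,P(B_1) + P(A \mid B_2)\,P(B_2)}
$$

The numerator is the B1 term on its own, and the denominator is that same term plus the matching term for B2.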

After we do some examples, I think you will get more used to it.0273

Bear with me, I’m going to give you the formula for N possible choices.0276

Instead of B1 and B2, we will have B1 through BN.0280

I will give you that formula and then we will jump into some examples.0285

But if you are already feeling overwhelmed, if you want to try an example right away,0288

you might want to go ahead and skip to example 1 because that is one where you have two choices.0292

That is where we will use this formula right away.0298

Let me show you Bayes’ rule for multiple choices.0304

It is the same general rule as Bayes’ rule for two choices but it applies to a more complicated setting here.0306

We have a big sample space and it is divided up as a disjoint union of several different possibilities.0315

We are calling the different possibilities B1, B2, and so on, up to Bn.0326

The idea there is that at some point, you are interested in the probability of one of these choices which I’m going to call B sub J.0336

What you are given is some information about one more event which will overlap all of these choices.0348

There is an A that overlaps all of these choices.0356

You are told that A is true.0360

You are given that A is true, this is conditional probability here.0363

The probability of B sub J given A is equal to, this is the same as the Bayes’ rule for two choices in the numerator.0367

We switched the roles of the B sub J and A.0375

The probability of A given B sub J × the probability of B sub J.0378

And then in the denominator, we have the same kind of thing except we are adding up all those possibilities over all those events.0384

The probability of A given B sub I × the probability of B sub I.0392

You add that up over all the I's from 1 to N.0398
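
In symbols, the general rule described above can be sketched like this, assuming $B_1, \dots, B_n$ are disjoint events that together cover the sample space and that $P(A) > 0$:

$$
P(B_j \mid A) = \frac{P(A \mid B_j)\,P(B_j)}{\sum_{i=1}^{n} P(A \mid B_i)\,P(B_i)}
$$

With $n = 2$ and $j = 1$ this reduces to the two-choice version from the previous slide.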

Let us check that out in the context of some examples.0402

The first one is just only two possibilities but I think it is already fairly challenging.0405

We will get the hang of it, probably after we do several examples together.0411

Example 1, it is a problem having to do with a home test for diabetes.0416

Imagine, you buy this little kit at the pharmacy and you take it home and you test yourself for diabetes.0422

The setup here is that among all the people who take these tests, 20% of them actually are diabetic.0431

Maybe the people who take these have some reason to suspect that they might be diabetic.0437

20% of them actually are diabetic.0442

Now, these tests are not 100% accurate.0444

Sometimes they have false positives and sometimes they have false negatives.0446

The way it works is, if the person taking the test, if you are diabetic then there is a 90% chance that the test will show positive.0451

If you are not diabetic then there is an 80% chance that the test will show negative.0460

It looks like the test is sort of 90% accurate, if you are diabetic.0467

It is 80% accurate, if you are not diabetic.0470

The idea here is that a woman takes the test and it shows positive.0474

It shows positive and then the question is what is the chance that she actually is diabetic?0480

It is a fairly complicated problem.0487

Let me try to graph out the possibilities here, I’m going to label some events.0489

Our disjoint events are the fact that when you take the test, you either are diabetic or not diabetic.0494

I'm going to call those the B1 and B2.0505

B1 is the event that you are diabetic, that you have the disease.0507

B2 is the event that you are not diabetic.0515

What we are given in this scenario is that a woman takes the test and it shows positive.0524

The test came up positive for this particular woman.0531

And that is what I'm going to call the event A, is that the test is positive.0534

Let me draw out how these events all fit together.0545

Because as I said, it is a little complicated.0547

Here is our total sample space and there are two things that can happen.0553

Either the woman is diabetic or not diabetic.0562

That gives us a disjoint union.0565

B1 is diabetic and B2 is that she is not diabetic.0567

She takes this test, we do not know which of the category she is in.0584

She takes the test and she shows up positive.0592

The test registers that she is positive.0597

The question is, given that she shows positive on the test, what is the probability that she is actually diabetic?0600

In other words, we want to find the probability that she is actually diabetic.0609

The probability that B1 is true given that the test shows her as being positive.0618

We want to find the probability of B1 given A.0624

That is a classic Bayes’ rule scenario.0628

Once we phrased it in this form, it is fairly easy to apply the formula.0631

I will do that on the next slide.0636

Let me make sure that you understand the setup, before we move on to the next slide and apply the formula.0638

I think the setup is the trickiest part of this formula.0644

The idea here is we are setting up two events, diabetic and not diabetic.0649

Those give us a disjoint union of all the people in the world.0655

Everybody in the world is either diabetic or not diabetic.0658

At least according to this simplified scenario.0661

We have this event that kind of overlaps both of them.0666

The test is positive, we are given that the woman has taken the test and the test has registered positive.0668

Even though the test has registered positive, she could still be diabetic or not diabetic.0676

I want to find the probability that she actually is diabetic given that the test has registered positive.0683

We are going to calculate that using Bayes’ rule on the next slide.0689

Let me expand this formula on this slide.0693

We want to find the probability that she is diabetic given that the test has registered positive.0697

I'm just going to expand that out using Bayes’ rule for two events.0706

I gave you this formula a couple slides ago.0710

You can check out this formula, if it does not look familiar.0715

It is the probability, we will switch the roles of A and B1.0718

A given B1 × the probability of B1 divided by, the denominator is the same thing,0722

A given B1 × the probability of B1 + the same thing for B2, the probability of A given B2 × the probability of B2.0731

Let us try to fill in all of these probabilities because we can read them off from the stem of the problem.0749

If you remember, first of all, let me remind you of what these are.0756

B1 is the event that somebody is diabetic.0763

B2 is the event that somebody is not diabetic.0769

A is the event that somebody who takes the test shows up positive.0777

And we were told in the stem of the problem, you can go back and check,0783

that 20% of the people who take this test actually are diabetic.0789

That is the probability of B1, that is 20%.0797

I will just write that as 0.2.0799

You can fill that in 0.2.0803

That means 80% of them are not diabetic.0807

I will fill in a 0.8 there.0809

We are also given something about the accuracy of the test.0812

The test is 90% accurate, when a person is diabetic.0818

That means that if the person is diabetic, the probability that the test will be positive is 90%.0824

That tells us the probability of A given B1.0833

Given that the person is diabetic, the probability that the test will show up positive is 90%.0836

I can put that in for A given B1.0841

The last part is a little confusing here because we want to find the probability of A given B2.0847

That is the probability that the test is positive given that the person is not diabetic.0855

What the problem told us was that if the person is not diabetic, the test is accurate 80% of the time.0862

What that means is the test will show negative because that is the accurate result for a non diabetic person.0872

The test will show negative 80% of the time.0879

That means the test will show positive 20% of the time.0883

The probability of A given B2 is 20 percent.0888

That is really the probability that the test is positive given that the subject is not diabetic.0893

Given that the subject is not diabetic, that is the trickiest one there.0908

Remember, we said that if the person is not diabetic, the test is 80% likely to show negative.0913

That means it is 20% likely to show positive.0918

That is filling in all the numbers, that is the hard part.0922

Now, it is just a matter of simplifying all these numbers down.0925

I’m going to write them on the next line, just to get rid of all the jargon.0929

(0.9 × 0.2) / (0.9 × 0.2 + 0.8 × 0.2).0935

I will switch the roles there.0946

I see I got a 0.2 multiplied by everything and I can just cancel out the 0.2 from everything,0947

dividing by 0.2 from the top and bottom.0953

I still see that I got these decimals everywhere.0955

I will multiply by 10/10 and that will just get rid of my decimals.0958

That will give you 9/(9 + 8).0962

That is pretty easy to figure out, it is 9/17.0966

That is our final answer, that is the probability that a woman actually has diabetes if this test shows that she is positive,0972

if the test shows positive for diabetes.0984
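
The arithmetic above can be checked with a few lines of Python. This is a minimal sketch using exact fractions; the variable names are made up for illustration, and the numbers are the ones read off from the problem stem.

```python
from fractions import Fraction

# Numbers read off from the problem statement.
p_diabetic = Fraction(20, 100)              # P(B1)
p_not_diabetic = 1 - p_diabetic             # P(B2) = 0.8
p_pos_given_diabetic = Fraction(90, 100)    # P(A | B1)
p_pos_given_not = 1 - Fraction(80, 100)     # P(A | B2): positive although not diabetic

# Bayes' rule for two choices.
numerator = p_pos_given_diabetic * p_diabetic
denominator = numerator + p_pos_given_not * p_not_diabetic
posterior = numerator / denominator         # P(B1 | A)

print(posterior)          # 9/17
print(float(posterior))   # a little more than one half
```

Using `Fraction` instead of floats keeps the result in the same 9/17 form as the hand calculation.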

That is a little surprising because if you think about it 9/17 is pretty close to ½.0986

We said that this test seems to be accurate, 80% of the time if she is not diabetic.0994

It is accurate 90% of the time if she is diabetic.1000

And yet, when she gets a positive reading, it is only about a ½ chance that she does in fact have diabetes.1003

That is a little surprising and that is really why Bayes’ rule can lead you into some of these very counterintuitive results.1011

Essentially, that is a result of the fact that not many of the subjects in the world are diabetic.1021

It skews the numbers to one side but it is a little surprising because1028

if you think the test is 80% accurate on one side of the ledger and it is 90% accurate on another side of the ledger,1032

how come it turns out to be accurate about ½ the time, when it shows a positive result.1041

That is the way it works out and that is why you really have to trust the math in Bayes’ rule1050

and the intuition is sometimes completely wrong.1055
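
One way to see why the answer lands near ½ is to count people instead of multiplying probabilities. The sketch below assumes a hypothetical group of exactly 100 test takers who split according to the percentages in the problem; the group size is an assumption chosen only to make the counts come out whole.

```python
# Hypothetical group of 100 test takers, split using the percentages in the problem.
total = 100
diabetic = total * 20 // 100                  # 20 people are diabetic
not_diabetic = total - diabetic               # 80 people are not

true_positives = diabetic * 90 // 100         # 18 diabetics test positive
false_positives = not_diabetic * 20 // 100    # 16 non-diabetics test positive

all_positives = true_positives + false_positives   # 34 positive tests in total
share_diabetic = true_positives / all_positives    # fraction of positives who are diabetic

print(true_positives, false_positives, all_positives)   # 18 16 34
print(round(share_diabetic, 3))                          # 0.529
```

The non-diabetic group is four times as large, so even a 20% error rate there produces almost as many positive tests as the diabetic group does.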

Let me just remind you where we got everything here.1061

I just wrote the formula for Bayes’ rule here for two events.1064

When we have a union of two events, that is the formula for Bayes’ rule.1070

I got that from the second slide of this lecture series.1073

You can just play it back to find that formula earlier on in the video.1078

And then I was filling in all these probabilities.1084

B1 and B2 are the probabilities that someone is diabetic and not diabetic.1088

For this population, that is a 0.2 and 0.8.1093

That is why I got those numbers.1096

The 0.9 comes from the probability that the test will be positive given that the person is diabetic.1099

That came from the problem stem.1107

It said that if a person is diabetic then there is a 90% chance that the test will show positive.1109

This 0.2 is the most confusing part, it says that what is the probability of A given B2,1115

the probability that the test shows positive given that the person is not diabetic.1121

That is the probability that the test is inaccurate.1127

We said that if the person is not diabetic, the test has an 80% chance of being accurate.1132

It has a 20% chance of being inaccurate.1138

That is where that 0.2 comes from.1141

Once you drop all the numbers in, it is a pretty simple matter to simplify the fractions down to 9/17,1143

which of course is approximately ½.1151

You get this surprisingly low number, when you look at all the numbers in the original problem.1154

That is Bayes’ rule for two outcomes there.1161

Let us go ahead and try another example.1166

In this example, we are visiting a small college and it has 220 women and 150 men.1169

Tennis seems to be very popular at this college.1176

40% of the women play tennis and 30% of the men play tennis.1178

The idea here is that you find an extra tennis racket.1183

You know it belongs to somebody who left a tennis racket lying around.1186

It must be one of these tennis players.1190

You want to find out what is the chance that this tennis racket belongs to a woman?1193

Let me setup some events here and we will use Bayes’ rule to calculate things out.1199

I'm going to say A is the event that we have a tennis player because that is the event that we are given.1207

We are given that we found a tennis racket.1219

We know somebody plays tennis.1221

B1 and B2 are going to be the sets of women and men respectively.1224

And those give us a disjoint union of all the students at this college.1234

Let me draw a map of all the students at this college.1242

Here are all the students at this college and I guess there are a few more women than men.1248

I will try to draw this somewhat to scale.1253

There is B1 which is women and there is B2, the set of all the men at this college.1258

Overlapping both of those is the set of tennis players at this college.1272

The women and men of course do not overlap each other but the tennis players overlap both1279

because some of the women play tennis and some of the men play tennis.1286

Here is the set of tennis players and what we are given is that we have found the tennis racket.1292

We found something belonging to a tennis player and we have to figure out what is the chance that it belongs to a woman.1299

We are trying to find the probability that we are looking for a woman given that the person we are looking for is a tennis player.1306

That is the probability of B1 given A.1318

We have a nice formula for that, straight from Bayes’ rule.1321

I'm just going to copy the formula directly from the second slide of this lecture.1324

Remember, that you switched the A and B1 and then you multiply that × the probability of B1.1332

You divide that by the same thing P of A given B1 × P of B1.1341

Then, you add on the same thing with P of B2.1349

We want to fill in all those probabilities.1358

All those probabilities had been given to us in the problem.1362

If we fill them in right, this will be fairly fast.1365

The probability of A given B1, that means the probability that somebody plays tennis given that they are a woman.1368

We find that right here, 40% of the women play tennis, that is 0.4.1376

What is the probability that someone is a woman?1382

Here, we have 220 women and 150 men, so there are 370 students total at this college and 220 of them are women.1387

The probability of getting a woman is 22 out of 37.1403

It is actually 220 out of 370 but I’m just simplifying that down.1408

In the denominator here, the first two terms are the same.1412

0.4 × 22 out of 37.1416

Now, the probability of A given B2 is the probability that we have a tennis player given that we have a man.1421

Only 30% of the men at this college play tennis, 0.3.1428

If you have a man, there is a 30% chance that he is a tennis player.1436

The probability that you have a man in the first place is 150/370.1440

I will just go ahead and reduce that to 15/37.1445

That is probably the trickiest part now.1450

We just have to simplify these fractions here.1451

I can multiply by 37/37 and I will simplify things a bit.1457

We get (0.4 × 22) / (0.4 × 22 + 0.3 × 15).1462

I have canceled all my denominators of 37.1479

And this is not one I think that simplifies in any terribly nice way.1482

I just went ahead and calculated it out to a decimal.1488

What I calculated was 0.662 is my probability that this tennis racket that I found, it belongs to a woman.1493

About a 66% chance that when we go to the lost and found, that will be a woman coming to claim that tennis racket.1507
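
Here is a short Python sketch of the same calculation, using the 220 women and 150 men that make up the 370-student total. The variable names are illustrative assumptions; only the numbers come from the problem.

```python
from fractions import Fraction

women, men = 220, 150
total = women + men                        # 370 students

p_woman = Fraction(women, total)           # P(B1) = 22/37
p_man = Fraction(men, total)               # P(B2) = 15/37
p_tennis_given_woman = Fraction(4, 10)     # P(A | B1)
p_tennis_given_man = Fraction(3, 10)       # P(A | B2)

# Bayes' rule for two choices.
numerator = p_tennis_given_woman * p_woman
denominator = numerator + p_tennis_given_man * p_man
posterior = numerator / denominator        # P(B1 | A)

print(posterior)                    # 88/133
print(round(float(posterior), 3))   # 0.662
```

The fraction 88/133 has a direct reading: 40% of 220 is 88 women tennis players, 30% of 150 is 45 men tennis players, so 88 of the 133 tennis players are women.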

Let me remind you the setup here.1517

I think the hardest part about Bayes’ rule problems is kinda setting everything up, identifying the events,1520

and getting yourself ready to go with that formula.1526

Once you get the formula, you fill in all the probabilities and then you reduce the fractions.1530

The setup here is to, first of all divide our world into women and men.1536

We got two disjoint events that cover the whole world.1541

We got women here and men.1544

And then, we have this third event, the group of tennis players.1547

This overlaps into women and men.1555

There are some women tennis players and there are some men tennis players.1557

We set up our events here and the problem is to tell us given that we have a tennis player,1561

because we found a tennis racket and it must belong to a tennis player.1568

Given that we have a tennis player, what is the probability that it belongs to a woman?1571

What is the probability of B1 given that we are in A?1576

Once you identify it that way, it is fairly easy after that to expand it out using Bayes’ rule1580

because now you are just plugging into a formula.1587

That expands out into this formula, I just copied that straight off the second slide of this lecture.1590

Then, I filled in all the probabilities.1598

The probability of A given B1 means that if you are a woman, what is the probability that you play tennis?1601

That comes from this 40% right here.1607

That is where that 0.4 came from and that 0.4.1610

Then, probability that a man plays tennis is 0.3.1614

Just the overall probabilities of B1 and B2, those come from the total number of women and men1620

at the college, not worrying about tennis right now: 220/370 for women and 150/370 for men.1628

I just simplified down the calculations and I came up with the decimal which was equivalent1638

to about 66% chance that this tennis racket must belong to a woman.1644

In example 3 here, we got the state of New Alkenya.1653

We are looking at the voting trends in this state.1659

Apparently, 40% of the voters there are democrats and 50% are republicans, and the leftover 10% are all freemasons.1662

This new party of freemasons.1670

We are surveying how many people support freeway lanes for pogo sticks?1673

How many support giving us a dedicated lane on the freeway for pogo sticks?1680

20% of the democrats are in favor of that idea.1684

30% of republicans think that is a good idea.1687

50% of the freemasons are in favor of that.1690

Here is what we found when we did a telephone survey.1694

We grab a voter at random.1698

It turns out that this voter is strongly in favor of the freeway lanes for the pogo sticks.1699

The question is what is the probability that this particular voter is a democrat?1704

Let me setup some events here.1710

We are going to divide the world now into democrats and republicans, and freemasons.1714

Those are going to be, my B1 and B2 and B3.1722

B1, I will call it BD is the set of democrats.1730

B2 will be the set of republicans, I will call that BR, republicans.1742

B3 will be the set of freemasons, I will call that B sub F.1749

The event that overlaps into all three of those categories is that a voter, chosen at random,1759

likes the idea of dedicated freeway lanes for pogo sticks.1769

Event A would be a pogo stick supporter there.1776

Let me indicate how these events all fit together.1781

Let me draw you a map of all of these voters.1786

There is no overlap between democrats and republicans and freemasons.1790

Those really are three completely separate categories there.1796

I put the democrats on the left.1803

It is appropriate to their political persuasion.1807

There is B, the democrats.1810

I will put the republicans over here, the republicans.1813

And then we have the freemasons.1820

Overlapping all three of those is the set of people who support freeway lanes for pogo sticks.1823

There is the set of people who like freeway lanes for pogo sticks and some democrats like that idea and1836

some republicans think it is a good idea and some freemasons think it is a good idea.1842

What we are really trying to calculate here is the probability that a random voter is a democrat.1847

The probability of BD, but we are given that that person supports the freeway lanes for the pogo sticks.1856

Given that A is true, that is what we want to calculate.1865

I want to expand that out on the next page.1872

We are going to use our formula for Bayes’ rule, but1875

on the next page it is just expanding a formula and then doing some arithmetic.1883

The setup is identifying the three events here, democrats, republicans, and freemasons.1890

Those give us a disjoint union of the space of all voters.1894

And then, we have this one overlapping event which is that a voter could support freeway lanes for pogo sticks.1898

That one event which we are calling A, overlaps all three of those.1906

We are given that we found the voter who does support those lanes.1912

The question is, what is the probability that our voter is a democrat?1916

We are going to expand this out on the next page using the Bayes’ rule for multiple events.1921

Let me remind you what we are trying to calculate.1933

We are trying to calculate the probability that a voter is a democrat given that she supports freeway lanes for pogo sticks.1936

I'm going to write-down the Bayes’ rule formula which was that, let me write that generically over here in the corner.1946

The probability of B sub J given A is, in the numerator, you switch them.1954

Probability of A given B sub J × the probability of B sub J.1963

I’m just copying this from that third slide back in the game.1968

You can see where this formula comes from.1973

Divided by the sum, as I goes from 1 to N, of the same thing, the probability of A given B sub I × the probability of B sub I.1975

We are going to expand that out.1997

Here, we have three events.1998

We have someone being a democrat, we have someone being a republican,2000

and the probability of someone being a freemason.2007

And then of course, we have the pogo sticks overlapping all of them, that is the A.2011

A is the pogo stick crowd.2016

But according to our formula here, in the numerator, we have the probability and2019

we switched them A given B sub D × the probability of B sub D.2030

The D is like the J in our formula before, B sub D.2038

Our denominator is going to be quite long because we are adding up three different things.2043

The probability of A given B sub D × the probability of B sub D + the probability of A given B sub R ×2047

the probability of B sub R + the probability of A given B sub F × the probability of B sub F.2064
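
Written in symbols, the expansion just read out looks like this. It is a sketch using the lecture's labels, with $B_D$, $B_R$, $B_F$ for democrats, republicans, and freemasons, and $A$ for supporting the pogo stick lanes:

$$
P(B_D \mid A) = \frac{P(A \mid B_D)\,P(B_D)}{P(A \mid B_D)\,P(B_D) + P(A \mid B_R)\,P(B_R) + P(A \mid B_F)\,P(B_F)}
$$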

That is probably the hard work done now.2078

We are just going to fill in all of these probabilities.2081

The probability of A given B sub D, that means if somebody is a democrat, what is the probability that they support the pogo sticks.2086

From the original stem of the problem, we were given that if somebody is a democrat2098

then there is a 20% chance, I will put 0.20 that they support the lanes.2108

The probability of being a democrat in the first place is 40%, 0.4 is the probability of B sub D.2114

The first set of terms on the denominator is just the same, 0.2 × 0.4.2123

Now, the probability that a republican supports these freeway lanes is, let us see.2130

The survey said that 30% of the republicans support these pogo sticks.2138

The probability of being a republican is 0.5, in the first place.2144

That is because 50% of the voters are republicans, in the first place.2149

The probability that a freemason supports the lanes is 0.5.2154

Given that you are a freemason, there is a 50% chance that you support the lanes.2158

The probability that you are a freemason in the first place is 0.1 because 10% of the voters are freemasons.2162

It is a fairly quick process to simplify this.2171

In fact, I think I see I have a lot of decimals.2177

You know what I'm going to do is multiply top and bottom by 100 and that will get rid of all my decimals for me,2180

because I have decimals everywhere here.2188

I have 2 × 4, that is not 2.4 that is 2 × 4.2190

2 × 4 + 3 × 5 + 5 × 1.2195

That is going to simplify pretty easily now.2204

That gives me 8/(8 + 15 + 5) which is 8/28, which is 2/7.2205

That is our probability that a random pogo stick supporting voter is a democrat.2220
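
The same calculation can be sketched in Python for any number of categories. The function name `bayes_posterior` and the dictionary layout are assumptions made for this illustration; the numbers are the ones used in the calculation above (20%, 30%, and 50% support within parties that make up 40%, 50%, and 10% of voters).

```python
from fractions import Fraction


def bayes_posterior(priors, likelihoods, target):
    """Return P(target | A) given P(B_i) in priors and P(A | B_i) in likelihoods."""
    # One term P(A | B_i) * P(B_i) per category; their sum is the denominator.
    joint = {name: likelihoods[name] * priors[name] for name in priors}
    return joint[target] / sum(joint.values())


# P(B_i): share of voters in each party.
priors = {
    "democrat": Fraction(40, 100),
    "republican": Fraction(50, 100),
    "freemason": Fraction(10, 100),
}
# P(A | B_i): share of each party that supports the pogo stick lanes.
likelihoods = {
    "democrat": Fraction(20, 100),
    "republican": Fraction(30, 100),
    "freemason": Fraction(50, 100),
}

posterior = bayes_posterior(priors, likelihoods, "democrat")
print(posterior)   # 2/7
```

Changing `target` to `"republican"` or `"freemason"` reuses the same denominator, which is why the three answers always add up to 1.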

Let me remind you how we got to each of the steps there.2230

First of all, I was recalling the Bayes’ rule formula for multiple events.2234

This formula is copied straight off the third slide of this lecture, if you go back and check, you will find it.2238

That is the Bayes’ rule formula for multiple events.2245

In this case, my disjoint events are being a democrat, being a republican, and being a freemason.2248

The one that overlaps them all is supporting these pogo sticks.2256

That was our A here.2259

The question is what is the chance that you are a democrat given that you support these pogo sticks?2261

According to Bayes’ rule, we can expand that out and we sort of switch the A and B sub D here × the probability of B sub D.2267

We have this big sum in the denominator.2280

A given B sub D × B sub D, A given B sub R × B sub R, and A given B sub F × B sub F.2283

If we plug in all the probabilities, I got all of these probabilities from the stem of the problem.2293

It tells me, first of all that 40% of people are democrats, 50% of people are republicans and 10% of people are freemasons.2298

It also told me that given that you are a democrat, there is a 20% chance that you support the lanes, the pogo sticks.2309

Given that you are a republican, there is a 30% chance that you support them.2317

Given that you are a freemason, there is a 50% chance that you support them.2321

And then, I just did a little arithmetic.2326

I did not like these decimals so I multiplied top and bottom by 100, which gave me a 10 here to turn 0.2 into a 2.2328

A 10 to spare here to turn that 0.4 into a 4.2336

I got 2 × 4, 2 × 4, 3 × 5, 5 × 1.2339

That simplified down nicely into the fraction 2/7.2343

Example 4 is a real classic of probability.2354

This is a famous paradox in probability.2357

It is one that a lot of people get very confused about.2361

It is going to be fun to work through it.2365

It is a Bayes’ rule problem, the idea is that we have three bags.2368

One bag has 2 apples, one has 2 oranges, and one has an apple and an orange.2372

Of course, these bags are opaque, you cannot see into the bags and you cannot tell which fruit are in which bag.2378

You pick one bag at random and you draw out one fruit.2384

You look at the fruit and it turns out to be an apple.2388

The question is, what is the chance that the other fruit in that same bag is also an apple?2391

In other words, what is the chance that you got the bag with two apples?2398

A lot of people think that the answer to this problem is ½ because you are obviously in the 2 apple bag or the apple-orange bag.2401

You think that it has got to be one or the other, so it must be ½.2410

It turns out that that answer is not correct.2413

This problem is a lot more subtle than that.2416

What we want to do is analyze this using Bayes’ rule.2421

Let me set up some events here.2425

My three events are going to be the three bags.2428

My B1 will be the apple-apple bag.2432

That represents the bag with 2 apples.2436

My B2 will be the apple-orange bag.2439

My B3 will be the bag with 2 oranges.2444

My overlapping event, the event that I'm given is true, is the event A, which is that we draw an apple.2449

We are given that we drew an apple and then the question is what is the probability that we are in the 2 apple bag?2461

Let me just make a diagram of how everything here fits together.2469

We got these three bags here.2479

The three bags do not overlap each other.2484

There is the apple-apple bag.2487

There is the apple-orange bag.2490

There is the orange-orange bag.2493

We have this other event that overlaps them all.2498

Actually, this is not quite drawn accurately because this is the event that we drew an apple.2504

If we drew an apple then we could not have been drawing from the orange-orange bag.2514

This region right here actually does not exist.2519

It does not overlap into the orange-orange category here.2524

But that is okay, the calculations that we do will take that into account.2528

We do not really need to worry about that.2533

We do not need to make a special case for that.2535

That will be built into the calculations later on.2537

What we are really trying to find out is the probability that assuming that we drew an apple,2540

what is the chance that the other fruit in the same bag is an apple?2547

That means that we must have been drawing from the bag that has 2 apples.2551

What we are asking is, what is the chance, let me make that a little more clear.2559

The probability that we are in the apple-apple bag given that we drew an apple as the first fruit.2564

That is what we are trying to calculate.2573

I'm going to expand this out using Bayes’ rule on the next slide.2576

Let me just remind you what these events are.2583

We got 2 apples, we got an apple-orange bag, we got a 2 orange bag.2587

Those fit together as a disjoint union to cover all the possible outcomes.2602

We know we are going to draw one fruit.2607

It is going to come from one of the bags and bags do not overlap.2609

There is this event that we are given that we draw an apple.2613

That does overlap into the different categories.2617

And we are given that that is true and we want to find the probability that2620

we are in the apple-apple bag given that we drew an apple.2624

Let us calculate that out.2630

I’m going to use the Bayes’ rule formula for multiple possibilities.2632

That is the formula that I gave you on the third slide of this lecture.2638

You can go back in the video and you can find that.2643

I will remind you what the formula is now.2646

The probability of B sub J given A is equal to, you flip the probability of A given B sub J ×2648

the probability of B sub J divided by the sum from I equals 1 to N of the probability of A given B sub I2659

× the probability of B sub I.2681

And that is the formula we are going to be using.2684
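
Written out in symbols, the formula just read aloud looks like this. The notation is a sketch of the spoken version: the events B sub 1 through B sub N are the disjoint events and A is the overlapping event.

```latex
P(B_j \mid A) = \frac{P(A \mid B_j)\, P(B_j)}{\sum_{i=1}^{n} P(A \mid B_i)\, P(B_i)}
```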

What we are trying to calculate in this particular problem is the probability that2687

we are in the apple-apple bag given that we drew an apple initially.2694

I'm going to expand that out using the Bayes’ formula for multiple events.2700

According to the formula, you switch and you have the probability of the apple given B sub AA ×2709

the probability of B sub AA.2717

The denominator is going to be a little long and messy because we have three different events we need to calculate here.2722

The probability of the apple given B sub AA × the probability of B sub AA +2730

the probability of the apple given B sub AO, that was our second event there.2739

× the probability of B sub AO.2746

And one more event here is the probability of the apple given B sub OO × the probability of B sub OO.2749

I can fill in a lot of these probabilities.2771

Remember, we are going to pick one of these bags at random.2775

Let me fill in those probabilities first.2778

The probability that we are in the 2 apple bag is 1/3.2780

The probability here that we are in the 2 apple bag is 1/3.2783

The probability that we are in the apple-orange bag is 1/3.2787

The probability that we are in the orange-orange bag is 1/3.2789

We are calculating those before we know that we drew out an apple.2792

That is why those are each 1/3 because we do not yet have any information that we drew out an apple.2796

But now, let us calculate these other conditional probabilities.2802

If we are in the apple-apple bag, what is the probability that we would get an apple?2806

If we know we are in the bag with 2 apples, then the probability that we get an apple is 100% or 1.2811

The probability of apple given 2 apples is 1.2818

The probability of an apple given that we are in the apple-orange bag.2822

For the apple-orange bag, what is the probability of getting apple?2828

There are 2 fruits in there, one of them is an apple and one of them is an orange.2832

It is a ½.2836

The probability of getting an apple given that we are in the orange-orange bag is 0.2838

Because if you are in the orange-orange bag, there is no way you can get an apple.2844

When you combine all these numbers together, I think that is the hard part already done now.2849

You just have to do a little arithmetic.2853

We get 1 × 1/3 in the top.2855

In the bottom, we get 1 × 1/3 + ½ × 1/3 + 0.2859

That 0 drops right out.2869

Notice that I have got a 1/3 everywhere.2872

If I multiply the top and bottom by 3, then I will get 1/(1 + ½), which is 1/(3/2).2874

If I flip that around, I get 2/3.2885

That is a little bit surprising because you remember, when we first looked at that problem,2891

we said if we got an apple, we know we must be in the apple-apple bag or we must be in the apple-orange bag.2898

You would think at that point, it is a 50-50 chance that you are in the 2 apple bag or the apple-orange bag.2903

But in fact, that is not true.2912

Essentially, if you got an apple it is telling you that that is more likely to have occurred2914

by pulling it out of the apple-apple bag.2920

Bayes’ rule kind of gets you out of a very sticky and counterintuitive situation.2924

Intuition might tell you that there is a 50-50 chance but the calculations actually tell you, and they are correct,2930

that there is a 2/3 chance that you were in the apple-apple bag.2937

There is a 2/3 chance that the other fruit in the bag is indeed an apple.2941
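
One way to see why the answer is 2/3 rather than ½ is to count individual fruits instead of bags. The sketch below is an illustration and not something from the lecture: it assumes each of the six (bag, fruit) draws is equally likely, which follows from picking a bag at random and then a fruit at random from that bag. The names `bags`, `draws`, and so on are made up for the example.

```python
from fractions import Fraction

# The three bags from the problem; each holds exactly two fruits.
bags = {
    "AA": ["apple", "apple"],
    "AO": ["apple", "orange"],
    "OO": ["orange", "orange"],
}

# Every (bag, position) pair is one equally likely draw: 1/3 for the bag times 1/2 for the fruit.
draws = [(name, i) for name, fruits in bags.items() for i in range(len(fruits))]

# Keep only the draws where the fruit we pulled out is an apple (the given event A).
apple_draws = [(name, i) for name, i in draws if bags[name][i] == "apple"]

# Among those, count the draws where the other fruit in the same bag is also an apple.
other_is_apple = [(name, i) for name, i in apple_draws if bags[name][1 - i] == "apple"]

prob_other_is_apple = Fraction(len(other_is_apple), len(apple_draws))
print(prob_other_is_apple)  # 2/3
```

There are three ways to draw an apple, and two of them come from the apple-apple bag, which matches the 2/3 from the formula.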

Let me remind you how that went.2947

We were calculating from the previous page that we are assuming that we have an apple and2950

we want to find the probability that we are in an apple-apple bag.2957

I use the Bayes’ rule formula for multiple events.2963

I got that from one of the earlier slides in the lecture.2967

It says that the probability of any particular event given this overlapping event is the probability,2971

the conditional probability we kind of switch them.2980

A given B sub J × the probability of B sub J, and then we move to the denominator.2982

You add up all the events, A given B sub I × the probability of B sub I.2988

In this case, our events were the B sub 1 was the apple-apple bag.2994

The B sub 2 was the apple-orange bag.3002

B sub 3 was the orange-orange bag.3007

That is how I expanded out the denominator there.3011

The apple-apple, apple-orange, and orange-orange.3015

The numerator is the one we are interested in, that is the apple-apple.3020

We expand all that big formula out, then we fill in all the probabilities.3025

The probability of each bag without knowing that we have an apple is just 1/3.3031

That is where I got all of those numbers being 1/3.3036

And then the probabilities of getting an apple out of each bag are a little trickier.3040

If you are in the apple-apple bag, you are guaranteed to get an apple.3045

That is a 1.3049

That is where that 1 comes from as well.3052

If you are in the apple-orange bag, there is a ½ chance that you get an apple.3054

If you are in the orange-orange bag, there is a 0 chance that you get an apple.3058

That is where that 0 comes from.3062

It is just a matter of collecting all the fractions together and doing a little arithmetic to simplify it down to 2/3.3064

In our last example for Bayes’ rule here, we got a family that has 3 children and we are given that there are no twins.3078

We are given that there are at least 2 girls and we want to find the probability that the oldest child of those 3 children is a girl.3086

Again, let me setup some events here.3096

We are asked the probability that the oldest child is a girl.3101

Let me set up an event B1 is the event that the oldest child is a girl.3104

And that oldest child could also be a boy.3118

I will call that B2 is a boy.3124

Overlapping both of those is the event that we are given.3129

We are given that there are at least 2 girls.3133

My event A will be that there are at least 2 girls.3136

Let me draw a map of this experiment.3144

This B1 and B2 are disjoint events because that oldest child is either a boy or a girl.3155

There is B1 and there is B2.3164

But overlapping both of those is the possibility that at least 2 of them could be a girl.3168

That is my A.3176

And we are given that A is true and we want to find the probability that the oldest child is a girl.3180

We want to find the probability of B1 given that A is true, given that at least 2 of the children are girls.3187

We have a formula for that, since we have two events here.3201

This is really using Bayes’ rule for two events.3203

Let me expand out that formula.3206

According to the second slide of this lecture, you can just play it back and find it, if you do not remember it.3209

It is P of A given B1.3215

I will switch those around, × P of B1.3220

In the denominator is that same expression.3225

Let me copy that again.3228

P of A given B1 × P of B1 + the same thing with B2.3230

A given B2 × the probability of B2.3240
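
In symbols, the two-event expansion just described is the following. This is simply the spoken formula written out, with B sub 1 and B sub 2 as the two disjoint events and A as the overlapping event.

```latex
P(B_1 \mid A) = \frac{P(A \mid B_1)\, P(B_1)}{P(A \mid B_1)\, P(B_1) + P(A \mid B_2)\, P(B_2)}
```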

We have to figure out all those probabilities and drop them into the formula and calculate it out.3248

The probability of B1 is fairly easy.3254

That is the probability of the oldest child is a girl, that is ½.3257

We can fill that in.3262

The probability that the oldest child is a boy is also ½.3263

That is at least if you have no extra information about that.3267

The probability of A given B1 and A given B2 is a little trickier.3273

We calculate those down here.3280

The probability of A given B1 that means that we are given that the oldest child is a girl.3283

We are given that the first child is a girl.3292

We want to figure out the probability that at least 2 children are girls.3299

The probability that at least 2 children are girls, if the first child is a girl means the probability that3304

at least 1 of the younger 2 children is a girl.3313

We got 2 younger children here.3330

The 2 younger children could be a girl – boy, girl – girl, it could be boy – girl.3332

It could be boy – boy.3342

At least one of them being a girl, there are 3 of those 4 situations where at least 1 of them is a girl, so that is ¾.3347

The probability of A given B2, we need to fill that in.3357

That is the probability of at least 2 girls given that the oldest child is a boy.3363

If the oldest child is a boy, then to get at least 2 girls, you got to have both of the younger 2 being girls.3368

Both of younger 2 must be girls.3378

If you look at the same listing above, there is 1 out of 4 situations3395

that gives you both of the younger 2 being girls, so that is 1 out of 4.3400

I can fill in these numbers here.3404

The probability of A given B1 was ¾.3407

The probability of A given B2 is 1 out of 4.3411

This other A given B1 was also ¾, just like below.3414

I can just write this in terms of numbers now.3419

(¾ × ½)/(¾ × ½ + ¼ × ½).3423

I see I got a ½ everywhere.3434

I can erase those if I multiply top and bottom by 2.3436

I notice that I have got fourths everywhere.3440

If I multiply top and bottom by 4 as well, that will cancel all my fourths.3445

In the top, that will just leave me with a 3.3449

In the bottom, that will leave me with 3 + 1.3452

I have gotten rid of all my denominators there.3456

I get 3 out of 4 and that is my final probability.3459

If I know that there are at least 2 girls in the family, the probability that the oldest child is a girl is ¾.3465
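
This answer can also be checked by listing every possible family. The sketch below is an illustration, not part of the lecture, and it assumes what the calculation above already assumes: each child is equally likely to be a girl or a boy, independently of the others. The variable names are made up for the example.

```python
from fractions import Fraction
from itertools import product

# All 8 equally likely families, written oldest child first, e.g. ("G", "B", "G").
families = list(product("GB", repeat=3))

# The given event A: at least 2 of the 3 children are girls.
at_least_two_girls = [f for f in families if f.count("G") >= 2]

# Within A, the families where the oldest child (position 0) is a girl.
oldest_is_girl = [f for f in at_least_two_girls if f[0] == "G"]

prob_oldest_is_girl = Fraction(len(oldest_is_girl), len(at_least_two_girls))
print(prob_oldest_is_girl)  # 3/4
```

Four families have at least 2 girls (GGG, GGB, GBG, BGG), and three of those have a girl as the oldest child, which agrees with the ¾ from the formula.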

Let me recap that problem here.3473

We are given that there are at least 2 girls.3476

I will set that up as my event A.3478

We want to find the probability that the oldest child is a girl.3481

I set up one event for that, the oldest child being a girl.3484

I see that I called that B1, that should have been B2.3488

Let me change that.3493

I will get some good comments saying you got the wrong B1 there.3495

I have made that a B2.3502

B1 is that the oldest child is a girl.3505

B2 is that the oldest child is a boy.3508

That divides my world into two disjoint events.3511

The event of there being at least 2 girls overlaps both of them.3520

We want to find the probability that the oldest child is a girl given that there are at least 2 girls.3525

That is P of B1 given A and then using my Bayes’ formula for two events,3533

this is copied off the second slide of the lecture.3540

You can go back and just look up this formula.3544

That expands into A given B1 × B1 on the top, and A given B1 × B1 plus A given B2 × B2 on the bottom.3546

I want to fill in those probabilities and that is probably the trickiest part here.3554

The probability of B1 is the probability that the oldest child is a girl.3558

No problem, that is ½.3562

The probability that the oldest child is a boy, that is also ½.3564

A given B1, that is a little trickier.3568

That is what I was working out here.3571

That says, what is the probability of getting at least 2 girls given that the oldest child is a girl?3573

If the oldest child is guaranteed to be a girl, to get at least 2 girls, you have to get at least 1 girl in the next 2 children.3581

At least 1 of the younger 2 children must be a girl.3592

And then I wrote out the possibilities for the younger 2 children here.3595

The other 2 children could be girl – boy, girl – girl, boy – girl, or boy – boy.3604

How many of those have at least one of them being a girl?3611

There are 3 of those possibilities, that is 3 out of 4.3614

That is where this ¾ came from and that is where this ¾ came from.3618

A given B2, this one right here, means the probability of getting at least 2 girls given that the oldest child is a boy.3623

If you want 2 girls and the oldest child is a boy, then both of the younger 2 must be girls, in order to get 2 girls.3638

There is only one way to get that, that is right there.3646

That is why we have a 1 out of 4 chance.3649

That is where that ¼ comes from.3652

We have put all the fractions in.3656

It is just a simple matter of canceling all the denominators: multiplying by 2/2 to get rid of all the halves,3658

and multiplying by 4/4 to get rid of the fourths, which turns it into 3/(3 + 1), and that simplifies down to ¾.3665

If we know that there are at least 2 girls, the probability that the oldest child is a girl is now ¾.3673

That wraps up the lecture and the examples on Bayes’ rule.3681

It is a pretty complicated rule and it is very counterintuitive.3685

I strongly recommend using the formula for Bayes’ rule even when it leads you to answers that might counteract your intuition.3689

Certainly, some of these answers seem kind of surprising to me.3696

The formula is guaranteed to be true.3700

When you set things up right, you set up your disjoint events, and you setup your overlapping event3704