WEBVTT mathematics/statistics/son
00:00:00.000 --> 00:00:01.600
Welcome to www.educator.com.
00:00:01.600 --> 00:00:04.100
Today we are going to be talking about the binomial distribution.
00:00:04.100 --> 00:00:18.700
So far we have been talking about discrete probability distributions, these models of what the probability space of all the different outcomes looks like.
00:00:18.700 --> 00:00:28.700
Now the binomial distribution is a special case of these discrete probability distributions, and binomial refers to two outcomes.
00:00:28.700 --> 00:00:39.300
We are going to be talking about this special case because it actually ends up being one of the most frequent probability distributions that you end up using.
00:00:39.300 --> 00:00:51.800
In binomial distributions we are particularly interested in 1 out of 2 outcomes, like heads or tails, or black or white.
00:00:51.800 --> 00:00:58.900
You know, 1 out of 2 things, and how many successes, which we call k, the number of successes in n trials.
00:00:58.900 --> 00:01:02.200
Something like how many heads out of 10 tosses.
00:01:02.200 --> 00:01:17.200
Then we will do a quick review of the multiplicative rule because that is going to be important for dealing with discrete probability distributions in a binomial way.
00:01:17.200 --> 00:01:25.200
How many outcomes have k successes? There is going to be a formula that you can use to figure this out really quickly.
00:01:25.200 --> 00:01:29.700
Then we are going to talk about probabilities in a binomial distribution.
00:01:29.700 --> 00:01:33.300
How do we actually get the spread of all those probabilities?
00:01:33.300 --> 00:01:42.000
In such a distribution, how do we find the expected value, and how do we find the standard deviation?
00:01:42.000 --> 00:01:53.400
Discrete probability distributions, as you know, deal with outcomes that are nameable and countable.
00:01:53.400 --> 00:01:58.300
There are not an infinite number of possible outcomes.
00:01:58.300 --> 00:02:02.100
There is a discrete, finite number of them.
00:02:02.100 --> 00:02:13.700
The sample space is all the outcomes, just flat out all the outcomes, but the probability distribution is taking a sample space and also finding
00:02:13.700 --> 00:02:27.500
the corresponding probabilities. One of the most frequent probability distributions you will run across is the binomial distribution.
00:02:27.500 --> 00:02:35.100
That is why it is really good to be familiar with this particular one and all of its quirks and foibles.
00:02:35.100 --> 00:02:40.200
Let us talk about the binomial distribution.
00:02:40.200 --> 00:02:48.200
Many random variables are really counting the number of successes in n independent trials.
00:02:48.200 --> 00:02:54.100
By successes, we do not necessarily mean that it has to be a winning trial or anything.
00:02:54.100 --> 00:03:00.200
It just means whatever you are interested in. Whatever outcome out of two outcomes you are interested in,
00:03:00.200 --> 00:03:08.200
a lot of times the random variable is counting how many times that interesting event happened in a number of trials.
00:03:08.200 --> 00:03:17.100
Let us say we have 20 independent trials. We could have 0 successes, 1, 2, 3, 4, 5, all the way up to 20.
00:03:17.100 --> 00:03:23.600
The number of successes will be our k value, and k can range from 0 to 20.
00:03:23.600 --> 00:03:33.100
Just to give you some examples, it might be counting the number of heads in a random sample of 10 flips of a coin.
00:03:33.100 --> 00:03:38.900
Here the random variable X is the number of heads.
00:03:38.900 --> 00:03:48.500
Another example is counting the number of children who have been diagnosed as autistic in a random sample of 1000 children.
00:03:48.500 --> 00:03:59.000
In that case, the random variable X is the number of children diagnosed with autism.
00:03:59.000 --> 00:04:17.300
Another example is counting the number of defective items in a sample of 20 items. Notice that when you think of the word defective,
00:04:17.300 --> 00:04:23.300
you do not really think of that as a success, but what we are really doing is counting some outcome of interest across n trials.
00:04:23.300 --> 00:04:27.100
In this case n is 20.
00:04:27.100 --> 00:04:46.000
Here X equals the number of defective items, and just to round it out, let us talk about what n is in each example.
00:04:46.000 --> 00:04:58.300
n in this case is 10, n in this case is 1000, and n in this case is 20.
00:04:58.300 --> 00:05:04.400
Let us think about why these are called binomial situations.
00:05:04.400 --> 00:05:15.000
In each of these situations you either have a success, the event of interest, or you have a failure.
00:05:15.000 --> 00:05:17.700
A failure is simply not the event of interest.
00:05:17.700 --> 00:05:30.000
Here what we would see is in all of these different situations there are two outcomes that you have.
00:05:30.000 --> 00:05:37.800
You can either have a head or tail, and they both have some probability and those probabilities add up to 1.
00:05:37.800 --> 00:05:41.200
It has to because you only have those two choices.
00:05:41.200 --> 00:05:47.900
Here there is the probability of being diagnosed with autism and the probability of not getting a diagnosis.
00:05:47.900 --> 00:05:52.500
There are only those two outcomes.
00:05:52.500 --> 00:05:57.200
Here it is either being defective or not being defective.
00:05:57.200 --> 00:06:00.400
It is one of those two outcomes.
00:06:00.400 --> 00:06:13.500
These are binomial situations because there are 2 outcomes that are disjoint, so it is one or the other.
00:06:13.500 --> 00:06:34.100
If you add the probabilities of those outcomes, the probability of outcome 1 plus the probability of not outcome 1, then you should get 1.
00:06:34.100 --> 00:06:49.100
Another way to put it is the probability of one outcome is equal to 1 - the probability of not that outcome.
00:06:49.100 --> 00:06:53.300
That is the other way you can think about this.
00:06:53.300 --> 00:06:59.100
Let us briefly review the multiplicative rule.
00:06:59.100 --> 00:07:10.000
Remember, when you had to think about things like this where the proportion of adults in the US with at least a bachelor's degree is 29%.
00:07:10.000 --> 00:07:13.000
Suppose you picked 4 adults at random; what is the probability
00:07:13.000 --> 00:07:19.000
that exactly 2 have a bachelor's degree? One of the things that we did in order to find these probabilities
00:07:19.000 --> 00:07:30.800
is we imagined having slots for these different adults, and for each slot you either have the probability of having the bachelor's degree.
00:07:30.800 --> 00:07:49.000
Let us say these 2 get the .29. Or you have the probability of not having the bachelor's degree, the other outcome, which is the flip side, 1 - .29, so that would be .71.
00:07:49.000 --> 00:08:00.600
This is one combination: bachelor, bachelor, no bachelor, no bachelor.
00:08:00.600 --> 00:08:21.000
Then there are other combinations of exactly 2 having a bachelor's degree, and so you have to find those other combinations as well.
00:08:21.000 --> 00:08:36.300
Just to recap the multiplicative rule: in order to find the probability of this particular outcome, we would have to multiply these probabilities together.
00:08:36.300 --> 00:08:38.400
I want you to notice something.
00:08:38.400 --> 00:08:43.900
We are not actually going to do this problem, since we have done it before, but I want you to notice that, let us say we wanted
00:08:43.900 --> 00:08:57.000
to find the probability of getting this particular outcome. Although the order will change, in multiplication it does not really matter what order it is.
00:08:57.000 --> 00:09:12.400
The probability of this outcome is exactly equal to the probability of this outcome.
00:09:12.400 --> 00:09:23.400
That is going to be important for us to keep in mind, and I would just like you to realize that we are multiplying these probabilities together.
00:09:23.400 --> 00:09:34.800
Just to remind ourselves a little bit more about the multiplicative rule: it is much more likely that if you pick random adults,
00:09:34.800 --> 00:09:38.800
they would not have a bachelor's degree.
00:09:38.800 --> 00:09:52.300
Which of these combinations is more likely: all 4 people having bachelor's degrees or none of them having bachelor's degrees?
00:09:52.300 --> 00:09:59.700
If we just think about the fact that only 29% of adults in the US have at least a bachelor's degree,
00:09:59.700 --> 00:10:17.900
you are going to know that the combination of all 4 having a bachelor's degree is much less likely than all of them not having a bachelor's degree.
00:10:17.900 --> 00:10:22.400
That makes sense here, and these probabilities bear witness to that.
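The multiplicative rule recap can be checked with a few lines of Python. This is only a sketch: the variable names are made up for illustration, and the only number taken from the lecture is the 29% figure.

```python
# Multiplicative rule for 4 independent picks (p = 0.29 is the lecture's figure).
p_bachelor = 0.29
p_none = 1 - p_bachelor  # 0.71, the "flip side"

# Two different orderings that both have exactly 2 bachelor's degrees.
bbnn = p_bachelor * p_bachelor * p_none * p_none
nnbb = p_none * p_none * p_bachelor * p_bachelor

# All 4 adults have a degree versus none of the 4 have a degree.
all_four = p_bachelor ** 4
none_of_four = p_none ** 4

print("B B N N:", bbnn, " N N B B:", nnbb)
print("all four:", all_four, " none:", none_of_four)
```

The two orderings give the same product because multiplication does not care about order, and the all-four outcome comes out far smaller than the none-of-four outcome.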
00:10:22.400 --> 00:10:33.600
Before we use the multiplicative rule, it is going to be handy for us to know exactly how many outcomes with k successes we will find.
00:10:33.600 --> 00:10:37.400
We will be looking at relatively small sample spaces.
00:10:37.400 --> 00:10:43.000
Maybe out of three coin tosses how many have exactly 2 heads?
00:10:43.000 --> 00:10:50.900
In those kinds of cases we can actually list out all the possible outcomes and just count how many of these outcomes have exactly two heads.
00:10:50.900 --> 00:11:00.100
As soon as the sample space gets a little bit bigger, like 6 tosses or 7 tosses, that is 2⁶ or 2⁷ outcomes,
00:11:00.100 --> 00:11:07.300
and they do not call it growing exponentially for nothing; those numbers get really big, really fast.
00:11:07.300 --> 00:11:22.000
If you think about 10 coin tosses and how many of those outcomes have exactly 2 heads, that is going to be nearly impossible for us to actually list out manually.
00:11:22.000 --> 00:11:32.200
There is a shortcut, but before I teach you the shortcut I want you to see how it fits together with the manual way of doing it.
00:11:32.200 --> 00:11:39.400
I’m going to use this example, the bachelor's degree and here is what you are going to see.
00:11:39.400 --> 00:12:02.900
Here we have the manual lists of outcomes and so here I have to draw person number 1, 2, 3, 4.
00:12:02.900 --> 00:12:07.500
I will draw another list for 1, 2, 3, 4.
00:12:07.500 --> 00:12:17.000
We know it is going to be 2⁴, and that is 16, so I am going to draw 8 in each column.
00:12:17.000 --> 00:12:29.300
Here we are going to start off with half of these outcomes.
00:12:29.300 --> 00:12:38.600
In half of them the first person has a bachelor's degree; in the other half the first person does not have a bachelor's degree.
00:12:38.600 --> 00:12:56.700
In half of these the second person has a bachelor's degree, and in half of these as well the second person has a bachelor's degree.
00:12:56.700 --> 00:13:12.400
It gets to be quite a bit.
00:13:12.400 --> 00:13:40.100
Here we go, that is our entire sample space of all the different outcomes.
00:13:40.100 --> 00:13:45.400
Now these outcomes are not all equally probable; it is not like heads and tails.
00:13:45.400 --> 00:13:50.400
We know that this outcome is much less probable than this one.
00:13:50.400 --> 00:13:53.200
This one is also much less probable than this one.
00:13:53.200 --> 00:13:58.900
We know that they are not all equally likely, but this is at least the list of all the possible outcomes.
00:13:58.900 --> 00:14:04.000
We want to know what is the probability that exactly 3 will have a bachelor's degree?
00:14:04.000 --> 00:14:11.600
It helps to know how many of these outcomes have exactly three people who have a bachelor's degree.
00:14:11.600 --> 00:14:24.900
This one, this one, this one, this one, and that is it.
00:14:24.900 --> 00:14:37.500
4 out of 16 of the outcomes have exactly three people who have a bachelor's degree.
00:14:37.500 --> 00:14:49.500
Now this does not mean that the probability that exactly 3 have a bachelor's degree is 4 out of 16 because each of these are not equally probable.
00:14:49.500 --> 00:14:53.100
It is good to know how many of them there are.
00:14:53.100 --> 00:15:10.300
There is a shortcut to get this number 4, the number of outcomes with k successes, and here k is 3 and n is 4, as in 3 out of 4 adults.
00:15:10.300 --> 00:15:36.100
Let us put that here: n is the total number of trials, or total number of slots, the number of independent trials, and k is the number of successes generally.
00:15:36.100 --> 00:15:45.500
In this case k is the number of bachelor's degree holders and n is 4 adults.
00:15:45.500 --> 00:15:59.800
We can actually use an insight from permutations and combinations, which was probably a long time ago for most of you, in order to find this number 4.
00:15:59.800 --> 00:16:08.800
In fact we could use n choose k. This is also written with n over k inside big parentheses, and there is also another way you could write it, like nCk,
00:16:08.800 --> 00:16:18.500
where the C is either for choose or for combination; I am not exactly sure which one.
00:16:18.500 --> 00:16:22.700
Here is the actual formula for it.
00:16:22.700 --> 00:16:32.500
If you have n choose k, which is how you say it, then what you really want is how many relevant combinations
00:16:32.500 --> 00:16:39.900
can you have where you have n slots and k successes among those n slots.
00:16:39.900 --> 00:16:49.200
This is going to be n factorial (three factorial, for example, would be 3 × 2 × 1) over, and I always
00:16:49.200 --> 00:16:59.800
remember this part first, (n - k) factorial, and the reason for that is that you will end up having this.
00:16:59.800 --> 00:17:22.200
If you have 4 factorial and you have 4 – 2, let us say, that will be 2 factorial, which means the 2 × 1 cancels and you only list the first k factors, 4 × 3.
00:17:22.200 --> 00:17:37.500
It is like here are our 4 slots, and it could be 4, 3, 2, 1, but you only carry the factorial down for the number of success slots.
00:17:37.500 --> 00:17:45.900
And there is also k factorial down here on the bottom, and here k is 2.
00:17:45.900 --> 00:17:58.500
This is the formula, n! / ((n - k)! × k!), that will give us this nice number of 4 combinations having three successes out of 4.
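Here is a small Python sketch that puts the manual list next to the shortcut. The helper name n_choose_k and the letters B and N (bachelor's, no bachelor's) are chosen here for illustration; math.comb is Python's built-in version of the same count.

```python
from itertools import product
from math import comb, factorial

def n_choose_k(n, k):
    """Number of outcomes with exactly k successes in n slots: n! / ((n - k)! * k!)."""
    return factorial(n) // (factorial(n - k) * factorial(k))

# Manual way: list all 2**4 = 16 outcomes for 4 adults, B = bachelor's, N = none.
outcomes = list(product("BN", repeat=4))
exactly_three = [o for o in outcomes if o.count("B") == 3]

print(len(outcomes), "outcomes,", len(exactly_three), "with exactly three B")
print("formula:", n_choose_k(4, 3), " math.comb:", comb(4, 3))

# The shortcut matters when listing is impractical, e.g. exactly 2 heads in 10 tosses.
print("10 choose 2:", n_choose_k(10, 2))
```

The listing and the formula agree on 4 for the bachelor's degree example, and the formula handles the 10-toss case without writing out all 2¹⁰ outcomes.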
00:17:58.500 --> 00:18:02.100
Let us see if this works, at least for our example.
00:18:02.100 --> 00:18:08.300
In our example n is 4 so that would be 4 factorial.
00:18:08.300 --> 00:18:14.600
Let me just erase this stuff down here; we do not really need it.
00:18:14.600 --> 00:18:30.100
That will be 4 factorial over (n - k) factorial, where n - k is 4 - 3, which ends up being 1, times k factorial, where k is 3.
00:18:30.100 --> 00:18:40.000
Oftentimes I advise people, like on the SAT, that you do not always want to actually calculate out the factorial,
00:18:40.000 --> 00:18:45.500
because sometimes you can just cancel without having to actually calculate it.
00:18:45.500 --> 00:18:50.500
This one is 1 factorial, and 1 factorial is just 1, so we can just ignore that.
00:18:50.500 --> 00:19:01.000
4 factorial over 3 factorial is 4 × 3 × 2 × 1 / (3 × 2 × 1).
00:19:01.000 --> 00:19:05.300
I can just cross out the 3 × 2 × 1, which leaves just 4.
00:19:05.300 --> 00:19:12.400
I do not have to multiply anything and guess what we got, 4.
00:19:12.400 --> 00:19:21.100
The nice thing about this formula is that you can use it when you have an inordinately high number of independent trials.
00:19:21.100 --> 00:19:27.700
You do not have to actually write out 10 slots in all the different combinations,
00:19:27.700 --> 00:19:35.700
and you can actually just put it into the formula and it will tell you how many outcomes with k successes there are.
00:19:35.700 --> 00:19:48.700
Once we know that, we need to put together the multiplicative rule with the number of outcomes that we learned.
00:19:48.700 --> 00:20:00.100
Here is the n choose k stuff that we learned, and the multiplicative rule helps us calculate the probability of one particular outcome.
00:20:00.100 --> 00:20:11.200
You want to put those together, and I will introduce a slightly different notation system.
00:20:11.200 --> 00:20:21.200
Before, we looked at the probability of some event, the probability of A occurring.
00:20:21.200 --> 00:20:40.000
It is the same thing, except here X is our random variable, and in any binomial distribution we actually already know what X is.
00:20:40.000 --> 00:20:55.800
It is not just any random variable; X is actually the number of successes.
00:20:55.800 --> 00:21:06.200
We actually already made up a letter to symbolize the number of successes, and that is called k.
00:21:06.200 --> 00:21:10.300
x = K.
00:21:10.300 --> 00:21:24.700
k can be all sorts of discrete numbers, like 1, 2, 3, 4, 5, 6, up to however many trials you have, plus 0 successes.
00:21:24.700 --> 00:21:35.600
What we are really looking for is all the different probabilities where X = k and K can have a range.
00:21:35.600 --> 00:21:40.300
That is our binomial distribution.
00:21:40.300 --> 00:21:48.800
The set of all the probabilities where x=0, x=1, x=2, x=3.
00:21:48.800 --> 00:21:57.400
All those probabilities altogether, that set makes up the probability distribution that we call the binomial distribution.
00:21:57.400 --> 00:22:00.000
This is what we are looking for.
00:22:00.000 --> 00:22:11.700
Now, in order to find this we have to know the probability of getting that particular outcome, and that is actually quite simple
00:22:11.700 --> 00:22:23.300
because we talked about the example where we have 4 slots and 2 of them have bachelor's degree.
00:22:23.300 --> 00:22:34.800
That would be B B N N, or N N B B, or B N N B.
00:22:34.800 --> 00:22:42.400
In all of them exactly 2 people have bachelor's degrees, and we know how to find the probability.
00:22:42.400 --> 00:22:45.500
They all have the same probability.
00:22:45.500 --> 00:22:53.500
How can we express this in a more abstract form?
00:22:53.500 --> 00:22:58.800
We understand that concretely, but how can we express it in an abstract form?
00:22:58.800 --> 00:23:02.800
There is a straightforward way of doing so.
00:23:02.800 --> 00:23:11.300
Consider that p is the probability of a success happening, whatever the success rate is.
00:23:11.300 --> 00:23:25.000
So the probability of success we are going to call it p for now just to shorten down the notation.
00:23:25.000 --> 00:23:27.300
How many P do we have?
00:23:27.300 --> 00:23:30.900
We have k number of P.
00:23:30.900 --> 00:23:34.000
It is pᵏ.
00:23:34.000 --> 00:23:41.600
In this case the probability of success is .29 and k is 2.
00:23:41.600 --> 00:23:46.100
pᵏ.
00:23:46.100 --> 00:23:49.500
That accounts for this part.
00:23:49.500 --> 00:23:51.600
How do we account for this part?
00:23:51.600 --> 00:24:01.100
Well that is 1 - p because we have to account for the non-successes and how many of those non-successes do we have?
00:24:01.100 --> 00:24:14.600
We have n - k of them. If we had 2 successes out of 10, we would have 8 other slots filled by non-successes.
00:24:14.600 --> 00:24:23.300
In this case we have 4 slots and 2 successes, so how many of the slots are filled by non-successes?
00:24:23.300 --> 00:24:25.700
4-2.
00:24:25.700 --> 00:24:32.500
This will give us the probability of exactly 1 of these combinations.
00:24:32.500 --> 00:24:40.000
Remember there is a whole bunch of different combinations, and we know how to get that number.
00:24:40.000 --> 00:24:52.700
You multiply all of that by the number of different combinations that you can have, and that is n choose k.
00:24:52.700 --> 00:25:02.000
These probabilities are all the same, so it is that probability times however many of those outcomes you have.
00:25:02.000 --> 00:25:13.200
That is the probability where X = k: (n choose k) × pᵏ × (1 - p)ⁿ⁻ᵏ.
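Putting the pieces together, the probability where X = k can be sketched as a short Python function. The name binomial_pmf is just a label chosen here; the inputs are the lecture's 4 slots, 2 successes, and .29 success rate.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k): (n choose k) outcomes, each with probability p**k * (1 - p)**(n - k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# 4 adults, exactly 2 with a bachelor's degree, success rate .29.
prob_two_of_four = binomial_pmf(2, 4, 0.29)
print(round(prob_two_of_four, 4))  # about 0.2544
```

Here comb(4, 2) is 6, so the result is 6 × .29² × .71².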
00:25:13.200 --> 00:25:22.700
You can plug in numbers for k, or you can leave k general, but there you go. Next is expected value and standard deviation in a binomial distribution.
00:25:22.700 --> 00:25:32.900
Once again, with this binomial distribution we will have something that looks like this.
00:25:32.900 --> 00:25:40.500
Like you can put it in a table or in a histogram but for now I will do it like this.
00:25:40.500 --> 00:25:51.500
Out of 4 adults, the number with a bachelor's degree.
00:25:51.500 --> 00:25:57.000
You have 0, 1, 2, 3, up to 4.
00:25:57.000 --> 00:26:02.300
Then you want the probability of this outcome.
00:26:02.300 --> 00:26:13.100
The probability where x=k and these are all our k.
00:26:13.100 --> 00:26:16.100
k = 0, 1, 2, 3, 4.
00:26:16.100 --> 00:26:24.500
We express this as p(x=0), p(x=1), so on.
00:26:24.500 --> 00:26:34.000
You could express this as a table or chart, just like all of our probability distributions.
00:26:34.000 --> 00:26:42.300
How do we find the expected value of this probability distribution?
00:26:42.300 --> 00:26:45.000
We know how to find these.
00:26:45.000 --> 00:26:52.800
We have our formula that we have just learned and we could also reason it out.
00:26:52.800 --> 00:26:57.200
We want to know how many combinations have X = 0.
00:26:57.200 --> 00:26:59.800
0 number of successes.
00:26:59.800 --> 00:27:04.200
Then we want to multiply that by the probability of one such outcome.
00:27:04.200 --> 00:27:17.300
Here we would have all of these probabilities, and we would have all these k values, X = k.
00:27:17.300 --> 00:27:21.100
Then we want to know what is the expected value?
00:27:21.100 --> 00:27:25.700
What is it on average, if we average them all together?
00:27:25.700 --> 00:27:35.200
On average, what would be the number of bachelor's degrees I would expect when I sample 4 adults independently from the population?
00:27:35.200 --> 00:27:50.600
Before, we had the expected value of X, and another way of writing it is μₓ.
00:27:50.600 --> 00:28:03.900
Before we had to do all this crazy multiplying thing but now we could think of it as n × p.
00:28:03.900 --> 00:28:13.200
P being the probability of success of whatever your success is and here this is the bachelor's degree.
00:28:13.200 --> 00:28:19.100
In this case n = 4, so it would be 4 × .29.
00:28:19.100 --> 00:28:31.800
That is the expected value of this distribution.
00:28:31.800 --> 00:28:52.300
If you use a calculator to do this, what is the k value on average?
00:28:52.300 --> 00:29:00.000
4 × .29 and here it says 1.16.
00:29:00.000 --> 00:29:04.600
Let us think about this.
00:29:04.600 --> 00:29:21.600
That says that on average you would expect about 1.16, roughly 1, of the 4 adults to have a bachelor's degree.
00:29:21.600 --> 00:29:26.000
That makes sense because it is not a super likely scenario.
00:29:26.000 --> 00:29:35.800
There is a 29% chance, and if you think about that, it is close to ¼, a 25% chance.
00:29:35.800 --> 00:29:45.600
So out of 4 people, how many are you going to expect to have a bachelor's degree?
00:29:45.600 --> 00:29:53.000
It is going to be 4 × .29; it is going to be roughly ¼ of n.
00:29:53.000 --> 00:29:57.000
More precisely, it is going to be 29% of n.
00:29:57.000 --> 00:30:03.300
In that way this number ends up making sense here.
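To see that n × p really matches the multiply-and-add definition of expected value, here is a minimal Python sketch for n = 4 and p = .29. The variable names are illustrative only.

```python
from math import comb

n, p = 4, 0.29

# Probability of each k from 0 to 4 using (n choose k) * p**k * (1 - p)**(n - k).
pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

# Long way: multiply each k by its probability and add them up.
expected_long_way = sum(k * prob for k, prob in enumerate(pmf))

# Shortcut for a binomial distribution.
expected_shortcut = n * p

print(round(expected_long_way, 4), round(expected_shortcut, 4))
```

Both routes give 1.16, and the five probabilities add up to 1, as they should for a full distribution.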
00:30:03.300 --> 00:30:07.500
What about standard deviation?
00:30:07.500 --> 00:30:17.500
Previously we talked about how to write this when we talked about the expected value.
00:30:17.500 --> 00:30:22.100
Remember expected value means it is not the mean of the population.
00:30:22.100 --> 00:30:23.800
It is not the mean of the sample.
00:30:23.800 --> 00:30:28.700
It is the mean of the probability distribution.
00:30:28.700 --> 00:30:38.500
Here the standard deviation of the probability distribution tells us how it is spread around this 1.16 value.
00:30:38.500 --> 00:30:39.700
How is it spread?
00:30:39.700 --> 00:30:54.500
Here it might help to get the variance first, then we will just take the square root to get the standard deviation.
00:30:54.500 --> 00:31:03.400
To write that, it is σ² with the subscript X down here, to indicate that it belongs to the distribution of X.
00:31:03.400 --> 00:31:11.000
Here we have n × p × (1 – p).
00:31:11.000 --> 00:31:13.100
We have to account for successes.
00:31:13.100 --> 00:31:15.000
We have to account for failures.
00:31:15.000 --> 00:31:17.600
You have to account for how many slots there are.
00:31:17.600 --> 00:31:25.400
Take the square root of that whole thing to get √(n × p × (1 – p)).
00:31:25.400 --> 00:31:34.700
That is the standard deviation of a binomial distribution.
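As a rough check of the variance and standard deviation shortcuts, the sketch below compares n × p × (1 – p) with the general spread-around-the-mean calculation for the same n = 4, p = .29 example. Variable names are assumptions made for this illustration.

```python
from math import comb, sqrt

n, p = 4, 0.29
mean = n * p  # 1.16

pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

# General definition: squared distance from the mean, weighted by probability.
variance_long_way = sum((k - mean) ** 2 * prob for k, prob in enumerate(pmf))

# Binomial shortcuts.
variance_shortcut = n * p * (1 - p)
std_dev = sqrt(variance_shortcut)

print(round(variance_long_way, 4), round(variance_shortcut, 4), round(std_dev, 4))
```

The variance comes out to .8236 either way, so the standard deviation is a little over .9.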
00:31:34.700 --> 00:31:41.300
These are specific forms of the general form.
00:31:41.300 --> 00:31:50.000
You can always use the regular expected value and standard deviation formulas that you would normally use.
00:31:50.000 --> 00:31:54.700
Multiplying across and adding them up.
00:31:54.700 --> 00:32:08.800
These are shortcuts that work because, when we are talking about a binomial distribution, each slot has a fixed probability p for success.
00:32:08.800 --> 00:32:12.600
Because of that it cuts down some of our work.
00:32:12.600 --> 00:32:34.400
One other thing to know about this is that as n increases, as the sample size n increases, the binomial distribution gets more and more normal.
00:32:34.400 --> 00:32:45.400
Think about it: at the very extreme, say we are interested in the population of the US, of size N.
00:32:45.400 --> 00:32:52.600
Say we have N - 1 people in our sample; that is our n.
00:32:52.600 --> 00:32:59.100
If that is how big our sample is, that is almost like having everybody in the US.
00:32:59.100 --> 00:33:16.100
Basically, as n gets bigger the binomial distribution ends up approximating a normal distribution.
00:33:16.100 --> 00:33:23.700
That is helpful. It does not mean that our population is necessarily normal.
00:33:23.700 --> 00:33:29.400
It just means that the probability distribution becomes more normal.
00:33:29.400 --> 00:33:36.800
This principle will become even clearer when we talk about sampling distributions.
00:33:36.800 --> 00:33:41.000
We will get more into that later but I just want to throw that in there.
00:33:41.000 --> 00:33:46.300
Let us move on to examples.
00:33:46.300 --> 00:33:55.300
Example 1: in an all-day tennis tournament, each round of the competition will begin with a coin toss on each of the 4 different courts to determine who will serve.
00:33:55.300 --> 00:34:01.800
In any 1 round what is the probability that exactly 2 people will call their respective coin toss correctly?
00:34:01.800 --> 00:34:18.200
If you think about this, you have 4 courts and 4 coin tosses, and we are looking for the probability that exactly 2 people will call their respective coin tosses correctly.
00:34:18.200 --> 00:34:22.800
2 people will be correct and 2 people will be incorrect.
00:34:22.800 --> 00:34:25.100
This is not the only one.
00:34:25.100 --> 00:34:33.600
There are many different combinations.
00:34:33.600 --> 00:34:36.900
I’m not going to deal with all of those combinations even though I could.
00:34:36.900 --> 00:34:54.000
I will use my n choose k combinations idea in order to figure out how many outcomes look like this.
00:34:54.000 --> 00:35:05.400
In this case n is 4 and k, the number of successes, is 2.
00:35:05.400 --> 00:35:17.400
That is going to be 4! / ((4 – 2)! × 2!).
00:35:17.400 --> 00:35:25.400
I know that 4! over (4 – 2)! will be just 4 × 3, because I can cross out the 2 × 1.
00:35:25.400 --> 00:35:28.800
Then I will put the k factorial on the bottom here, 2 × 1.
00:35:28.800 --> 00:35:30.500
I do not need the 1.
00:35:30.500 --> 00:35:44.000
So 4 × 3 / 2 is 6: 6 of my combinations will have exactly 2 people call their coin tosses correctly.
00:35:44.000 --> 00:35:54.200
One thing that is nice about this is that all of these probabilities are exactly the same, because the probability of being correct is .5
00:35:54.200 --> 00:35:59.800
and the probability of being incorrect is also .5.
00:35:59.800 --> 00:36:07.200
Just to illustrate for you, I am going to put it into the 1 – p form so that you can see.
00:36:07.200 --> 00:36:15.200
Here I want to know the probability that X, my random variable, equals 2 successes.
00:36:15.200 --> 00:36:38.000
In order to find that, I put in my number of outcomes that have exactly 2 successes, times the probability of success, which is .5, raised to the power k.
00:36:38.000 --> 00:36:48.700
Then times (1 - .5), the probability of being incorrect, raised to the power n – k.
00:36:48.700 --> 00:36:50.200
These two are the same.
00:36:50.200 --> 00:36:53.200
Let us simplify this.
00:36:53.200 --> 00:37:06.600
I know that this is 6 × .5² × .5².
00:37:06.600 --> 00:37:21.600
I know I could put this together and just write 6 × .5⁴.
00:37:21.600 --> 00:37:33.100
Let me just get my Excel calculator here.
00:37:33.100 --> 00:37:45.400
6 × .5⁴ and I will get .375.
00:37:45.400 --> 00:38:02.500
My probability of getting exactly 2 people calling their respective coin tosses correctly is 37.5%.
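Because every call here is a 50/50 event, all 16 outcomes are equally likely, so Example 1 can also be checked by brute-force counting. This is only an illustrative sketch; True stands for a correct call and False for an incorrect one.

```python
from itertools import product

# Every way the 4 coin-toss calls can turn out: True = correct, False = incorrect.
calls = list(product([True, False], repeat=4))

# Count the outcomes with exactly 2 correct calls (sum counts the True values).
exactly_two = sum(1 for outcome in calls if sum(outcome) == 2)

# With p = .5 every outcome has the same probability, so counting is enough.
prob_exactly_two = exactly_two / len(calls)

print(exactly_two, "of", len(calls), "outcomes, probability", prob_exactly_two)
```

Counting gives 6 out of 16, which is the same .375 as 6 × .5⁴. This counting shortcut only works because p is .5; with p = .29 the outcomes are not equally likely.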
00:38:02.500 --> 00:38:10.300
Example 2: given that 29% of the population of adults in the US have a bachelor's degree or higher,
00:38:10.300 --> 00:38:18.500
create a probability table for the number of college graduates in any group of 7 randomly selected adults.
00:38:18.500 --> 00:38:23.100
What is the probability that a given sample has 5 or more college graduates?
00:38:23.100 --> 00:38:29.900
What this is asking for is the probability table that looks something like this.
00:38:29.900 --> 00:38:45.300
Here we have k, the number of bachelor's degree holders, and that would be 0, 1, 2, 3, 4, 5, 6, 7.
00:38:45.300 --> 00:38:49.600
We also want to know the probability where X = k.
00:38:49.600 --> 00:39:00.100
We could just find those with the formula.
00:39:00.100 --> 00:39:04.400
Just to give you an idea I will show you the first one.
00:39:04.400 --> 00:39:12.200
We could do probability of x = 0, 0 number of successes.
00:39:12.200 --> 00:39:14.500
That is going to be n choose k.
00:39:14.500 --> 00:39:19.500
N is 7, k is 0.
00:39:19.500 --> 00:39:47.100
Just to remind you, 0 factorial is just 1, not 0. That is times the probability of success, which is .29⁰, times (1 - .29) raised to the power n – 0, which is 7.
00:39:47.100 --> 00:40:07.700
Let us look at CHOOSE in Excel and see what it means.
00:40:07.700 --> 00:40:21.700
Unfortunately, Excel's CHOOSE function literally chooses a value from a list.
00:40:21.700 --> 00:40:25.100
This is not what we want.
00:40:25.100 --> 00:40:35.900
Let us try searching for combinations instead.
00:40:35.900 --> 00:40:48.800
Let us see what this one says: the number of items, like n, and the number of items in each combination, which is k.
00:40:48.800 --> 00:40:51.300
This is exactly what we are looking for.
00:40:51.300 --> 00:40:58.300
n! / (k! × (n – k)!).
00:40:58.300 --> 00:41:06.100
For number we want to put n, and for number chosen we want to put k.
00:41:06.100 --> 00:41:07.600
We can use COMBIN.
00:41:07.600 --> 00:41:09.300
It is great.
00:41:09.300 --> 00:41:20.100
Before I do that I am going to create a little table for myself so that I can see things.
00:41:20.100 --> 00:41:29.000
Here is 0, 1, 2, 3, 4, 5, 6, 7, so that I do not have to put in my formula again and again; I can just copy and paste.
00:41:29.000 --> 00:41:44.200
P where X = k, so for this k it is going to be COMBIN of 7 and 0.
00:41:44.200 --> 00:41:55.400
I am going to make it a formula, so I select the cell with 0, then multiply by .29 raised to that k.
00:41:55.400 --> 00:42:04.300
Excel knows the order of operations, so it is going to do the power before multiplying.
00:42:04.300 --> 00:42:31.600
Then times (1 - .29) raised to the power 7 – k; the 7 will always stay the same, which is why I am just typing it in.
00:42:31.600 --> 00:42:39.400
The probability of having 0 people with a bachelor's degree is about 9%.
00:42:39.400 --> 00:42:45.500
I am just going to copy and paste that all the way down.
00:42:45.500 --> 00:42:55.300
We can see that 9% might look small, but that is larger than the probability of all 7 actually having a bachelor's degree.
00:42:55.300 --> 00:43:04.900
We are looking for the probability that a sample will have 5 or more college graduates.
00:43:04.900 --> 00:43:11.500
Here we can use the addition rule to put these 3 probabilities together.
00:43:11.500 --> 00:43:15.300
Just to show you.
00:43:15.300 --> 00:43:27.100
I will just write the 5 and 6 probabilities.
00:43:27.100 --> 00:43:44.100
This is .0217.
00:43:44.100 --> 00:43:53.600
Here you would write in all of these and the rest. But what is the probability that the sample will have 5 or more college graduates?
00:43:53.600 --> 00:44:09.200
We could put this together as: what is the probability that x is > or = to 5?
00:44:09.200 --> 00:44:26.800
That will be the probability where x is =5 + the probability where x=6 + the probability where x =7.
00:44:26.800 --> 00:44:33.200
I will bring this back if I just add this up.
00:44:33.200 --> 00:44:54.900
The probability where x is > or = to 5 is the sum of the three, and we get .0248.
00:44:54.900 --> 00:45:15.600
That means there is about a 2 ½ % chance of randomly selecting 7 adults and finding that at least 5 of them have a college degree.
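The addition step can be sketched the same way in Python, again assuming n = 7 and p = .29; binomial_pmf and p_at_least_5 are illustrative names, not anything from Excel.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k): (n choose k) * p^k * (1 - p)^(n - k)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 7, 0.29

# Addition rule: the outcomes X = 5, X = 6, X = 7 cannot overlap,
# so P(X >= 5) is just the sum of the three probabilities
p_at_least_5 = sum(binomial_pmf(k, n, p) for k in (5, 6, 7))

print(round(binomial_pmf(5, n, p), 4))  # 0.0217
print(round(p_at_least_5, 4))           # 0.0248
```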
00:45:15.600 --> 00:45:35.400
Although it seems like it will take a long time, and that is why I am just showing it to you, you could write this out for each of these rows.
00:45:35.400 --> 00:45:38.300
Excel comes in handy.
00:45:38.300 --> 00:45:48.800
40% of blood donors have type A blood. The blood bank needs 2 type A donors to walk in,
00:45:48.800 --> 00:45:55.300
and the blood bank will test 10 random blood donors and count the number with type A blood.
00:45:55.300 --> 00:46:00.600
If they say count the number with something, what is the other outcome?
00:46:00.600 --> 00:46:05.800
Then you will know it is a binomial distribution.
00:46:05.800 --> 00:46:11.100
What is the probability that the blood bank has fewer than 2 type A donors?
00:46:11.100 --> 00:46:23.200
If they had asked what is the probability that the blood bank has 2 type A donors versus type B donors, this one will be a binomial distribution.
00:46:23.200 --> 00:46:32.100
It could be type A, B, AB, or O, but this is just the same: what is the probability that they are A or not A?
00:46:32.100 --> 00:46:35.000
That is how you will know if it is a binomial distribution.
00:46:35.000 --> 00:46:39.700
What is the probability that the blood bank has fewer than 2 type A donors?
00:46:39.700 --> 00:46:54.000
It is nice to just start off with this idea that there are going to be 10 donors and 2 of them need to have type A blood.
00:46:54.000 --> 00:47:10.900
The probability of any one of these combinations would be .40², that is p to the k, × (1 – .4),
00:47:10.900 --> 00:47:24.800
the 60% not-A probability, to the power of 10 – 2.
00:47:24.800 --> 00:47:28.200
That is the rest of the other 8 slots.
00:47:28.200 --> 00:47:39.600
We need to know how many of these combinations we have, so that would be n choose k, which will be 10 choose 2.
00:47:39.600 --> 00:47:45.300
This will give us the probability where x=2.
00:47:45.300 --> 00:47:48.300
Is that what this is asking?
00:47:48.300 --> 00:47:50.100
No, that is not.
00:47:50.100 --> 00:47:51.300
This is not good enough.
00:47:51.300 --> 00:48:06.800
What we need to know is the probability where x is fewer than 2 type A donors.
00:48:06.800 --> 00:48:11.800
These are the situations that they do not want.
00:48:11.800 --> 00:48:15.600
How do we get this?
00:48:15.600 --> 00:48:25.300
That is going to be the probability where x = 0 + the probability where x = 1.
00:48:25.300 --> 00:48:27.700
We combine that.
00:48:27.700 --> 00:48:40.100
If this is x = 2 we could obviously do this for x = 0 or x =1.
00:48:40.100 --> 00:48:57.200
This would be 10 choose 0 × .40⁰ × .6, the not-A probability, to the 10th power.
00:48:57.200 --> 00:49:14.800
Then 10 choose 1 × .40¹ × .60⁹, that is the rest of the slots.
00:49:14.800 --> 00:49:53.700
I could just use my handy Excel function now that I know COMBIN: I will put 10 choose 0 × .4⁰, which is 1, × .6 to the 10th.
00:49:53.700 --> 00:50:01.300
That is the probability of getting 0 out of 10.
00:50:01.300 --> 00:50:02.900
It is pretty low.
00:50:02.900 --> 00:50:04.900
It is less than 1% chance.
00:50:04.900 --> 00:50:08.000
About a .6% chance.
00:50:08.000 --> 00:50:12.900
We are not in danger of that happening.
00:50:12.900 --> 00:50:19.600
Let us look at the probability of only 1 person with type A blood walking in.
00:50:19.600 --> 00:50:35.700
That is 10 choose 1 × .4¹ and then × .6⁹, and that is about a 4% chance.
00:50:35.700 --> 00:50:47.300
If we add these up, what we get is still less than 5%.
00:50:47.300 --> 00:50:58.700
This would be = .0464.
00:50:58.700 --> 00:51:10.300
A little more than a 4% chance that the blood bank will get fewer than 2 type A donors walking in.
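As a cross-check, here is a short Python sketch that recomputes P(X = 0) + P(X = 1) directly from the formula, assuming n = 10 donors and p = .4; the names binomial_pmf, p_zero, p_one and p_fewer_than_2 are illustrative only.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k): (n choose k) * p^k * (1 - p)^(n - k)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.4  # 10 donors tested, 40% chance each is type A

p_zero = binomial_pmf(0, n, p)  # nobody is type A: .6 to the 10th
p_one = binomial_pmf(1, n, p)   # exactly one type A donor
p_fewer_than_2 = p_zero + p_one

print(round(p_zero, 4))          # 0.006
print(round(p_one, 4))           # 0.0403
print(round(p_fewer_than_2, 4))  # 0.0464
```

Note that .4⁰ is 1, so the x = 0 row is just .6 to the 10th; multiplying by .4 there by mistake would shrink it to about .0024.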
00:51:10.300 --> 00:51:20.500
Example 4, 2.4% of students in a large state university consider themselves multiracial.
00:51:20.500 --> 00:51:26.800
In a random sample of 100 students, what is the expected number of multiracial students?
00:51:26.800 --> 00:51:28.700
What is the standard deviation?
00:51:28.700 --> 00:51:38.200
This is a good one because this is definitely a case where you cannot imagine listing everything out, even with 10 blood donors.
00:51:38.200 --> 00:51:45.600
With 100, for sure, it would take way too much of your life to write out all the different combinations.
00:51:45.600 --> 00:51:56.800
This is a good example of the situations you run into where you are going to need these binomial distribution ideas.
00:51:56.800 --> 00:52:02.500
What is the expected number of multiracial students?
00:52:02.500 --> 00:52:26.600
We could make this giant probability distribution from 0 to 100 to find the expected value, or we can use the fact that there are some regularities to the expected value in a binomial situation.
00:52:26.600 --> 00:52:45.800
We could also write it as μ sub x and we know that this is n × the probability of success.
00:52:45.800 --> 00:52:49.800
Here success is being multiracial.
00:52:49.800 --> 00:53:01.500
Our n is 100 and what proportion of those students will be multiracial?
00:53:01.500 --> 00:53:16.100
Just 100 × the probability of success, which is .024, and that is 2.4.
00:53:16.100 --> 00:53:25.100
The expected number of multiracial students is around 2.4, and that makes sense.
00:53:25.100 --> 00:53:27.200
What about standard deviation?
00:53:27.200 --> 00:53:36.500
That would look like this: σ sub x.
00:53:36.500 --> 00:53:45.100
I would like to start with the giant square root to remind myself where I am going.
00:53:45.100 --> 00:53:49.800
You could put it at the end but sometimes I forget to put it in.
00:53:49.800 --> 00:53:59.900
n × p of being multiracial, and you also have to account for the probability of not being multiracial.
00:53:59.900 --> 00:54:26.700
That would be 100 × .024 × the other side of that, that is the 97.6% of not being multiracial.
00:54:26.700 --> 00:54:32.200
That is 2.4 × .976.
00:54:32.200 --> 00:54:38.200
I am just going to use my handy calculator.
00:54:38.200 --> 00:54:49.100
2.4 × .976 = 2.34, and the square root of that is about 1.53.
00:54:49.100 --> 00:55:03.700
The nice thing about standard deviation is that it is always in the same unit as μ, so it is in students, just like this 2.4.
00:55:03.700 --> 00:55:09.700
The spread is quite small given that it is 100 students.
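Both shortcuts fit in a few lines of Python. This is a minimal sketch assuming n = 100 and p = .024 from the example; the variable names mu, variance and sigma are chosen here for illustration.

```python
import math

n, p = 100, 0.024  # 100 students, 2.4% chance each is multiracial

mu = n * p                  # expected value: n × p
variance = n * p * (1 - p)  # what sits under the giant square root
sigma = math.sqrt(variance) # standard deviation: do not forget the root

print(round(mu, 2))        # 2.4
print(round(variance, 2))  # 2.34
print(round(sigma, 2))     # 1.53
```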
00:55:09.700 --> 00:55:15.000
That is it for binomial distribution, thank you for using www.educator.com.