WEBVTT mathematics/probability/murray
00:00:00.000 --> 00:00:05.000
Hi and welcome back to the probability lectures here on www.educator.com.
00:00:05.000 --> 00:00:09.100
Today, we are going to talk about the negative binomial distribution.
00:00:09.100 --> 00:00:12.800
My name is Will Murray, let us jump right on in.
00:00:12.800 --> 00:00:20.400
The negative binomial distribution describes a sequence of trials, each of which can have two outcomes, success or failure.
00:00:20.400 --> 00:00:23.600
You want to think about this as like flipping a coin.
00:00:23.600 --> 00:00:30.100
It is very much like the geometric distribution except that we are going to keep flipping a coin,
00:00:30.100 --> 00:00:34.200
we are going to keep running these trials until we get R successes.
00:00:34.200 --> 00:00:39.200
R is some predetermined constant positive number.
00:00:39.200 --> 00:00:47.100
That R is a constant that we have decided in advance.
00:00:47.100 --> 00:00:54.000
For example, we might say I'm going to keep flipping this coin until I have seen heads 5 times.
00:00:54.000 --> 00:01:01.300
I keep flipping a coin and I got tails, heads, tails, tails, heads, tails, tails, heads, heads, heads.
00:01:01.300 --> 00:01:05.000
I stop as soon as I see that 5th head.
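If you would like to see this stopping rule in action, here is a minimal Python sketch. It assumes a fair coin, and the function name `flip_until_heads` and the seed value are just illustrative choices, not something from the lecture.

```python
import random


def flip_until_heads(target_heads, rng):
    """Flip a fair coin until `target_heads` heads have appeared; return the flips."""
    flips = []
    heads = 0
    while heads < target_heads:
        flip = rng.choice(["heads", "tails"])
        flips.append(flip)
        if flip == "heads":
            heads += 1
    return flips


rng = random.Random(0)  # fixed seed so the run is repeatable
flips = flip_until_heads(5, rng)
print(", ".join(flips))
print("total flips:", len(flips))
```

The last flip in the list is always a head, because the loop stops the moment the 5th head shows up.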
00:01:05.000 --> 00:01:07.800
It is different from the binomial distribution.
00:01:07.800 --> 00:01:11.700
You know it sounds like the binomial distribution.
00:01:11.700 --> 00:01:18.100
With the binomial distribution, we decide ahead of time how many times we are going to flip a coin
00:01:18.100 --> 00:01:21.800
and then we keep track of the number of heads after it is all over.
00:01:21.800 --> 00:01:26.700
With the negative binomial distribution, we decide ahead of time
00:01:26.700 --> 00:01:34.200
how many heads we want to see and we flip for as long as it takes to get that number of heads.
00:01:34.200 --> 00:01:39.800
The negative binomial distribution is actually more similar to the geometric distribution.
00:01:39.800 --> 00:01:45.700
If you take R = 1 that means we are going to keep flipping a coin until we see the first head
00:01:45.700 --> 00:01:48.700
and that is exactly the geometric distribution.
00:01:48.700 --> 00:01:56.100
In some sense, the negative binomial distribution is a generalization of the geometric distribution.
00:01:56.100 --> 00:02:02.500
By the way, if you have not watched the lectures on the binomial and geometric distributions, you probably want to watch those first.
00:02:02.500 --> 00:02:06.900
The geometric distribution is the one just before this video.
00:02:06.900 --> 00:02:10.600
Just click back, watch the lecture on the geometric distribution.
00:02:10.600 --> 00:02:18.800
Once you understand that very well, it will be time to come back and look at the negative binomial distribution today.
00:02:18.800 --> 00:02:23.200
I have been talking about this in terms of flipping a coin.
00:02:23.200 --> 00:02:33.500
It does not have to be a coin flip, it can be any kind of situation where you have a sort of binary outcome either you have a success or failure.
00:02:33.500 --> 00:02:39.800
For example, you could be rolling a dice, let us say you are rolling a dice and trying to get a 6.
00:02:39.800 --> 00:02:44.100
Let us say you want to get a certain number of 6’s when you roll this dice.
00:02:44.100 --> 00:02:47.200
You keep rolling and rolling and rolling, each time you roll,
00:02:47.200 --> 00:02:56.300
you either record it as 6 which would be a success, or as not a 6 which would count as a failure.
00:02:56.300 --> 00:02:59.300
It does not have to be flipping a coin, it could be rolling a dice,
00:02:59.300 --> 00:03:07.100
it could be watching your favorite team compete and try to win the World Series every year.
00:03:07.100 --> 00:03:12.700
Maybe you like the New York Yankees and you want to see will the Yankees win the World Series this year,
00:03:12.700 --> 00:03:16.300
will they win the World Series next year, will they win the World Series the year after that?
00:03:16.300 --> 00:03:21.900
Each year, either they win the World Series which you would consider a success, if you are a fan of the Yankees,
00:03:21.900 --> 00:03:24.900
or you consider it a failure if they do not win.
00:03:24.900 --> 00:03:38.200
All of these different situations can essentially be described by the same mathematical process which is the negative binomial distribution.
00:03:38.200 --> 00:03:40.700
Let us go ahead and look at the formulas for that.
00:03:40.700 --> 00:03:44.200
There are several fixed parameters before you start.
00:03:44.200 --> 00:03:47.600
P is the probability of success on each trial.
00:03:47.600 --> 00:03:52.700
If you are flipping a coin and it is a fair coin, not loaded, then P would be ½
00:03:52.700 --> 00:03:56.800
because that is your probability of getting a head each time you flip a coin.
00:03:56.800 --> 00:04:01.800
If you are rolling a dice and you are trying to get a 6 then P would be 1/6.
00:04:01.800 --> 00:04:09.400
If you are watching the New York Yankees try to win the World Series, I do not know what the exact probability is,
00:04:09.400 --> 00:04:15.600
but let us say it is 1/10 because on average, every 10 years they win the World Series.
00:04:15.600 --> 00:04:18.200
Q is the probability of failure.
00:04:18.200 --> 00:04:23.400
You do not really need to know that ahead of time because Q is always equal to 1 - P.
00:04:23.400 --> 00:04:27.100
You can always just fill in 1 - P for Q.
00:04:27.100 --> 00:04:35.200
R is the number of successes that you want to see and that is the number that you have to decide on ahead of time.
00:04:35.200 --> 00:04:42.300
In the probability class, that is a number that you should figure out from the problem that you have been given somehow.
00:04:42.300 --> 00:04:48.700
You decide ahead of time, when I'm flipping a coin, I want to see 3 heads.
00:04:48.700 --> 00:04:51.700
If I'm keeping track of the Yankees winning the World Series,
00:04:51.700 --> 00:04:58.400
I want to know how long will it be until they have 5 more World Series championships.
00:04:58.400 --> 00:05:04.600
The random variable that we are watching for the negative binomial distribution is Y, the number of trials,
00:05:04.600 --> 00:05:07.300
in order to get R successes.
00:05:07.300 --> 00:05:13.100
We are going to run this experiment until we get R successes and when we get that last one,
00:05:13.100 --> 00:05:19.800
we stop and we count how many trials it took to get those successes.
00:05:19.800 --> 00:05:23.200
Finally, we have the formula for the probability distribution.
00:05:23.200 --> 00:05:30.100
It is Y - 1 choose R - 1; that is not a fraction, that is the binomial coefficient.
00:05:30.100 --> 00:05:34.200
That is the formula for combinations, let me write it out here.
00:05:34.200 --> 00:05:48.500
(Y - 1)! divided by (R - 1)! and then ((Y - 1) - (R - 1))!, which would be (Y - R)!.
00:05:48.500 --> 00:05:50.500
That is what that coefficient means.
00:05:50.500 --> 00:06:00.900
P^R and Q^(Y - R). Notice I changed the possible range of values Y can take.
00:06:00.900 --> 00:06:07.200
That is because, if you are looking for R successes, you know it is going to take at least R trials.
00:06:07.200 --> 00:06:12.200
That is why instead of saying Y greater than or equal to 1, we change that to R.
00:06:12.200 --> 00:06:17.500
If you are flipping a coin until you get 5 heads, you know it is going to take at least 5 flips.
00:06:17.500 --> 00:06:24.600
There is no point in even asking, what is the probability of it happening in 4 flips?
00:06:24.600 --> 00:06:31.500
If you want the Yankees to win the World Series 15 times, there is no point in asking
00:06:31.500 --> 00:06:35.200
whether that is going to happen in the next 10 years because it cannot.
00:06:35.200 --> 00:06:40.200
It will take them at least 15 years to win 15 more World Series.
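Written out in symbols, the distribution just described looks like the following. The lowercase y, r, p, q are a notational assumption here, standing for a particular value of Y and for the parameters R, P, Q.

```latex
P(Y = y) = \binom{y-1}{r-1}\, p^{r} q^{\,y-r}, \qquad y = r,\ r+1,\ r+2,\ \dots
\qquad\text{where}\qquad
\binom{y-1}{r-1} = \frac{(y-1)!}{(r-1)!\,(y-r)!}.
```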
00:06:40.200 --> 00:06:48.700
That is the probability distribution and there are some more quantities associated with this that we need to know.
00:06:48.700 --> 00:06:56.900
By the way, I want to continue to highlight the fact that there are two different P's in this formula and
00:06:56.900 --> 00:06:59.400
they are representing different things.
00:06:59.400 --> 00:07:02.400
That is kind of unfortunate.
00:07:02.400 --> 00:07:05.400
There is that P and then there is that P.
00:07:05.400 --> 00:07:21.700
This P right here is the probability of Y trials overall, in order to get a certain number of successes.
00:07:21.700 --> 00:07:28.600
This P right here, represents the probability of success on each trial.
00:07:28.600 --> 00:07:41.400
Make sure you do not mix up those two P’s because that is a good way to fail a probability class.
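One way to keep the two P's apart is to give them different names. The following is a minimal Python sketch under that idea; the names `prob_of_y_trials` and `p_success` are hypothetical, chosen only for this illustration, and exact fractions are used so nothing gets rounded.

```python
from fractions import Fraction
from math import comb


def prob_of_y_trials(y, r, p_success):
    """The 'big P': probability that the r-th success lands on trial y."""
    if y < r:
        return Fraction(0)  # r successes cannot fit in fewer than r trials
    q_failure = 1 - p_success
    return comb(y - 1, r - 1) * p_success**r * q_failure**(y - r)


p_success = Fraction(1, 2)  # the 'little p': chance of heads on one fair flip
print(prob_of_y_trials(4, 5, p_success))  # impossible, so 0
print(prob_of_y_trials(5, 5, p_success))  # five heads in a row: 1/32
# Adding up many values of y should come very close to 1.
print(float(sum(prob_of_y_trials(y, 5, p_success) for y in range(5, 200))))
```

The early return for y smaller than r mirrors the point about the range of Y: asking for 5 heads in 4 flips has probability 0.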
00:07:41.400 --> 00:07:47.600
Let us keep moving, let us learn some key properties of the negative binomial distribution.
00:07:47.600 --> 00:07:51.300
There is the mean which is the expected value.
00:07:51.300 --> 00:07:56.800
Remember, mean and expected value always are synonymous with each other, those are the same thing.
00:07:56.800 --> 00:08:03.300
Expected value is just R/P.
00:08:03.300 --> 00:08:09.900
The variance of the negative binomial distribution is RQ/P².
00:08:09.900 --> 00:08:15.200
The standard deviation is always the square root of the variance.
00:08:15.200 --> 00:08:21.000
Usually, the way you calculate it is by calculating the variance first and then taking the square root.
00:08:21.000 --> 00:08:30.600
Once you have the variance, you just take the square root of that to get the standard deviation, √(RQ)/P.
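For reference, here are the three quantities side by side, typeset with lowercase r, p, q for the parameters (a notational assumption). Note that the square root covers only the numerator once the P² is pulled out.

```latex
E(Y) = \frac{r}{p}, \qquad
V(Y) = \frac{rq}{p^{2}}, \qquad
\sigma = \sqrt{V(Y)} = \sqrt{\frac{rq}{p^{2}}} = \frac{\sqrt{rq}}{p}.
```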
00:08:30.600 --> 00:08:35.000
Let us practice using the negative binomial distribution.
00:08:35.000 --> 00:08:42.800
In our first example, we are going to draw cards from a deck and we want to see 4 aces.
00:08:42.800 --> 00:08:49.700
A question you should always ask with this kind of selection problem is whether there is replacement or no replacement.
00:08:49.700 --> 00:08:54.100
Meaning, after I draw a card out of the deck, do I put it back in the deck or
00:08:54.100 --> 00:08:58.200
do I just hang onto it and then draw a different card the next time.
00:08:58.200 --> 00:09:04.900
In this case, we are being given that we are replacing cards back into the deck.
00:09:04.900 --> 00:09:09.000
We are going to draw a card, put it back, draw a card, put it back.
00:09:09.000 --> 00:09:15.700
The question is how long it will take to get exactly 4 aces?
00:09:15.700 --> 00:09:23.500
In particular, what is the chance that it will take exactly 20 draws, in order to get 4 aces?
00:09:23.500 --> 00:09:29.900
This is a negative binomial distribution problem.
00:09:29.900 --> 00:09:32.900
Let me identify the parameters that we are dealing with here.
00:09:32.900 --> 00:09:42.800
The probability of getting an ace on any given draw, there are 4 aces in there out of 52 possible cards, that is just 1/13.
00:09:42.800 --> 00:09:50.800
Q is always 1- P, that is 1 -1/13 is 12/13.
00:09:50.800 --> 00:09:55.700
The R in this case is the number of aces that we want to get.
00:09:55.700 --> 00:10:03.000
We want to get 4 aces here, we are going to stop the experiment after we find that 4th ace.
00:10:03.000 --> 00:10:10.800
The value of Y that we are interested in, Y is the number of times we are going to have to draw.
00:10:10.800 --> 00:10:15.700
We are interested in the probability that we will draw exactly 20 times.
00:10:15.700 --> 00:10:19.300
Let me remind you of the distribution formula.
00:10:19.300 --> 00:10:29.000
P of Y is equal to Y -1 choose R – 1, this is the negative binomial distribution formula that I gave you a couple slides ago.
00:10:29.000 --> 00:10:31.700
P^R Q^(Y - R).
00:10:31.700 --> 00:10:34.500
I’m going to fill in all those values that I recorded above.
00:10:34.500 --> 00:10:43.000
Y -1 is 19, R -1 is 4 -1 is 3, P is 1/13.
00:10:43.000 --> 00:10:54.800
(1/13)^R, where R is 4, and Q is 12/13, raised to the Y - R; 20 - 4 is 16.
00:10:54.800 --> 00:10:58.500
This simplifies a bit but it is not going to get much nicer.
00:10:58.500 --> 00:11:08.700
19 choose 3, (1/13)^R there, and my R was 4.
00:11:08.700 --> 00:11:13.700
It looks like I’m going to have 13²⁰ in the denominator.
00:11:13.700 --> 00:11:19.000
In the numerator, I’m going to have 12¹⁶.
00:11:19.000 --> 00:11:25.900
I did not bother to find the decimal for that, it would be a very small number because it is not very likely
00:11:25.900 --> 00:11:35.300
that you will draw exactly 20 times, that you will get your 4th ace on exactly the 20th draw.
00:11:35.300 --> 00:11:40.600
I would just leave the answer in that form and present that as my answer.
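If you do want the decimal, here is a minimal Python sketch of the same calculation. It simply plugs P = 1/13, R = 4, Y = 20 into the formula; the variable names are illustrative, and exact fractions are used so the result can be compared with the unsimplified form.

```python
from fractions import Fraction
from math import comb

p = Fraction(1, 13)   # chance of an ace on one draw, with replacement
q = 1 - p             # 12/13
r = 4                 # aces we want
y = 20                # draws

prob = comb(y - 1, r - 1) * p**r * q**(y - r)

# Same value as the unsimplified answer: C(19, 3) * 12^16 / 13^20
assert prob == Fraction(comb(19, 3) * 12**16, 13**20)
print(float(prob))    # a small probability, a bit under 1%
```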
00:11:40.600 --> 00:11:42.900
Let me show you the steps involved there.
00:11:42.900 --> 00:11:48.100
First, you realize this is a negative binomial distribution problem
00:11:48.100 --> 00:11:55.000
because we are running trials over and over again until we get a certain fixed number of successes.
00:11:55.000 --> 00:11:59.800
We do not know how many trials we are going to run ahead of time, that is why it is not a binomial distribution.
00:11:59.800 --> 00:12:04.300
But we do know how many successes we are going to have.
00:12:04.300 --> 00:12:09.800
We want to get 4 aces, that is where my R = 4 comes from, that 4 right there.
00:12:09.800 --> 00:12:13.500
The probability of getting an ace is 4 out of 52.
00:12:13.500 --> 00:12:18.100
Because there are 4 aces in a deck out of 52 cards and that is 1/13.
00:12:18.100 --> 00:12:20.400
Q is always 1 – P.
00:12:20.400 --> 00:12:25.500
Since we are interested in drawing exactly 20 times, I'm going to use Y = 20.
00:12:25.500 --> 00:12:39.700
This is my negative binomial distribution formula and I will just drop all the numbers in there and simplify down to a not very pleasant fraction there.
00:12:39.700 --> 00:12:48.400
In example 2, the Akron Arvarks are going out for the chinchilla grooming championship.
00:12:48.400 --> 00:12:54.000
Apparently, each year they have a 10% chance of winning the championship.
00:12:54.000 --> 00:13:01.900
They are being rather optimistic at home, they built themselves a trophy case with space for 5 championship trophies.
00:13:01.900 --> 00:13:08.800
We want to know how long we will have to wait until that trophy case is completely full.
00:13:08.800 --> 00:13:15.200
In particular, we want to find the mean and the standard deviation of Y there.
00:13:15.200 --> 00:13:22.600
Once again, this is a negative binomial distribution problem because each year they are going to go out
00:13:22.600 --> 00:13:26.700
and they are either going to win a trophy or they will not win a trophy.
00:13:26.700 --> 00:13:32.200
We are interested in how long it will take to get 5 trophies?
00:13:32.200 --> 00:13:39.100
They have to win 5 times to fill their trophy case.
00:13:39.100 --> 00:13:44.000
Let me identify the parameters here for the negative binomial distribution.
00:13:44.000 --> 00:13:49.200
The probability that they will win on any given year is 10%, that is 1/10.
00:13:49.200 --> 00:13:56.600
That means Q, which is always the probability of failure, 1 - P, is 9/10.
00:13:56.600 --> 00:14:01.400
The number of times that they want to win, that R, that is 5.
00:14:01.400 --> 00:14:10.400
We want to find the mean and the standard deviation of the time for them to win.
00:14:10.400 --> 00:14:13.600
Let me remind you of the formula for the mean.
00:14:13.600 --> 00:14:16.600
We had this on one of the earlier slides.
00:14:16.600 --> 00:14:19.200
If you check back a couple slides ago, you will see this.
00:14:19.200 --> 00:14:25.400
It was R/P, I will go ahead and write down the formula for the variance.
00:14:25.400 --> 00:14:31.100
V of Y is RQ/P².
00:14:31.100 --> 00:14:42.000
In this case, our R is 5, our P is 1/10, 5 divided by 1/10, flip the denominator and that is 50.
00:14:42.000 --> 00:14:46.400
That is the expected time to fill up their trophy case.
00:14:46.400 --> 00:14:49.900
That, by the way, is a very intuitive answer.
00:14:49.900 --> 00:14:58.200
It definitely conforms to your intuition: they have a 1/10 chance of winning each year.
00:14:58.200 --> 00:15:02.800
On average, they are going to win about once every 10 years.
00:15:02.800 --> 00:15:09.500
If they want to win 5 times, we expect it to take about 50 years for them to bring home 5 trophies.
00:15:09.500 --> 00:15:12.000
Let us keep going with the variance here.
00:15:12.000 --> 00:15:23.700
RQ is 5 × 9/10, P² is 1/100.
00:15:23.700 --> 00:15:36.100
If I flip, I get 100 × 45/10 which simplifies down to 450, that was the variance, that is not the standard deviation.
00:15:36.100 --> 00:15:42.800
The way you get the standard deviation is you calculate the variance first and then you take its square root.
00:15:42.800 --> 00:15:47.200
Let me go ahead and label what I’m doing in each step here.
00:15:47.200 --> 00:15:59.800
That was the mean, this is the variance, and now I'm about to calculate the standard deviation.
00:15:59.800 --> 00:16:08.500
The standard deviation is always the square root of the variance, √ 450.
00:16:08.500 --> 00:16:12.700
That simplifies a little bit, I can take a 9 out of there right away.
00:16:12.700 --> 00:16:17.500
When it comes outside, it will be a 3, 450 is 9 × 50.
00:16:17.500 --> 00:16:24.400
I can take a factor of 25 out of 50; pulling it out of the square root, I get another 5 outside the square root.
00:16:24.400 --> 00:16:28.800
15 × √ 2.
00:16:28.800 --> 00:16:35.000
That does not simplify anymore, but I did throw that into my calculator,
00:16:35.000 --> 00:16:48.800
and got a decimal approximation, it was 21.21 years.
00:16:48.800 --> 00:16:57.500
What that tells me is if the Akron Arvarks have built this lovely new trophy case with space for 5 trophies
00:16:57.500 --> 00:17:02.800
and we want to know how long you are going to have to wait until their trophy case is full,
00:17:02.800 --> 00:17:05.600
on average they are going to have to wait about 50 years.
00:17:05.600 --> 00:17:13.100
The standard deviation on that estimate is 21.21 years.
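As a check on the arithmetic, here is a minimal Python sketch that plugs P = 1/10 and R = 5 into R/P and RQ/P². The variable names are illustrative only.

```python
from fractions import Fraction
from math import isclose, sqrt

p = Fraction(1, 10)  # chance of winning the championship in a given year
q = 1 - p            # 9/10
r = 5                # trophies needed to fill the case

mean = r / p              # R/P
variance = r * q / p**2   # RQ/P^2
std_dev = sqrt(variance)  # square root of the variance

print(mean, variance, round(std_dev, 2))  # 50 450 21.21
assert isclose(std_dev, 15 * sqrt(2))     # agrees with the simplified 15 × √2
```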
00:17:13.100 --> 00:17:15.400
Let me go back over those steps.
00:17:15.400 --> 00:17:22.100
This is a negative binomial problem, we want to identify the probability of winning on any given year.
00:17:22.100 --> 00:17:27.400
That is 1/10, that comes from the 10% right here.
00:17:27.400 --> 00:17:37.700
That means the probability of losing the Q is 9/10 and we want to win 5 years total.
00:17:37.700 --> 00:17:41.000
That is why we put in our R is equal to 5 there.
00:17:41.000 --> 00:17:44.900
I recalled the formulas for the mean and variance.
00:17:44.900 --> 00:17:49.900
I got these back off on the slides just a little bit earlier in the lecture.
00:17:49.900 --> 00:17:54.800
Just flip back a couple of slides in the lecture and you will see these formulas for the mean and the variance.
00:17:54.800 --> 00:17:58.400
We just drop in the numbers for R, Q, and P.
00:17:58.400 --> 00:18:01.400
We get the mean of 50 years.
00:18:01.400 --> 00:18:08.600
The variance gives me 450 and that is not the answer, that is not the standard deviation.
00:18:08.600 --> 00:18:12.700
To get the standard deviation, you take the square root of the variance.
00:18:12.700 --> 00:18:29.500
√450 comes out to about 21.21 years, that is the standard deviation of waiting time for their trophy case to be full.
00:18:29.500 --> 00:18:33.800
In example 3 here, we are going to roll a dice until we get four 6's.
00:18:33.800 --> 00:18:39.800
I just keep rolling and then I just keep track of the number of times I have seen a 6.
00:18:39.800 --> 00:18:46.000
I want to find the mean and the standard deviation of the number of rolls we will make.
00:18:46.000 --> 00:18:55.900
Let us first recognize this as a negative binomial distribution because every time we roll a dice,
00:18:55.900 --> 00:19:01.400
either we get a 6 and we call that a success or we do not get a 6 and we call that a failure.
00:19:01.400 --> 00:19:06.000
Roll, roll, roll: if we get a 6 that is a success, if we do not get a 6 it is a failure.
00:19:06.000 --> 00:19:11.900
We want to get four 6's total, we want to get 4 successes.
00:19:11.900 --> 00:19:17.800
Let me record the parameters of this distribution.
00:19:17.800 --> 00:19:23.700
Our probability of getting a success, our probability of rolling a 6 when we roll a dice is 1/6.
00:19:23.700 --> 00:19:32.200
Q is always 1- P, that is the probability of failure is 5/6.
00:19:32.200 --> 00:19:39.300
R is the number of successes we want to get here; we want to get four 6's, so that is 4.
00:19:39.300 --> 00:19:48.500
Our expected value of Y is always, this is the mean, that is always R/P.
00:19:48.500 --> 00:19:52.700
I told you that on the third slide of this lecture, the same time
00:19:52.700 --> 00:20:04.700
that I told you the variance of the negative binomial distribution is V of Y is RQ/P².
00:20:04.700 --> 00:20:06.200
Let me go ahead and fill in the numbers here.
00:20:06.200 --> 00:20:18.100
R/P is 4 divided by 1/6, do the flip there that is 4 × 6 is 24, that was our mean.
00:20:18.100 --> 00:20:21.100
That tells us on average, it will take us 24 rolls.
00:20:21.100 --> 00:20:29.100
Not at all surprising there because on average, we are going to roll a 6 once every 6 rolls.
00:20:29.100 --> 00:20:32.400
If I want to get four 6's, it will take 24 rolls.
00:20:32.400 --> 00:20:36.400
The variance is not intuitively obvious.
00:20:36.400 --> 00:20:46.800
R is 4, Q is 5/6, let us multiply it by 5/6, that is not a mixed number 4 and 5/6.
00:20:46.800 --> 00:20:54.900
4 × 5/6 and P² is 1/6², 1/36.
00:20:54.900 --> 00:21:03.900
If I do the flip there, I will get 36 × 4 × 5/6.
00:21:03.900 --> 00:21:09.100
I can cancel the 6 in the denominator into that 36, which leaves a 6.
00:21:09.100 --> 00:21:19.200
6 × 4 × 5 is 6 × 4 is 24 × 5 is 120.
00:21:19.200 --> 00:21:22.000
That is variance, not standard deviation.
00:21:22.000 --> 00:21:28.900
To get standard deviation, you just take the square root of the variance, that is always true.
00:21:28.900 --> 00:21:38.500
You usually compute the variance first and then just take its square root, √ 120.
00:21:38.500 --> 00:21:43.100
120 has factor of 4, I can pull 2 out of there.
00:21:43.100 --> 00:21:48.700
It leaves me with 30 under the square root and that does not really simplify anymore
00:21:48.700 --> 00:21:52.800
but I did put that in my calculator, before I started this.
00:21:52.800 --> 00:21:59.700
What my calculator told me was 2 √30 is approximately 10.95.
00:21:59.700 --> 00:22:11.600
My units there are the number of rolls that I'm going to have to do, in order to see four 6's.
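The variance is harder to picture than the mean, so a simulation can help. The following is a rough Python sketch that repeats the experiment many times and compares the sample mean and standard deviation with 24 and √120. The function name, the seed, and the 20,000 repetitions are arbitrary assumptions; the simulated numbers will only be close to the exact ones, not equal.

```python
import random
import statistics


def rolls_until_four_sixes(rng):
    """Roll a fair die until the fourth 6 appears; return the number of rolls."""
    sixes = 0
    rolls = 0
    while sixes < 4:
        rolls += 1
        if rng.randint(1, 6) == 6:
            sixes += 1
    return rolls


rng = random.Random(2024)  # arbitrary fixed seed for repeatability
samples = [rolls_until_four_sixes(rng) for _ in range(20_000)]

exact_mean = 24          # R/P from the lecture
exact_sd = 120 ** 0.5    # square root of the variance, about 10.95

print("exact:    ", exact_mean, round(exact_sd, 2))
print("simulated:", statistics.mean(samples), statistics.pstdev(samples))
```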
00:22:11.600 --> 00:22:15.100
That wraps up that one, let me recap the steps there.
00:22:15.100 --> 00:22:21.800
We identify that as a negative binomial distribution because it is a process we are repeating over and over,
00:22:21.800 --> 00:22:25.600
until we get success a certain number of times.
00:22:25.600 --> 00:22:35.300
In this case, success is defined as rolling a 6 and we want to get four 6's, 4 successes.
00:22:35.300 --> 00:22:41.700
I have identified my parameters there, P is the probability of rolling a 6 on any given roll, that is 1/6.
00:22:41.700 --> 00:22:51.600
Q is always 1 - P and R is the number of successes that we are looking for, that is 4, we were told.
00:22:51.600 --> 00:22:54.900
That is why I got R = 4.
00:22:54.900 --> 00:23:02.500
The mean of the negative binomial distribution and the variance of the negative binomial distribution,
00:23:02.500 --> 00:23:08.500
those are formulas that I gave you on one of the earlier slides, in terms of R and P, and Q.
00:23:08.500 --> 00:23:12.800
The mean is R/P, variance is RQ/P².
00:23:12.800 --> 00:23:16.800
I just have to drop in the numbers that I had already written down above.
00:23:16.800 --> 00:23:22.100
The P is 1/6, Q is 5/6, R is 4.
00:23:22.100 --> 00:23:24.400
Drop those in and simplify them down.
00:23:24.400 --> 00:23:33.100
I got a mean of 24 rolls which is very intuitive because if you get a 6 every 6 rolls on average,
00:23:33.100 --> 00:23:37.700
and you want to get four 6's, it is going to take 24 rolls on average.
00:23:37.700 --> 00:23:48.700
The variance is less intuitive, we just kind of follow the formula and these numbers simplify down to 120.
00:23:48.700 --> 00:23:52.900
To get the standard deviation, you always take the square root of the variance and
00:23:52.900 --> 00:24:02.300
that simplifies down to an approximation of 10.95 rolls.
00:24:02.300 --> 00:24:10.800
In example 4 here, we have a company which is interviewing applicants for a job.
00:24:10.800 --> 00:24:13.800
Apparently, the company has 3 positions to fill.
00:24:13.800 --> 00:24:20.700
Perhaps, they are looking for 3 programmers, they are all identical positions.
00:24:20.700 --> 00:24:28.500
They start interviewing people and it turns out that exactly 10% of all the possible applicants
00:24:28.500 --> 00:24:31.300
actually have the qualifications for the job.
00:24:31.300 --> 00:24:40.100
They actually know the right programming languages and they have the other skills necessary to do the job.
00:24:40.100 --> 00:24:48.100
Every time we interview somebody, there is a 10% chance that they will be good enough and we will hire them.
00:24:48.100 --> 00:24:53.700
We are going to keep interviewing and interviewing and interviewing one person at a time until we get 3 good people.
00:24:53.700 --> 00:24:59.600
We will hang onto them and at that point, we will close the door, everybody else has to wait.
00:24:59.600 --> 00:25:00.500
There are two questions here.
00:25:00.500 --> 00:25:05.700
What is the probability that they will interview exactly 10 applicants and then the probability
00:25:05.700 --> 00:25:09.800
that they will interview at least 10 applicants?
00:25:09.800 --> 00:25:15.800
Let us identify this as a negative binomial distribution.
00:25:15.800 --> 00:25:24.300
We are doing trials, each one has success or failure, meaning we talk to an applicant.
00:25:24.300 --> 00:25:25.900
If they have the skills, we hire them.
00:25:25.900 --> 00:25:29.600
If not, we show them the door.
00:25:29.600 --> 00:25:36.700
Because we are looking for 3 successes, we are looking for 3 good people right now.
00:25:36.700 --> 00:25:44.100
That kind of plays into the definition of negative binomial distribution, when you are looking for a fixed number of successes
00:25:44.100 --> 00:25:51.700
and you will just keep interviewing and keep looking and looking and looking, keep running trials until you get those 3 successes.
00:25:51.700 --> 00:25:53.700
Let me identify the parameters here.
00:25:53.700 --> 00:26:02.500
First of all, the probability of getting a success on any given trial is 1/10, that is because 10% of the applicants have the skills.
00:26:02.500 --> 00:26:09.800
Q is always 1- P, in this case 1 -1/10 is 9/10.
00:26:09.800 --> 00:26:12.600
R is the number of successes you are looking for.
00:26:12.600 --> 00:26:17.500
In this case, we are looking for 3 people.
00:26:17.500 --> 00:26:28.700
For part A, we want the probability of interviewing exactly 10 applicants.
00:26:28.700 --> 00:26:37.900
We want to use Y is equal to 10, let me remind you of the formula for the negative binomial distribution.
00:26:37.900 --> 00:26:50.300
It is always Y - 1 choose R - 1 × P^R × Q^(Y - R).
00:26:50.300 --> 00:26:58.300
Let me fill in the numbers because I think I have identified all the values of those numbers there.
00:26:58.300 --> 00:27:12.800
Y was 10 in this case, let me go ahead and say P of 10 is Y -1 is 9, R -1 is 3 -1 is 2.
00:27:12.800 --> 00:27:29.100
P is 1/10, so (1/10)^R where R is 3, and Q is 9/10, raised to the Y - R, which is 10 - 3 is 7.
00:27:29.100 --> 00:27:39.900
This simplifies a little bit, it is not great but 9 choose 2, I can simplify that as 9 × 8 divided by 2.
00:27:39.900 --> 00:27:44.500
That is because there is a 7! top and bottom that get canceled there.
00:27:44.500 --> 00:27:57.600
That is, let us see, 9 × 4 is 36, so this is 36.
00:27:57.600 --> 00:28:04.500
It looks like I have a 9⁷ in the numerator and a 10¹⁰ in the denominator.
00:28:04.500 --> 00:28:08.400
And that does not really seem like it is going to get any better.
00:28:08.400 --> 00:28:10.700
I’m just going to leave that the way it is.
00:28:10.700 --> 00:28:15.100
You could find a decimal for that, I did not bother to put that into our calculator
00:28:15.100 --> 00:28:22.400
and convert it to decimal because it was not a very revealing answer that I got.
00:28:22.400 --> 00:28:24.800
It would be a fairly small decimal.
00:28:24.800 --> 00:28:31.900
You do not expect to interview exactly 10 people and get lucky enough to get 3 good people in those first 10.
00:28:31.900 --> 00:28:33.900
That is not very likely.
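For anyone who does want the decimal for part A, here is a minimal Python sketch. It plugs P = 1/10, R = 3, Y = 10 into the formula with exact fractions; the variable names are illustrative only.

```python
from fractions import Fraction
from math import comb

p = Fraction(1, 10)  # chance a given applicant is qualified
q = 1 - p            # 9/10
r = 3                # positions to fill
y = 10               # interviews

prob = comb(y - 1, r - 1) * p**r * q**(y - r)

assert prob == Fraction(36 * 9**7, 10**10)  # matches 36 × 9⁷ / 10¹⁰
print(float(prob))  # about 0.0172
```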
00:28:33.900 --> 00:28:37.200
It looks like I have run out of space to answer part B.
00:28:37.200 --> 00:28:40.800
I’m going to jump over to the next slide to answer part B.
00:28:40.800 --> 00:28:49.400
Let me recap how we answered part A before I move to the next slide.
00:28:49.400 --> 00:28:57.400
It is a negative binomial distribution here, we want to keep interviewing people until we get exactly 3 successes.
00:28:57.400 --> 00:29:05.200
Our probability is 1/10, that is coming from this 10% chance of succeeding on any given applicant.
00:29:05.200 --> 00:29:08.600
Q is 9/10, it is always 1 – P.
00:29:08.600 --> 00:29:15.100
R = 3, that comes from 3 positions to fill.
00:29:15.100 --> 00:29:18.000
That is why we are looking for R is equal to 3 here.
00:29:18.000 --> 00:29:24.700
And then for part A, we want to interview exactly 10 applicants, that is why we are using Y is equal to 10.
00:29:24.700 --> 00:29:34.400
I just dropped the Y, R, P, Q, into my generic formula for the negative binomial distribution.
00:29:34.400 --> 00:29:39.000
I got 9 choose 2 × (1/10)³ × (9/10)⁷.
00:29:39.000 --> 00:29:47.900
I simplified that a little bit but it did not really simplify into anything very nice.
00:29:47.900 --> 00:29:55.000
For part B, let me go ahead and jump over to the next slide and we will do that.
00:29:55.000 --> 00:30:04.600
This is still example 4, we are still trying to interview applicants to get 3 qualified people for these 3 job openings that we have.
00:30:04.600 --> 00:30:09.200
The question is what is the probability of interviewing at least 10 applicants?
00:30:09.200 --> 00:30:15.200
The way you want to think about that, let us think about why you would interview 10 applicants?
00:30:15.200 --> 00:30:24.400
That really means that, since you are looking for 3 people, you failed to get 3 people in the first 9 applicants.
00:30:24.400 --> 00:30:45.700
Let me write the answer to part B as the probability that we do not get 3 qualified applicants in the first 9 people we interview.
00:30:45.700 --> 00:30:49.400
That is the way to think about that.
00:30:49.400 --> 00:30:54.000
If you think about that, that means we are kind of looking at those first 9 people and
00:30:54.000 --> 00:31:04.800
saying what is the chance that among those 9 people, we have fewer than 3 qualified applicants.
00:31:04.800 --> 00:31:22.500
What that really is, is among those first 9 people, in the first 9 applicants, how many winners are we looking at?
00:31:22.500 --> 00:31:35.600
If we are not going to get 3 that means we got either 0 winners, or 1 winner, or 2 winners, in the first 9 applicants.
00:31:35.600 --> 00:31:45.000
This is no longer a negative binomial distribution because now we are looking at a fixed number of people, 9 applicants.
00:31:45.000 --> 00:31:57.900
What we are really doing here is, we are now using not the negative binomial distribution but the binomial distribution.
00:31:57.900 --> 00:32:07.300
Binomial distribution not the negative binomial distribution
00:32:07.300 --> 00:32:14.700
because we are looking at a fixed number of people and asking what the probability is of getting a certain number of successes.
00:32:14.700 --> 00:32:18.800
Let me remind you of the formula for the binomial distribution.
00:32:18.800 --> 00:32:22.600
It is different from the formula for the negative binomial distribution.
00:32:22.600 --> 00:32:35.000
For the binomial distribution, P of Y is equal to N choose Y × P^Y × Q^(N - Y).
00:32:35.000 --> 00:32:39.900
If you do not remember that, just check back two videos here on www.educator.com.
00:32:39.900 --> 00:32:46.800
I think it was two videos ago where I had a lecture on the binomial distribution.
00:32:46.800 --> 00:32:55.300
You can work through the video on the binomial distribution and then you will be ready to tackle the rest of this problem.
00:32:55.300 --> 00:32:57.600
In this case, we are talking about 9 applicants.
00:32:57.600 --> 00:33:02.400
Our N is the number of trials, N is 9.
00:33:02.400 --> 00:33:14.800
The probability of getting a winner on any particular applicant is still 1/10, and Q is still 1 - P, which is 9/10.
00:33:14.800 --> 00:33:19.700
The Y is the number of qualified applicants we get among those 9.
00:33:19.700 --> 00:33:25.200
This is Y = 0, this is Y = 1, and this is Y = 2.
00:33:25.200 --> 00:33:28.400
Let me fill in each one of those according to the formula.
00:33:28.400 --> 00:33:50.900
That is 9 choose 0 × (1/10)⁰ × (9/10)⁹⁻⁰ + 9 choose 1 × (1/10)¹
00:33:50.900 --> 00:34:11.600
× (9/10)⁸ + 9 choose 2 × (1/10)² × (9/10)⁹⁻², and 9 - 2 is 7.
00:34:11.600 --> 00:34:14.800
These numbers actually do combine in a fairly pleasant way.
00:34:14.800 --> 00:34:17.800
I have worked out the fractions ahead of time and they were pretty nice.
00:34:17.800 --> 00:34:20.800
Let me go ahead and then play with this a little bit.
00:34:20.800 --> 00:34:27.700
9 choose 0, you can work that out from the formula but it is also the number of ways of choosing 0 things,
00:34:27.700 --> 00:34:32.000
and there is one way to choose 0 things, which is just to take the empty set.
00:34:32.000 --> 00:34:44.700
That is 1 × 9⁹/9¹⁰ + 9 choose 1, you can use the formula but that is the number of ways
00:34:44.700 --> 00:34:49.500
to choose one thing out of 9, and there are definitely 9 ways to do that.
00:34:49.500 --> 00:34:56.800
9 × 9⁸/10⁹; it now looks like the denominator is going to be 10⁹.
00:34:56.800 --> 00:35:01.200
On the first one, I accidentally wrote 9¹⁰.
00:35:01.200 --> 00:35:06.500
What I meant was 10⁹, be careful about that.
00:35:06.500 --> 00:35:11.400
Let me go back and look at 9 choose 2.
00:35:11.400 --> 00:35:21.500
9 choose 2, we worked this out before, it is (9 × 8)/2, which simplifies down to 36.
00:35:21.500 --> 00:35:29.700
36 × 9⁷/10⁹ in the denominator.
00:35:29.700 --> 00:35:37.100
This simplifies a bit, we can pull out a 10⁹ as a common denominator.
00:35:37.100 --> 00:35:44.700
It looks like all of these will have a factor of 9⁷.
00:35:44.700 --> 00:35:50.200
On the first one, it is 9⁹, so there are two 9s left, which is 81.
00:35:50.200 --> 00:35:57.800
81 +, on the second one there is 9 × 9⁸, which is 9⁹, so we pull out 9⁷ and that will be another 81.
00:35:57.800 --> 00:36:04.200
There is a 36 here and I was trying to be more clever after that and it did not really work out.
00:36:04.200 --> 00:36:06.700
There was nothing good that happens.
00:36:06.700 --> 00:36:11.300
I just threw those numbers into my calculator and I got kind of a huge number here.
00:36:11.300 --> 00:36:24.400
I will just copy it down, 473513931/(5 × 10⁸).
00:36:24.400 --> 00:36:29.500
What happened was there was a factor of 2 that canceled out of the 10⁹ there.
00:36:29.500 --> 00:36:36.900
That is not a very revealing number but I wrote it down as a decimal.
00:36:36.900 --> 00:36:43.300
What I got was 0.947, 94.7%.
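If you would rather check that sum on a computer than on a calculator, here is a minimal Python sketch. The variable names are illustrative and not from the slides; the negative binomial cross-check uses the standard formula P(Y = y) = (y - 1 choose r - 1) × p^r × q^(y - r), which is an addition to what was written on this slide.

```python
from fractions import Fraction
from math import comb

p = Fraction(1, 10)  # chance that a given applicant is qualified (example 4)
q = 1 - p            # chance that a given applicant is not qualified
n = 9                # the first 9 applicants
r = 3                # number of qualified people we need

# Binomial route: 0, 1, or 2 qualified people among the first 9 applicants.
at_least_10 = sum(comb(n, y) * p**y * q**(n - y) for y in range(r))

# Cross-check with the negative binomial: the search ends by interview 9
# exactly when the 3rd qualified person shows up on interview 3, 4, ..., 9.
done_by_9 = sum(comb(y - 1, r - 1) * p**r * q**(y - r) for y in range(r, n + 1))

print(at_least_10)                    # exact fraction
print(round(float(at_least_10), 3))   # decimal approximation
print(at_least_10 == 1 - done_by_9)   # both routes should agree
```

Working in exact fractions avoids rounding, so the two routes can be compared with plain equality.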
00:36:43.300 --> 00:36:50.400
That is the chance that we will interview at least 10 applicants and that is really not very surprising.
00:36:50.400 --> 00:36:53.900
You expect that chance to be pretty high.
00:36:53.900 --> 00:37:03.100
Remember, what is going on here is we are interviewing applicants until we get 3 applicants that have the necessary skills.
00:37:03.100 --> 00:37:07.900
On average, 1 in 10 people have the necessary skills.
00:37:07.900 --> 00:37:19.400
We are interviewing people and the question is, what is the chance that we have to interview at least 10 people to find 3 winners?
00:37:19.400 --> 00:37:25.200
It is pretty highly likely that we will have to interview at least 10 people.
00:37:25.200 --> 00:37:33.900
It is not very likely that you will get 3 winners out of the first 9 applicants, if we are only looking for 1 in 10, in general.
00:37:33.900 --> 00:37:39.900
I guess it is about a 5% chance that you will get all your winners in the first 9 applicants.
00:37:39.900 --> 00:37:47.200
There is almost a 95% chance that you will have to interview at least 10 people.
00:37:47.200 --> 00:37:50.700
Let me recap the steps here.
00:37:50.700 --> 00:38:02.800
The way to think about this is to realize that interviewing at least 10 applicants means that if you have to talk to 10 people,
00:38:02.800 --> 00:38:06.500
that means the first 9 people did not give you enough good ones.
00:38:06.500 --> 00:38:10.900
You do not get 3 good ones out of the first 9.
00:38:10.900 --> 00:38:18.900
If you think about that, that is really asking what if I interview 9 people, what is the chance of not getting 3 winners?
00:38:18.900 --> 00:38:25.100
That is a fixed number of trials because there are 9 trials and we want to get fewer than 3 winners.
00:38:25.100 --> 00:38:28.100
We are interested in getting a certain number of winners.
00:38:28.100 --> 00:38:32.800
That is a binomial distribution, not a negative binomial distribution anymore.
00:38:32.800 --> 00:38:37.200
Negative binomial is open ended where you just keep interviewing and interviewing,
00:38:37.200 --> 00:38:39.700
until you get a certain number of winners.
00:38:39.700 --> 00:38:46.600
Here, we are asking about 9 interviews total, what is the chance of not getting 3 winners?
00:38:46.600 --> 00:38:52.100
If you are not going to get 3 winners that means you got 0, or 1, or 2.
00:38:52.100 --> 00:38:57.900
We just want to add up those 3 probabilities and we are going to use the formula for the binomial distribution,
00:38:57.900 --> 00:39:00.200
not the negative binomial distribution.
00:39:00.200 --> 00:39:11.400
This is the binomial distribution here and I just took this formula, and I plugged in the different values of Y, P, and Q, into this formula.
00:39:11.400 --> 00:39:14.900
Then I simplified down the fractions.
00:39:14.900 --> 00:39:22.500
They started out combining nicely but ended up as this huge number, so I just converted it into a decimal.
00:39:22.500 --> 00:39:28.000
That is the chance that you will end up having to talk to at least 10 people, and it is quite likely.
00:39:28.000 --> 00:39:35.600
If you are interviewing for 3 jobs, you want to fill all 3 jobs and only 1 in 10 people are good enough.
00:39:35.600 --> 00:39:39.300
The chances are you are going to talk to at least 10 people, probably significantly more.
00:39:39.300 --> 00:39:46.900
There is almost a 95% chance that you will talk to at least 10 people.
00:39:46.900 --> 00:39:52.000
This example, we are going to keep the basic premise of this example for the next problem.
00:39:52.000 --> 00:40:00.700
You want to hang on to the scenario here, this business about the company interviewing to fill 3 jobs.
00:40:00.700 --> 00:40:04.100
We are going to hang onto that, we use the same numbers and then
00:40:04.100 --> 00:40:11.900
we are going to introduce the wrinkle of how long each interview takes, in the next problem.
00:40:11.900 --> 00:40:15.800
In example 5 here, we are going to be referring back to example 4.
00:40:15.800 --> 00:40:23.100
If you have not just worked through example 4, what I want you to do is go back and read over the scenario from example 4,
00:40:23.100 --> 00:40:27.800
because we need that to understand example 5.
00:40:27.800 --> 00:40:35.600
What this company is doing is they are interviewing applicants for a job and they have 3 openings.
00:40:35.600 --> 00:40:42.500
They want to keep interviewing until they get 3 people worthy of their openings.
00:40:42.500 --> 00:40:51.600
Now, they are telling us that it takes them 3 hours to interview an unqualified applicant, 5 hours to interview a qualified applicant.
00:40:51.600 --> 00:40:57.800
Remember, we are going to keep interviewing until we get 3 total qualified applicants.
00:40:57.800 --> 00:41:03.600
Let us think about how long that will take and try to set up a formula for that.
00:41:03.600 --> 00:41:17.600
I’m going to set up a variable, T, which is the time to find 3 qualified applicants.
00:41:17.600 --> 00:41:21.600
Let us think about how that would break down.
00:41:21.600 --> 00:41:30.800
If you are going to find 3 qualified applicants, remember Y is the number of applicants overall.
00:41:30.800 --> 00:41:34.700
The number of applicants that we are going to talk to overall.
00:41:34.700 --> 00:41:39.300
Some of them are qualified and some of them are not.
00:41:39.300 --> 00:41:42.500
Y is the total number of applicants.
00:41:42.500 --> 00:41:46.400
We are going to keep interviewing until we find 3 good ones.
00:41:46.400 --> 00:41:55.200
What that means is if we talk to Y people total and there are 3 good ones, then there are Y -3 bad ones.
00:41:55.200 --> 00:42:02.500
Each one of those bad people, each one of those unqualified bombs, is going to take us 3 hours to talk to.
00:42:02.500 --> 00:42:08.700
3 × (Y - 3), that is how much time we spend talking to people who are unqualified.
00:42:08.700 --> 00:42:15.200
There are also 3 good people and each one of those is going to cost us 5 hours to check them out,
00:42:15.200 --> 00:42:21.900
run the background check, and really confirm that those are good people for our job.
00:42:21.900 --> 00:42:24.000
That is 5 × 3 more hours, so T is 3 × (Y - 3) + 5 × 3. Let me simplify this expression.
00:42:24.000 --> 00:42:31.800
I get 3 Y - 9 + 15 which is 3 Y + 6.
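As a quick sanity check on that simplification, here is a small Python sketch. The function name and parameter names are hypothetical, chosen just for this illustration; the 3-hour and 5-hour figures and the 3 openings come from the problem.

```python
def total_hours(y, qualified=3, hours_unqualified=3, hours_qualified=5):
    """Total interview time when y applicants are seen in all.

    Exactly `qualified` of them get the long interview and the
    remaining y - qualified get the short one.
    """
    return hours_unqualified * (y - qualified) + hours_qualified * qualified

# The unsimplified expression should match 3Y + 6 for every Y we try.
checks = [(y, total_hours(y), 3 * y + 6) for y in (3, 10, 30)]
for y, direct, simplified in checks:
    print(y, direct, simplified)
```

Note that Y can never be smaller than 3 here, since we need at least 3 interviews to find 3 qualified people.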
00:42:31.800 --> 00:42:38.500
What I really want to do is calculate the mean and standard deviation of T, of 3 Y + 6.
00:42:38.500 --> 00:42:50.400
Let me calculate first the mean and standard deviation of Y itself.
00:42:50.400 --> 00:42:57.500
To do that, I'm going through the variance because that for me is the easy one to calculate.
00:42:57.500 --> 00:43:01.900
First, I will calculate the mean of Y, the expected value of Y.
00:43:01.900 --> 00:43:06.000
Remember, the mean and expected value are the same thing.
00:43:06.000 --> 00:43:10.400
The formula we have for that is R/P, that is one of our earlier formulas.
00:43:10.400 --> 00:43:13.200
I think it is on the third slide of this lecture.
00:43:13.200 --> 00:43:15.700
That is for the negative binomial distribution.
00:43:15.700 --> 00:43:24.400
The variance V of Y is RQ/P².
00:43:24.400 --> 00:43:27.600
We already have our values for R, P, and Q.
00:43:27.600 --> 00:43:31.600
Let me remind you what they are here.
00:43:31.600 --> 00:43:38.500
The P was the probability that any given applicant has the skills, that is 10%, that is 1/10.
00:43:38.500 --> 00:43:45.000
Q is always 1- P, that is 9/10.
00:43:45.000 --> 00:43:49.600
R is the number of successes that we are looking for.
00:43:49.600 --> 00:43:56.600
In this case, we have 3 job openings, I got that by the way from example 4, the fact that we have 3 job openings.
00:43:56.600 --> 00:44:04.600
R is 3 in this case, let me go ahead and calculate this mean and variance using those numbers.
00:44:04.600 --> 00:44:12.000
3 divided by 1/10 is 3 × 10, that is 30.
00:44:12.000 --> 00:44:26.900
R here is 3, Q is 9/10, P² is (1/10)², which is 1/100, so that is 100 × 27/10.
00:44:26.900 --> 00:44:33.100
I’m going to flip that 1/100, so that is 10 × 27, which is 270.
00:44:33.100 --> 00:44:37.900
270, but those are the mean and the variance of Y, not of T.
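Here is a minimal Python sketch of those two calculations, using the formulas R/P and RQ/P² quoted above. The variable names are illustrative.

```python
from fractions import Fraction

r = 3                # successes we are waiting for (3 job openings)
p = Fraction(1, 10)  # chance that a given applicant is qualified
q = 1 - p            # chance that a given applicant is not qualified

mean_y = r / p       # expected number of applicants interviewed, R/P
var_y = r * q / p**2 # variance of the number of applicants, RQ/P^2

print(mean_y, var_y)
```

Using exact fractions keeps 1/10 from picking up floating-point rounding error.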
00:44:37.900 --> 00:44:46.700
In order to find the mean and variance of T, we have to remember a couple of rules of probability here.
00:44:46.700 --> 00:44:48.300
Let me remind you what those were.
00:44:48.300 --> 00:45:00.100
The expected value of AY + B, expectation is linear, it is A × E of Y + B.
00:45:00.100 --> 00:45:04.300
Variance is not linear, variance of AY + B.
00:45:04.300 --> 00:45:07.300
The interesting thing is that the B does not affect it at all.
00:45:07.300 --> 00:45:11.400
Essentially, variance measures how much a variable wobbles.
00:45:11.400 --> 00:45:18.300
If you take a variable and just move everything over, that does not change how much it wobbles.
00:45:18.300 --> 00:45:24.500
There is no B in the answer, it is A² × V of Y.
00:45:24.500 --> 00:45:32.600
We are going to use those two rules to help us calculate the mean and the variance of T.
00:45:32.600 --> 00:45:39.700
E of T, T was 3 Y + 6, we worked that out up above there.
00:45:39.700 --> 00:45:50.000
E of 3 Y + 6 which using my little formula over there is, the A is 3.
00:45:50.000 --> 00:45:58.800
That is A, that is B, it is 3 E of Y + 6.
00:45:58.800 --> 00:46:06.600
The E of Y was 30, that is 3 × 30 + 6 which simplifies down to 96.
00:46:06.600 --> 00:46:17.500
Our units here are hours, so 96 hours is the expected time that
00:46:17.500 --> 00:46:24.400
it will take this company to interview all these people and get 3 qualified applicants.
00:46:24.400 --> 00:46:34.200
The variance, which is not really what we are looking for, we are looking for the standard deviation but it is very useful to find the variance
00:46:34.200 --> 00:46:38.600
because I can just take the square root of that to find standard deviation.
00:46:38.600 --> 00:46:53.500
For the variance of 3 Y + 6, I’m going to use my formula, A² × V of Y; that is 9 × V of Y, which is 9 × 270.
00:46:53.500 --> 00:46:58.800
I’m going to leave that factored because the next thing I'm going to do is take its square root.
00:46:58.800 --> 00:47:01.600
It would be easier to do if it is factored.
00:47:01.600 --> 00:47:06.200
Let me find the standard deviation now.
00:47:06.200 --> 00:47:16.100
The standard deviation of that is always the square root of the variance, V of T.
00:47:16.100 --> 00:47:23.900
√(9 × 270), and the reason I left it factored is that I can pull out a 3, giving 3√270.
00:47:23.900 --> 00:47:29.700
I know I can pull out another 3 because there is still a factor of 9 inside the 270.
00:47:29.700 --> 00:47:33.100
3 × 3 × √30.
00:47:33.100 --> 00:47:43.700
9 √ 30, that is not going to get any better until I pull out a calculator.
00:47:43.700 --> 00:47:55.200
I did throw that into my calculator and got 49.295 hours.
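The two rules, E(AY + B) = A × E(Y) + B and V(AY + B) = A² × V(Y), can be sketched in Python as follows. The variable names are illustrative, and the 30 and 270 are the mean and variance of Y found earlier.

```python
import math

a, b = 3, 6               # T = 3Y + 6
mean_y, var_y = 30, 270   # mean and variance of Y from the earlier step

mean_t = a * mean_y + b   # expectation is linear
var_t = a**2 * var_y      # the constant b drops out, a comes out squared
sd_t = math.sqrt(var_t)   # standard deviation is the square root of variance

print(mean_t, var_t, round(sd_t, 3))
```

The square root agrees with the factored form 9√30 up to floating-point rounding.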
00:47:55.200 --> 00:48:02.100
That was of course an approximation, that is my standard deviation of
00:48:02.100 --> 00:48:10.100
the time that the company should budget to conduct all these interviews.
00:48:10.100 --> 00:48:14.000
That answers both of the questions that were posed there.
00:48:14.000 --> 00:48:16.800
Let me show you where those all came from.
00:48:16.800 --> 00:48:20.500
First of all, the basic parameters of this problem came from example 4.
00:48:20.500 --> 00:48:26.900
If you are a little mystified as to where these numbers up here came from, just go back and look at example 4.
00:48:26.900 --> 00:48:32.900
The premise of that problem was that 10% of the people we are interviewing are actually qualified.
00:48:32.900 --> 00:48:39.800
We have a 10% chance of success every time we invite someone in to the office.
00:48:39.800 --> 00:48:42.900
That means we have a 9/10 chance of failure.
00:48:42.900 --> 00:48:46.600
9/10 of the people do not have the right skills for this particular job.
00:48:46.600 --> 00:48:52.600
We have 3 job openings, that is where the R = 3 comes from.
00:48:52.600 --> 00:49:02.700
The tricky part here is to set up an expression for the time to find the 3 qualified people that we are looking for.
00:49:02.700 --> 00:49:09.600
If Y is the total number of people we interview then that means 3 of them are qualified.
00:49:09.600 --> 00:49:15.300
3 of them are going to get that full 5 hour interview, that is where that 3 came from.
00:49:15.300 --> 00:49:21.100
All the rest of the people are unqualified, that is Y -3 people are left over and
00:49:21.100 --> 00:49:27.800
they are going to get the shorter interview, the 3 hour interview, before we realize that they are unqualified.
00:49:27.800 --> 00:49:34.200
If you just simplify the arithmetic there, it simplifies down to 3 Y + 6.
00:49:34.200 --> 00:49:39.700
We are going to have to find the expected value and the standard deviation of that.
00:49:39.700 --> 00:49:45.700
I dredged up a couple of old and very useful rules on expectation and variance.
00:49:45.700 --> 00:49:54.000
The expected value of AY + B, since expectation is linear, it is just A × expected value of Y + B.
00:49:54.000 --> 00:49:55.400
With variance, it is not linear.
00:49:55.400 --> 00:50:01.800
What happens is the B disappears and you get A² coming out.
00:50:01.800 --> 00:50:11.200
The mean and the variance of Y by themselves, we will need those as a steppingstone to find the mean and variance of the time.
00:50:11.200 --> 00:50:17.200
These are formulas that I just got off the third slide in this video.
00:50:17.200 --> 00:50:21.400
Just check that, you will see the R/P and the RQ/P².
00:50:21.400 --> 00:50:28.300
And I'm just dropping in the values of P, Q, R to get the 30 and the 270 here.
00:50:28.300 --> 00:50:32.000
I have to find the mean and variance of T.
00:50:32.000 --> 00:50:38.400
Remember, T is 3 Y + 6, that is where I use my old linearity formula.
00:50:38.400 --> 00:50:48.900
I drop in 3 × expected value of Y + 6, that 30 came from here and that is how I get 96 hours.
00:50:48.900 --> 00:50:57.200
This company should plan to spend about 96 hours, if they intend to get 3 good people to fill their 3 jobs.
00:50:57.200 --> 00:51:03.400
The variance there, this 3² is 9, that is using this formula right here.
00:51:03.400 --> 00:51:09.400
The 6 just plays no role at all in there because it is like this B, that just disappears.
00:51:09.400 --> 00:51:12.600
We get 9 × 270.
00:51:12.600 --> 00:51:15.500
The standard deviation is always the square root of the variance.
00:51:15.500 --> 00:51:27.600
I do √(9 × 270) and that reduced down to a decimal approximation of 49.295 hours.
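If you want to see those two numbers come out of an experiment, here is a rough simulation sketch in Python. It is not part of the lecture: the function name, the seed, and the number of runs are all arbitrary choices for illustration, and the output will only be close to 96 and 49.295, not equal to them.

```python
import random
import statistics

def interview_hours(rng, p=0.1, openings=3, hours_bad=3, hours_good=5):
    """Simulate one hiring round and return the total hours spent."""
    hired = 0
    hours = 0
    while hired < openings:
        if rng.random() < p:   # this applicant is qualified
            hired += 1
            hours += hours_good
        else:                  # this applicant is not qualified
            hours += hours_bad
    return hours

rng = random.Random(0)         # fixed seed so that the run is repeatable
times = [interview_hours(rng) for _ in range(50_000)]

sim_mean = statistics.fmean(times)
sim_sd = statistics.pstdev(times)
print(round(sim_mean, 1), round(sim_sd, 1))
```

With 50,000 simulated rounds, the sample mean and standard deviation should land within a fraction of an hour of the exact answers.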
00:51:27.600 --> 00:51:31.500
That wraps up this lecture on negative binomial distribution.
00:51:31.500 --> 00:51:36.400
This is part of the probability series here on www.educator.com.
00:51:36.400 --> 00:51:39.000
My name is Will Murray, thank you very much for joining us, bye.