For more information, please see full course syllabus of Probability

### Negative Binomial Distribution

- Intro 0:00
- Negative Binomial Distribution 0:11
- Negative Binomial Distribution: Definition
- Prototypical Example: Flipping a Coin Until We Get r Successes
- Negative Binomial Distribution vs. Binomial Distribution
- Negative Binomial Distribution vs. Geometric Distribution
- Formula for Negative Binomial Distribution 3:39
- Fixed Parameters
- Random Variable
- Formula for Negative Binomial Distribution
- Key Properties of Negative Binomial 7:44
- Mean
- Variance
- Standard Deviation
- Example I: Drawing Cards from a Deck (With Replacement) Until You Get Four Aces 8:32
- Example I: Question & Solution
- Example II: Chinchilla Grooming 12:37
- Example II: Mean
- Example II: Variance
- Example II: Standard Deviation
- Example II: Summary
- Example III: Rolling a Die Until You Get Four Sixes 18:27
- Example III: Setting Up
- Example III: Mean
- Example III: Variance
- Example III: Standard Deviation
- Example IV: Job Applicants 24:00
- Example IV: Setting Up
- Example IV: Part A
- Example IV: Part B
- Example V: Mean & Standard Deviation of Time to Conduct All the Interviews 40:10
- Example V: Setting Up
- Example V: Mean
- Example V: Variance
- Example V: Standard Deviation
- Example V: Summary

### Introduction to Probability Online Course

### Transcription: Negative Binomial Distribution

*Hi and welcome back to the probability lectures here on www.educator.com.*0000

*Today, we are going to talk about the negative binomial distribution.*0005

*My name is Will Murray, let us jump right on in.*0009

*The negative binomial distribution describes a sequence of trials, each of which can have two outcomes, success or failure.*0013

*You want to think about this as like flipping a coin.*0020

*It is very much like the geometric distribution except that we are going to keep flipping a coin,*0023

*we are going to keep running these trials until we get R successes.*0030

*R is some predetermined constant positive number.*0034

*That R is a constant that we have decided in advance.*0039

*For example, we might say I'm going to keep flipping this coin until I have seen heads 5 times.*0047

*I keep flipping a coin and I got tails, heads, tails, tails, heads, tails, tails, heads, heads, heads.*0054

*I stop as soon as I see that 5th head.*0061

*It is different from the binomial distribution.*0065

*You know it sounds like the binomial distribution.*0068

*With the binomial distribution, we decide ahead of time how many times we are going to flip a coin*0072

*and then we keep track of the number of heads after it is all over.*0078

*With the negative binomial distribution, we decide ahead of time *0082

*how many heads we want to see and we flip for as long as it takes to get that number of heads.*0087

*The negative binomial distribution is actually more similar to the geometric distribution.*0094

*If you take R = 1 that means we are going to keep flipping a coin until we see the first head *0100

*and that is exactly the geometric distribution.*0106

*In some sense, the negative binomial distribution is a generalization of the geometric distribution.*0109

*By the way, if you have not watched the lectures on the binomial and geometric distributions, you probably want to watch those first.*0116

*The geometric distribution is the one just before this video.*0122

*Just click back, watch the lecture on the geometric distribution.*0127

*Once you understand that very well, it will be time to come back and look at the negative binomial distribution today.*0130

*I have been talking about this in terms of flipping a coin.*0139

*It does not have to be a coin flip, it can be any kind of situation where you have a sort of binary outcome either you have a success or failure.*0143

*For example, you could be rolling a die; let us say you are rolling a die and trying to get a 6.*0153

*Let us say you want to get a certain number of 6’s when you roll this die.*0160

*You keep rolling and rolling and rolling, each time you roll, *0164

*you either record it as 6 which would be a success, or as not a 6 which would count as a failure.*0167

*It does not have to be flipping a coin, it could be rolling a die,*0176

*it could be watching your favorite team compete and try to win the World Series every year.*0179

*Maybe you like the New York Yankees and you want to see will the Yankees win the World Series this year,*0187

*will they win the World Series next year, will they win the World Series the year after that?*0193

*Each year, either they win the World Series which you would consider a success, if you are a fan of the Yankees, *0196

*or you consider it a failure if they do not win.*0202

*All of these different situations can essentially be described by the same mathematical process which is the negative binomial distribution.*0205

*Let us go ahead and look at the formulas for that.*0218

*There are several fixed parameters before you start.*0221

*P is the probability of success on each trial.*0224

*If you are flipping a coin and it is a fair coin, not loaded, then P would be ½*0227

*because that is your probability of getting a head each time you flip a coin.*0233

*If you are rolling a die and you are trying to get a 6 then P would be 1/6.*0237

*If you are watching the New York Yankees try to win the World Series, I do not know what the exact probability is,*0242

*but let us say it is 1/10 because on average, every 10 years they win the World Series.*0249

*Q is the probability of failure.*0255

*You do not really need to know that ahead of time because Q is always equal to 1 - P.*0258

*You can always just fill in 1 - P for Q.*0263

*R is the number of successes that you want to see and that is the number that you have to decide on ahead of time.*0267

*In the probability class, that is a number that you should figure out from the problem that you have been given somehow.*0275

*You decide ahead of time, when I'm flipping a coin, I want to see 3 heads.*0282

*If I'm keeping track of the Yankees winning the World Series,*0289

*I want to know how long will it be until they have 5 more World Series championships.*0292

*The random variable that we are watching for the negative binomial distribution is the number of trials*0298

*needed to get R successes.*0304

*We are going to run this experiment until we get R successes and when we get that last one,*0307

*we stop and we count how many trials it took to get those successes.*0313

*Finally, we have the formula for the probability distribution.*0320

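*It is $\binom{Y-1}{R-1}$, read "Y - 1 choose R - 1"; that is not a fraction, that is the binomial coefficient.*0323

*That is the formula for combinations, let me write it out here.*0330

*$\binom{Y-1}{R-1} = \frac{(Y-1)!}{(R-1)!\,(Y-R)!}$, where the last factor comes from $(Y-1) - (R-1)$, which is $Y - R$.*0334

*That is what that coefficient means.*0348

*Then we multiply by $P^R$ and $Q^{Y-R}$, so $P(Y) = \binom{Y-1}{R-1} P^R Q^{Y-R}$ for $Y \ge R$. Notice I changed the possible range of values Y can take.*0350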

*That is because, if you are looking for R successes, you know it is going to take at least R trials.*0361

*That is why instead of saying Y greater than or equal to 1, we change that to R.*0367

*If you are flipping a coin until you get 5 heads, you know it is going to take at least 5 flips.*0372

*There is no point in even asking, what is the probability of it happening in 4 flips?*0377

*If you want the Yankees to win the World Series 15 times, there is no point in asking*0384

*whether that is going to happen in the next 10 years because it cannot.*0391

*It will take them at least 15 years to win 15 more World Series.*0395

*That is the probability distribution and there are some more quantities associated with this that we need to know.*0400

*By the way, I want to continue to highlight the fact that there are two different P's in this formula and*0409

*they are representing different things.*0417

*That is kind of unfortunate.*0419

*There is that P and then there is that P.*0422

*This P right here, the one in P(Y), is the probability that it takes Y trials overall to get R successes.*0425

*This P right here, the one raised to the power R, represents the probability of success on each trial.*0442

*Make sure you do not mix up those two P’s because that is a good way to fail a probability class.*0448
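The formula above can be sketched in a few lines of Python. This is a minimal illustration, not part of the lecture: the function name `negbin_pmf` and its argument names are assumptions chosen here, with `y` the number of trials, `r` the number of successes wanted, and `p` the per-trial success probability.

```python
from math import comb


def negbin_pmf(y: int, r: int, p: float) -> float:
    """Probability that the r-th success arrives on exactly trial y."""
    if y < r:
        # Fewer than r trials cannot contain r successes.
        return 0.0
    q = 1.0 - p  # probability of failure on each trial
    return comb(y - 1, r - 1) * p**r * q ** (y - r)


# Fair coin, waiting for the 5th head.
print(negbin_pmf(4, 5, 0.5))  # 0.0, impossible in only 4 flips
print(negbin_pmf(5, 5, 0.5))  # 0.03125, five heads in a row

# With r = 1 the formula reduces to the geometric distribution q**(y-1) * p.
print(negbin_pmf(3, 1, 0.5))  # 0.125
```

Keeping the two P's apart in code is easier than on the slide: the function's return value is the probability of needing `y` trials, while the argument `p` is the chance of success on a single trial.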

*Let us keep moving, let us learn some key properties of the negative binomial distribution.*0461

*There is the mean which is the expected value.*0467

*Remember, mean and expected value always are synonymous with each other, those are the same thing.*0471

*Expected value is just R/P.*0477

*The variance of the negative binomial distribution is RQ/P².*0483

*The standard deviation is always the square root of the variance.*0490

*Usually, the way you calculate it is by calculating the variance first and then taking the square root.*0495

*Once you have the variance, you just take the square root of that to get the standard deviation, $\sqrt{RQ}/P$.*0501
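As a small sketch of these three properties, the helper below computes them in the same order the lecture recommends: variance first, then its square root. The name `negbin_summary` is an assumption made for this illustration, and the fair-coin numbers are just a convenient check.

```python
from math import sqrt


def negbin_summary(r: int, p: float) -> tuple[float, float, float]:
    """Return (mean, variance, standard deviation) of the number of trials."""
    q = 1 - p
    mean = r / p
    variance = r * q / p**2
    std_dev = sqrt(variance)  # always the square root of the variance
    return mean, variance, std_dev


# Fair coin, waiting for 3 heads: mean 3 / 0.5 = 6, variance 3 * 0.5 / 0.25 = 6.
mean, variance, std_dev = negbin_summary(r=3, p=0.5)
print(mean, variance, std_dev)
```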

*Let us practice using the negative binomial distribution.*0510

*In our first example, we are going to draw cards from a deck and we want to see 4 aces.*0515

*A question you should always ask with this kind of selection problem is whether there is replacement or not replacement.*0523

*Meaning, after I draw a card out of the deck, do I put it back in the deck or *0530

*do I just hang onto it and then draw a different card the next time.*0534

*In this case, we are being given that we are replacing cards back into the deck.*0538

*We are going to draw a card, put it back, draw a card, put it back.*0545

*The question is how long it will take to get exactly 4 aces?*0549

*In particular, what is the chance that it will take exactly 20 draws, in order to get 4 aces?*0556

*This is a negative binomial distribution problem.*0563

*Let me identify the parameters that we are dealing with here.*0570

*The probability of getting an ace on any given draw, there are 4 aces in there out of 52 possible cards, that is just 1/13.*0573

*Q is always 1 - P; that is 1 - 1/13, which is 12/13.*0583

*The R in this case is the number of aces that we want to get.*0591

*We want to get 4 aces here, we are going to stop the experiment after we find that 4th ace.*0596

*The value of Y that we are interested in, Y is the number of times we are going to have to draw.*0603

*We are interested in the probability that we will draw exactly 20 times.*0611

*Let me remind you of the distribution formula.*0616

*P of Y is equal to $\binom{Y-1}{R-1}$, this is the negative binomial distribution formula that I gave you a couple slides ago,*0619

*times $P^R Q^{Y-R}$.*0629

*I’m going to fill in all those values that I recorded above.*0632

*Y - 1 is 19, R - 1 is 4 - 1, which is 3, and P is 1/13.*0634

*So $(1/13)^R$ is $(1/13)^4$, and Q is 12/13, raised to Y - R; 20 - 4 is 16, so $(12/13)^{16}$.*0643

*This simplifies a bit but it is not going to get much nicer.*0655

*That is $\binom{19}{3} (1/13)^4 (12/13)^{16}$, since my R was 4.*0658

*It looks like I’m going to have $13^{20}$ in the denominator.*0669

*In the numerator, I’m going to have $12^{16}$, so the answer is $\binom{19}{3} \cdot 12^{16} / 13^{20}$.*0674

*I did not bother to find the decimal for that, it would be a very small number because it is not very likely*0679

*that you will draw exactly 20 times, that you will get your 4th ace on exactly the 20th draw.*0686

*I would just leave the answer in that form and present that as my answer.*0695
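The lecture leaves the answer as a fraction. For readers who want to check it, here is a short sketch using exact rational arithmetic; the variable names are assumptions for this illustration, and the decimal is computed here rather than taken from the lecture.

```python
from fractions import Fraction
from math import comb

p = Fraction(1, 13)  # chance of an ace on one draw, with replacement
q = 1 - p            # 12/13
r, y = 4, 20         # want the 4th ace on exactly the 20th draw

prob = comb(y - 1, r - 1) * p**r * q ** (y - r)

# Same value as the closed form: (19 choose 3) * 12^16 / 13^20.
print(prob == Fraction(comb(19, 3) * 12**16, 13**20))  # True
print(float(prob))  # a small number, a bit under 1%
```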

*Let me show you the steps involved there.*0700

*First, you realize this is a negative binomial distribution problem *0703

*because we are running trials over and over again until we get a certain fixed number of successes.*0708

*We do not know how many trials we are going to run ahead of time, that is why it is not a binomial distribution.*0715

*But we do know how many successes we are going to have.*0720

*We want to get 4 aces, that is where my R = 4 comes from, that 4 right there.*0724

*The probability of getting an ace is 4 out of 52.*0730

*Because there are 4 aces in a deck out of 52 cards and that is 1/13.*0733

*Q is always 1 – P.*0738

*Since we are interested in drawing exactly 20 times, I'm going to use Y = 20.*0740

*This is my negative binomial distribution formula and I will just drop all the numbers in there and simplify down to a not very pleasant fraction there.*0745

*In example 2, we have the Akron Aardvarks going out for the chinchilla grooming championship.*0760

*Apparently, each year they have a 10% chance of winning the championship.*0768

*They are being rather optimistic at home, they built themselves a trophy case with space for 5 championship trophies.*0774

*We want to know how long we will have to wait until that trophy case is completely full.*0782

*In particular, we want to find the mean and the standard deviation of Y there.*0789

*Once again, this is a negative binomial distribution problem because each year they are going to go out *0795

*and they are either going to win a trophy or they will not win a trophy.*0802

*We are interested in how long it will take to get 5 trophies?*0807

*They have to win 5 times to fill their trophy case.*0812

*Let me identify the parameters here for the negative binomial distribution.*0819

*The probability that they will win on any given year is 10%, that is 1/10.*0824

*That means Q, which is always the probability of failure, 1 - P, is 9/10.*0829

*The number of times that they want to win, that R, that is 5.*0836

*We want to find the mean and the standard deviation of the time for them to win.*0841

*Let me remind you of the formula for the mean.*0850

*We had this on one of the earlier slides.*0853

*If you check back a couple slides ago, you will see this.*0856

*The mean was $E(Y) = R/P$; I will go ahead and write down the formula for the variance.*0859

*$V(Y) = RQ/P^2$.*0865

*In this case, our R is 5, our P is 1/10, and 5 divided by 1/10, flipping the denominator, is 50.*0871

*That is the expected time to fill up their trophy case.*0882

*That, by the way, is a very intuitive answer.*0886

*It definitely conforms to your intuition, which is that they have a 1/10 chance of winning each year.*0890

*On average, they are going to win about once every 10 years.*0898

*If they want to win 5 times, we expect it to take about 50 years for them to bring home 5 trophies.*0903

*Let us keep going with the variance here.*0909

*RQ is 5 × 9/10, P² is 1/100.*0912

*If I flip, I get 100 × 45/10 which simplifies down to 450, that was the variance, that is not the standard deviation.*0924

*The way you get the standard deviation is you calculate the variance first and then you take its square root.*0936

*Let me go ahead and label what I’m doing in each step here.*0943

*That was the mean, this is the variance, and now I'm about to calculate the standard deviation.*0947

*The standard deviation is always the square root of the variance, √ 450.*0960

*That simplifies a little bit, I can take a 9 out of there right away.*0968

*When it comes outside, it will be a 3, 450 is 9 × 50.*0973

*I can take a factor of 25 out of 50 and pull out its square root, so I get another 5 outside the square root.*0977

*That gives $15\sqrt{2}$.*0984

*That does not simplify anymore, but I did throw that into my calculator,*0989

*and got a decimal approximation, it was 21.21 years.*0995

*What that tells me is if the Akron Aardvarks have built this lovely new trophy case with space for 5 trophies*1009

*and we want to know how long they are going to have to wait until their trophy case is full,*1017

*on average they are going to have to wait about 50 years.*1023

*The standard deviation on that estimate is 21.21 years.*1025
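The trophy-case arithmetic can be checked with a short sketch. Exact fractions keep the mean and variance free of rounding; the variable names are assumptions made for this illustration.

```python
from fractions import Fraction
from math import isclose, sqrt

p = Fraction(1, 10)  # 10% chance of winning in any given year
q = 1 - p            # 9/10
r = 5                # trophies needed to fill the case

mean = r / p              # expected number of years
variance = r * q / p**2   # variance of the number of years
std_dev = sqrt(variance)  # square root of the variance, as a float

print(mean)      # 50
print(variance)  # 450
print(isclose(std_dev, 15 * sqrt(2)))  # True
print(round(std_dev, 2))  # 21.21
```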

*Let me go back over those steps.*1033

*This is a negative binomial problem, we want to identify the probability of winning on any given year.*1035

*That is 1/10, that comes from the 10% right here.*1042

*That means the probability of losing, the Q, is 9/10 and we want to win 5 times total.*1047

*That is why we put in our R is equal to 5 there.*1058

*I recalled the formulas for the mean and variance.*1061

*I got these from the slides just a little bit earlier in the lecture.*1065

*Just flip back a couple of slides in the lecture and you will see these formulas for the mean and the variance.*1070

*We just drop in the numbers for R, Q, and P.*1075

*We get the mean of 50 years.*1078

*The variance gives me 450 and that is not the answer, that is not the standard deviation.*1081

*To get the standard deviation, you take the square root of the variance.*1088

*$\sqrt{450}$ is approximately 21.21 years, that is the standard deviation of waiting time for their trophy case to be full.*1093

*In example 3 here, we are going to roll a die until we get four 6's.*1109

*I just keep rolling and then I just keep track of the number of times I have seen a 6.*1114

*I want to find the mean and the standard deviation of the number of rolls we will make.*1120

*Let us first recognize this as a negative binomial distribution because every time we roll a die,*1126

*either we get a 6 and we call that a success or we do not get a 6 and we call that a failure.*1136

*We roll and roll: if we get a 6, that is a success; if we do not get a 6, it is a failure.*1141

*We want to get four 6's total, we want to get 4 successes.*1146

*Let me record the parameters of this distribution.*1152

*Our probability of getting a success, our probability of rolling a 6 when we roll a die, is 1/6.*1158

*Q is always 1 - P, that is the probability of failure, which is 5/6.*1164

*R is the number of successes we want to get here; we want to get four 6's, so that is 4.*1172

*Our expected value of Y is always, this is the mean, that is always R/P.*1179

*I told you that on the third slide of this lecture, the same time *1188

*that I told you the variance of the negative binomial distribution is V of Y is RQ/P².*1193

*Let me go ahead and fill in the numbers here.*1205

*R/P is 4 divided by 1/6; do the flip there and that is 4 × 6, which is 24. That was our mean.*1206

*That tells us on average, it will take us 24 rolls.*1218

*Not at all surprising there because on average, we are going to roll a 6 once every 6 rolls.*1221

*If I want to get four 6's, it will take 24 rolls.*1229

*The variance is not intuitively obvious. *1232

*R is 4, Q is 5/6, let us multiply it by 5/6, that is not a mixed number 4 and 5/6.*1236

*4 × 5/6 and P² is 1/6², 1/36.*1247

*If I do the flip there, I will get 36 × 4 × 5/6.*1255

*I can cancel the 6 in the denominator against the 36, which leaves a 6.*1264

*6 × 4 × 5: 6 × 4 is 24, and 24 × 5 is 120.*1269

*That is variance, not standard deviation.*1279

*To get standard deviation, you just take the square root of the variance, that is always true.*1282

*You usually compute the variance first and then just take its square root, √ 120.*1289

*120 has a factor of 4, so I can pull a 2 out of there.*1298

*It leaves me with 30 under the square root and that does not really simplify anymore *1303

*but I did put that in my calculator, before I started this.*1309

*What my calculator told me was 2 √30 is approximately 10.95.*1313

*My units there are the number of rolls that I'm going to have to do, in order to see four 6's.*1320
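One way to sanity-check the mean of 24 and the standard deviation of about 10.95 is to simulate the experiment. The sketch below is an illustration only: the function name, the seed, and the sample size of 20,000 are arbitrary choices made here, and a simulation gives estimates near the formulas rather than the exact values.

```python
import random
from statistics import mean, pstdev


def rolls_until_sixes(target: int, rng: random.Random) -> int:
    """Roll a fair die until `target` sixes have appeared; return the roll count."""
    rolls = 0
    sixes = 0
    while sixes < target:
        rolls += 1
        if rng.randint(1, 6) == 6:  # success: we rolled a 6
            sixes += 1
    return rolls


rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [rolls_until_sixes(4, rng) for _ in range(20_000)]

sim_mean = mean(samples)
sim_sd = pstdev(samples)
print(sim_mean, sim_sd)  # should land close to 24 and 10.95
```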

*That wraps up that one, let me recap the steps there.*1331

*We identify that as a negative binomial distribution because it is a process we are repeating over and over, *1335

*until we get success a certain number of times.*1342

*In this case, success is defined as rolling a 6 and we want to get four 6's, 4 successes.*1345

*I have identified my parameters there, P is the probability of rolling a 6 on any given roll, that is 1/6.*1355

*Q is always 1 - P and R is the number of successes that we are looking for, that is 4, we were told.*1362

*That is why I got R = 4.*1371

*The mean of the negative binomial distribution and the variance of the negative binomial distribution, *1375

*those are formulas that I gave you on one of the earlier slides, in terms of R and P, and Q.*1382

*The mean is R/P, variance is RQ/P².*1388

*I just have to drop in the numbers that I had already written down above.*1393

*The P is 1/6, Q is 5/6, R is 4.*1397

*Drop those in and simplify them down.*1402

*I got a mean of 24 rolls which is very intuitive because if you get a 6 every 6 rolls on average,*1404

*and you want to get four 6's, it is going to take 24 rolls on average.*1413

*The variance is less intuitive, we just kind of follow the formula and these numbers simplify down to 120.*1418

*To get the standard deviation, you always take the square root of the variance and *1429

*that simplifies down to an approximation of 10.95 rolls.*1433

*In example 4 here, we have a company which is interviewing applicants for a job.*1442

*Apparently, the company has 3 positions to fill.*1451

*Perhaps, they are looking for 3 programmers, they are all identical positions.*1454

*They start interviewing people and it turns out that exactly 10% of all the possible applicants *1461

*actually have the qualifications for the job.*1468

*They actually know the right programming languages and they have the other skills necessary to do the job.*1471

*Every time we interview somebody, there is a 10% chance that they will be good enough and we will hire them.*1480

*We are going to keep interviewing and interviewing and interviewing one person at a time until we get 3 good people.*1488

*We will hang onto them and at that point, we will close the door, everybody else has to wait.*1494

*There are two questions here.*1499

*What is the probability that they will interview exactly 10 applicants and then the probability *1500

*that they will interview at least 10 applicants?*1506

*Let us identify this as a negative binomial distribution.*1510

*We are doing trials, each one has success or failure, meaning we talk to an applicant.*1516

*If they have the skills, we hire them.*1524

*If not, we show them the door.*1526

*Because we are looking for 3 successes, we are looking for 3 good people right now.*1529

*That kind of plays into the definition of negative binomial distribution, when you are looking for a fixed number of successes *1537

*and you will just keep interviewing and keep looking and looking and looking, keep running trials until you get those 3 successes.*1544

*Let me identify the parameters here.*1552

*First of all, the probability of getting a success on any given trial is 1/10, that is because 10% of the applicants have the skills.*1554

*Q is always 1- P, in this case 1 -1/10 is 9/10.*1562

*R is the number of successes you are looking for.*1570

*In this case, we are looking for 3 people.*1572

*For part A, we are trying to interview exactly 10 applicants.*1577

*We want to use Y is equal to 10, let me remind you of the formula for the negative binomial distribution.*1589

*It is always $\binom{Y-1}{R-1} \times P^R \times Q^{Y-R}$.*1598

*Let me fill in the numbers because I think I have identified all the values of those numbers there.*1610

*Y was 10 in this case, so let me go ahead and say P of 10: Y - 1 is 9, R - 1 is 3 - 1, which is 2.*1618

*P is 1/10, so $(1/10)^R$ is $(1/10)^3$, and Q is 9/10, and Y - R is 10 - 3, which is 7, so $(9/10)^7$.*1633

*This simplifies a little bit, it is not great but 9 choose 2, I can simplify that as 9 × 8 divided by 2.*1649

*That is because there is a 7! top and bottom that get canceled there.*1660

*That is, let us see, 9 × 4, which is 36.*1664

*It looks like I have a $9^7$ in the numerator and a $10^{10}$ in the denominator, so this is $36 \cdot 9^7 / 10^{10}$.*1677

*And that does not really seem like it is going to get any better.*1684

*I’m just going to leave that the way it is.*1688

*You could find a decimal for that; I did not bother to put that into my calculator*1691

*and convert it to decimal because it was not a very revealing answer that I got.*1695

*It would be a fairly small decimal.*1702

*You do not expect to interview exactly 10 people and get lucky enough to get 3 good people in those first 10.*1705

*That is not very likely.*1712
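For part A, the fraction and its decimal can be checked with a few lines of exact arithmetic. This is a sketch with variable names chosen here; the decimal value is computed by the code and is not quoted in the lecture.

```python
from fractions import Fraction
from math import comb

p = Fraction(1, 10)  # 10% of applicants are qualified
q = 1 - p            # 9/10
r, y = 3, 10         # third hire happens on exactly the 10th interview

prob = comb(y - 1, r - 1) * p**r * q ** (y - r)

# Matches the form on the slide: 36 * 9^7 / 10^10.
print(prob == Fraction(36 * 9**7, 10**10))  # True
print(float(prob))  # about 0.0172, a bit under 2%
```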

*It looks like I have run out of space to answer part B.*1714

*I’m going to jump over to the next slide to answer part B.*1717

*Let me recap how we answered part A before I switch to the next slide.*1721

*It is a negative binomial distribution here, we want to keep interviewing people until we get exactly 3 successes.*1729

*Our probability is 1/10, that is coming from this 10% chance of succeeding on any given applicant.*1737

*Q is 9/10, it is always 1 – P.*1745

*R = 3, that comes from 3 positions to fill.*1748

*That is why we are looking for R is equal to 3 here.*1755

*And then for part A, we want to interview exactly 10 applicants, that is why we are using Y is equal to 10.*1758

*I just dropped the Y, R, P, Q, into my generic formula for the negative binomial distribution.*1765

*I got $\binom{9}{2} \times (1/10)^3 \times (9/10)^7$.*1774

*I simplified that a little bit but it did not really simplify into anything very nice.*1779

*For part B, let me go ahead and jump over to the next slide and we will do that.*1788

*This is still example 4, we are still trying to interview applicants to get 3 qualified people for these 3 job openings that we have.*1795

*The question is what is the probability of interviewing at least 10 applicants?*1804

*The way you want to think about that, let us think about why you would interview 10 applicants?*1809

*That really means that, since you are looking for 3 people, you failed to get 3 people in the first 9 applicants.*1815

*Let me write the answer to part B as the probability that we do not get 3 qualified applicants in the first 9 people we interview.*1824

*That is the way to think about that.*1846

*If you think about that, that means we are kind of looking at those first 9 people and*1849

*saying what is the chance that among those 9 people, we have fewer than 3 qualified applicants.*1854

*What that really is, is among those first 9 people, in the first 9 applicants, how many winners are we looking at?*1865

*If we are not going to get 3 that means we got either 0 winners, or 1 winner, or 2 winners, in the first 9 applicants.*1882

*This is no longer a negative binomial distribution because now we are looking at a fixed number of people, 9 applicants.*1895

*What we are really doing here is, we are now using not the negative binomial distribution but the binomial distribution.*1905

*Binomial distribution not the negative binomial distribution *1918

*because we are looking at a fixed number of people and asking what the probability is of getting a certain number of successes.*1927

*Let me remind you of the formula for the binomial distribution.*1935

*It is different from the formula for the negative binomial distribution.*1939

*For the binomial distribution, $P(Y) = \binom{N}{Y} P^Y Q^{N-Y}$.*1942

*If you do not remember that, just check back two videos here on www.educator.com.*1955

*I think it was two videos ago where I had a lecture on the binomial distribution.*1960

*You can work through the video on the binomial distribution and then you will be ready to tackle the rest of this problem.*1967

*In this case, we are talking about 9 applicants.*1975

*Our N is the number of trials, N is 9.*1977

*The probability of getting a winner on any particular applicant is still 1/10, Q is still 1- P is 9/10.*1982

*The Y is the number of qualified applicants we get among those 9.*1995

*This is Y equal to 0, this is Y = 1, Y =2.*2000

*Let me fill in each one of those according to the formula.*2005

*That is $\binom{9}{0} (1/10)^0 (9/10)^{9-0} + \binom{9}{1} (1/10)^1$*2008

*$\times (9/10)^8 + \binom{9}{2} (1/10)^2 (9/10)^{9-2}$, where 9 - 2 is 7.*2031

*These numbers actually do combine in a fairly pleasant way.*2051

*I have worked out the fractions ahead of time and they were pretty nice.*2055

*Let me go ahead and then play with this a little bit.*2058

*9 choose 0, you can work that out from the formula but it is also the number of ways of choosing 0 things,*2061

*and there is one way to choose 0 things, which is just to take the empty set.*2068

*That is 1 × $9^9/10^9$ + 9 choose 1; you can use the formula, but that is the number of ways*2072

*to choose one thing out of 9 and there are definitely 9 ways to do that.*2085

*9 × $9^8$ over, it now looks like there is going to be a $10^9$.*2089

*On the first one, I accidentally wrote $9^{10}$.*2097

*What I meant was 10⁹, be careful about that.*2101

*Let me go back and look at 9 choose 2.*2106

*9 choose 2, we worked this out before, it is 9 × 8/2 which simplifies down to 36.*2111

*36 × 9⁷/10⁹ in the denominator.*2121

*This simplifies a bit, we can pull out a 10⁹ as a common denominator.*2130

*It looks like all of these will have a factor of 9⁷.*2137

*On the first one, it is 9⁹ so there are two 9's left, which is 81.*2145

*Plus, on the second one there is 9 × 9⁸; we pull out 9⁷ and that will be another 81.*2150

*Then there is a 36 here, so we have $9^7 (81 + 81 + 36)/10^9$; I was trying to be more clever after that and it did not really work out.*2158

*There was nothing good that happens.*2164

*I just threw those numbers into my calculator and I got kind of a huge number here.*2167

*I will just copy it down: 473513931/(5 × 10⁸).*2171

*What happened was there was a factor of 2 that canceled out of the 10⁹ there.*2184

*That is not a very revealing number but I wrote it down as a decimal.*2189

*What I got was 0.947, 94.7%.*2197

*That is the chance that we will interview at least 10 applicants and that is really not very surprising.*2203

*You expect that chance to be pretty high.*2210

*Remember, what is going on here is we are interviewing applicants until we get 3 applicants that have the necessary skills.*2214

*On average, 1 in 10 people have the necessary skills.*2223

*We are interviewing people and the question is, what is the chance that we have to interview at least 10 people to find 3 winners?*2228

*It is highly likely that we will have to interview at least 10 people.*2239

*It is not very likely that you will get 3 winners out of the first 9 applicants, if only 1 in 10 is qualified, in general.*2245

*I guess it is about a 5% chance that you will get all your winners in the first 9 applicants.*2254

*There is almost a 95% chance that you will have to interview at least 10 people.*2260
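Part B can be checked two ways, and the sketch below does both: the binomial sum used in the lecture (0, 1, or 2 qualified people among the first 9), and the complement of the negative binomial probabilities that the third hire arrives on interview 3 through 9. The variable names are assumptions made for this illustration.

```python
from fractions import Fraction
from math import comb

p = Fraction(1, 10)  # chance that an applicant is qualified
q = 1 - p            # 9/10

# Binomial route: fewer than 3 qualified applicants among the first 9.
binomial_route = sum(comb(9, k) * p**k * q ** (9 - k) for k in range(3))

# Negative binomial route: 1 minus P(third hire on interview 3, 4, ..., 9).
negbin_route = 1 - sum(comb(y - 1, 2) * p**3 * q ** (y - 3) for y in range(3, 10))

print(binomial_route == negbin_route)  # True
print(binomial_route)         # 473513931/500000000
print(float(binomial_route))  # 0.947027862
```

Both routes agree because "at least 10 interviews" and "fewer than 3 successes in the first 9 trials" describe the same event.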

*Let me recap the steps here.*2267

*The way to think about this is to realize that interviewing at least 10 applicants means that if you have to talk to 10 people,*2271

*that means the first 9 people did not give you enough good ones.*2283

*You do not get 3 good ones out of the first 9.*2286

*If you think about that, that is really asking what if I interview 9 people, what is the chance of not getting 3 winners?*2291

*That is a fixed number of trials because there is 9 trials and we want to get fewer than 3 winners.*2299

*We are interested in getting a certain number of winners.*2305

*That is binomial distribution not a negative binomial distribution anymore.*2308

*Negative binomial is open ended where you just keep interviewing and interviewing,*2313

*until you get a certain number of winners.*2317

*Here, we are asking about 9 interviews total, what is the chance of not getting 3 winners?*2320

*If you are not going to get 3 winners that means you got 0, or 1, or 2.*2326

*We just want to add up those 3 probabilities and we are going to use the formula for the binomial distribution,*2332

*not the negative binomial distribution.*2338

*This is the binomial distribution here and I just took this formula, and I plugged in the different values of Y, P, and Q, into this formula.*2340

*Then I simplified down the fractions.*2351

*They turned out to combine nicely and gave this lovely number, so I just converted it into a decimal.*2355

*That is the chance that you will end up having to talk to at least 10 people, and it is quite likely.*2362

*If you are interviewing for 3 jobs, you want to fill all 3 jobs and only 1 in 10 people are good enough.*2368

*The chances are you are going to talk to at least 10 people, probably significantly more.*2375

*There is almost a 95% chance that you will talk to at least 10 people.*2379
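The recap above can be checked numerically. What follows is a minimal Python sketch, not part of the lecture: the helper names `binomial_pmf` and `negative_binomial_pmf` are illustrative, and the cross-check assumes the standard negative binomial formula P(Y = y) = C(y - 1, r - 1) p^r q^(y - r).

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def negative_binomial_pmf(y, r, p):
    """P(the r-th success arrives on trial y), standard formula (assumed)."""
    return comb(y - 1, r - 1) * p**r * (1 - p)**(y - r)

p = 0.1  # chance that any one applicant is qualified
r = 3    # job openings to fill

# At least 10 interviews <=> fewer than 3 winners among the first 9.
prob_at_least_10 = sum(binomial_pmf(k, 9, p) for k in range(r))

# Cross-check: the third winner arrives on interview 3, 4, ..., or 9.
prob_at_most_9 = sum(negative_binomial_pmf(y, r, p) for y in range(r, 10))

print(round(prob_at_least_10, 4))  # 0.947
print(round(prob_at_most_9, 4))    # 0.053
```

The two probabilities add up to 1, which is why the binomial shortcut (three terms) and the negative binomial sum (seven terms) agree.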

*We are going to keep the basic premise of this example for the next problem.*2387

*You want to hang on to the scenario here, this business about the company interviewing to fill 3 jobs.*2392

*We are going to hang onto that, we will use the same numbers, and then*2401

*we are going to introduce the wrinkle of how long each interview takes, in the next problem.*2404

*In example 5 here, we are going to be referring back to example 4.*2412

*If you have not just worked through example 4, what I want you to do is go back and read over the scenario from example 4,*2416

*because we need that to understand example 5.*2423

*What this company is doing is they are interviewing applicants for a job and they have 3 openings.*2428

*They want to keep interviewing until they get 3 people worthy of their openings.*2435

*Now, they are telling us that it takes them 3 hours to interview an unqualified applicant, 5 hours to interview a qualified applicant.*2442

*Remember, we are going to keep interviewing until we get 3 total qualified applicants.*2451

*Let us think about how long that will take and try to set up a formula for that.*2458

*I’m going to set up a variable: T is the time to find 3 qualified applicants.*2463

*Let us think about how that would break down.*2477

*If you are going to find 3 qualified applicants, remember Y is the number of applicants overall.*2481

*The number of applicants that we are going to talk to overall.*2491

*Some of them are qualified and some of them are not.*2495

*Y is the total number of applicants.*2499

*We are going to keep interviewing until we find 3 good ones.*2502

*What that means is if we talk to Y people total and there are 3 good ones, then there are Y -3 bad ones.*2506

*Each one of those bad people, each one of the unqualified ones, is going to take us 3 hours to talk to.*2515

*3 × (Y - 3), that is how much time we spent talking to people who are unqualified.*2522

*There are also 3 good people and each one of those is going to cost us 5 hours to check them out,*2529

*run the background check, and really confirm that those are good people for our job.*2535

*Let me simplify this expression.*2542

*T = 3 × (Y - 3) + 5 × 3, so I get 3 Y - 9 + 15, which is 3 Y + 6.*2544
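The time formula can be spelled out as a small function. This is an illustrative Python sketch; the names `total_time`, `UNQUALIFIED_HOURS`, `QUALIFIED_HOURS`, and `OPENINGS` are made up for the example, while the numbers come from the problem statement.

```python
UNQUALIFIED_HOURS = 3  # hours to interview an unqualified applicant
QUALIFIED_HOURS = 5    # hours to interview a qualified applicant
OPENINGS = 3           # we stop once 3 qualified applicants are found

def total_time(y):
    """Total hours if y applicants are interviewed overall (y >= 3)."""
    if y < OPENINGS:
        raise ValueError("need at least 3 interviews to find 3 qualified people")
    unqualified = y - OPENINGS
    return UNQUALIFIED_HOURS * unqualified + QUALIFIED_HOURS * OPENINGS

# The simplified form 3Y + 6 gives the same answer for every valid Y.
for y in range(3, 50):
    assert total_time(y) == 3 * y + 6

print(total_time(30))  # 96
```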

*What I really want to do is calculate the mean and standard deviation of T, of 3 Y + 6.*2552

*Let me calculate first the mean and standard deviation of Y itself.*2558

*For the standard deviation, I'm going to go through the variance, because that for me is the easier one to calculate.*2570

*First, I will calculate the mean of Y, the expected value of Y.*2577

*Remember, the mean and expected value are the same thing.*2582

*The formula we have for that is R/P, that is one of our earlier formulas.*2586

*I think it is on the third slide of this lecture.*2590

*That is for the negative binomial distribution.*2593

*The variance V of Y is RQ/P².*2596

*We already have our values for R, P, and Q.*2604

*Let me remind you what they are here.*2607

*The P was the probability that any given applicant has the skills, that is 10%, that is 1/10.*2611

*Q is always 1- P, that is 9/10.*2618

*R is the number of successes that we are looking for.*2625

*In this case, we have 3 job openings, I got that by the way from example 4, the fact that we have 3 job openings.*2629

*R is 3 in this case, let me go ahead and calculate this mean and variance using those numbers.*2636

*3 divided by 1/10 is 3 × 10, that is 30.*2644

*For the variance, R here is 3, Q is 9/10, and P² is (1/10)², which is 1/100, so that is 100 × 27/10.*2652

*I’m going to flip the 1/100, so that is 10 × 27, which is 270.*2667

*Those are 30 and 270, but they are the mean and the variance of Y, not of T.*2673
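The arithmetic for the mean and variance of Y can be reproduced exactly with fractions. This is a small Python sketch under the lecture's values r = 3 and p = 1/10; the variable names are illustrative.

```python
from fractions import Fraction

r = 3                # successes we are waiting for (job openings)
p = Fraction(1, 10)  # chance that an applicant is qualified
q = 1 - p            # chance that an applicant is not qualified

mean_y = r / p          # negative binomial mean, r/p
var_y = r * q / p**2    # negative binomial variance, rq/p^2

print(mean_y, var_y)  # 30 270
```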

*In order to find the mean and variance of T, we have to remember a couple of rules of probability here.*2678

*Let me remind you what those were.*2687

*The expected value of AY + B, expectation is linear, it is A × E of Y + B.*2688

*Variance is not linear, variance of AY + B.*2700

*The interesting thing is that the B does not affect it at all.*2704

*Essentially, variance measures how much a variable wobbles.*2707

*If you take a variable and just move everything over, that does not change how much it wobbles.*2711

*There is no B in the answer, it is A² × V of Y.*2718
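For reference, here is why the B drops out. This short derivation is not in the lecture; it only uses the definition of variance, V(Y) = E[(Y - E(Y))²], and the linearity of expectation.

```latex
\begin{align*}
E(aY + b) &= a\,E(Y) + b, \\
V(aY + b) &= E\big[(aY + b - E(aY + b))^2\big] \\
          &= E\big[(aY - a\,E(Y))^2\big] \\
          &= a^2\,E\big[(Y - E(Y))^2\big] = a^2\,V(Y).
\end{align*}
```

The b cancels in the second line because it appears in both aY + b and its expected value.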

*We are going to use those two rules to help us calculate the mean and the variance of T.*2724

*E of T, T was 3 Y + 6, we worked that out up above there.*2732

*E of 3 Y + 6 which using my little formula over there is, the A is 3.*2740

*That is A, that is B, it is 3 E of Y + 6.*2750

*The E of Y was 30, that is 3 × 30 + 6 which simplifies down to 96.*2759

*Our units here are hours, we get 96 hours is the expected time that*2766

*it will take this company to interview all these people and get 3 qualified applicants.*2777

*The variance, which is not really what we are looking for, we are looking for the standard deviation but it is very useful to find the variance*2784

*because I can just take the square root of that to find standard deviation.*2794

*The variance of 3 Y + 6, I’m going to use my formula A² V of Y, that is 9 × V of Y which is 9 × 270.*2798

*I’m going to leave that factored because the next thing I'm going to do is take its square root.*2813

*It would be easier to do if it is factored.*2819

*Let me find the standard deviation now.*2821

*The standard deviation of that is always the square root of the variance, V of T.*2826

*√(9 × 270), and the reason I left it factored is that I can pull out a 3 to get 3√270.*2836

*I know I can pull out another 3 because there is still a factor of 9 over there.*2844

*3 × 3 × √30.*2850

*9 √ 30, that is not going to get any better until I pull out a calculator.*2853

*I did throw that into my calculator and got 49.295 hours.*2864

*That was of course an approximation; that is the standard deviation of*2875

*the time that the company will need to conduct all these interviews.*2882
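The last few steps can be verified in a few lines. This is a hedged Python sketch that takes E(Y) = 30 and V(Y) = 270 as given and applies the two rules to T = 3Y + 6; the variable names are illustrative.

```python
from math import sqrt

mean_y, var_y = 30, 270  # mean and variance of Y from the negative binomial
a, b = 3, 6              # T = a*Y + b = 3Y + 6

mean_t = a * mean_y + b  # expectation is linear
var_t = a**2 * var_y     # the constant b drops out, a comes out squared
sd_t = sqrt(var_t)       # same as 9 * sqrt(30)

print(mean_t, var_t, round(sd_t, 3))  # 96 2430 49.295
```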

*That answers both of the questions that were posed there.*2890

*Let me show you where those all came from.*2894

*First of all, the basic parameters of this problem came from example 4.*2897

*If you are a little mystified as to where these numbers up here came from, just go back and look at example 4.*2900

*The premise of that problem was that 10% of the people we are interviewing are actually qualified.*2907

*We have a 10% chance of success every time we invite someone in to the office.*2913

*That means we have a 9/10 chance of failure.*2920

*9/10 of the people do not have the right skills for this particular job.*2923

*We have 3 job openings, that is where the R = 3 comes from.*2926

*The tricky part here is to set up an expression for the time to find the 3 qualified people that we are looking for.*2932

*If Y is the total number of people we interview then that means 3 of them are qualified.*2943

*3 of them are going to get that full 5 hour interview, that is where that 3 came from.*2949

*All the rest of the people are unqualified, that is Y -3 people are left over and*2955

*they are going to get the shorter interview, the 3 hour interview, before we realize that they are unqualified.*2961

*If you just simplify the arithmetic there, it simplifies down to 3 Y + 6.*2968

*We are going to have to find the expected value and the standard deviation of that.*2974

*I dredged up a couple of old and very useful rules on expectation and variance.*2980

*The expected value of AY + B, since expectation is linear, it is just A × expected value of Y + B.*2986

*With variance, it is not linear.*2994

*What happens is the B disappears and you get A² coming out.*2995

*The mean and the variance of Y by themselves, we will need those as a steppingstone to find the mean and variance of the time.*3002

*These are formulas that I just got off the third slide in this video.*3011

*Just check that slide and you will see the R/P and the RQ/P².*3017

*And I'm just dropping the values of P, Q, R to get the 30 and the 270 here.*3021

*I have to find the mean and variance of T.*3028

*Remember, T is 3 Y + 6, that is where I use my old linearity formula.*3032

*I drop in 3 × the expected value of Y + 6, that 30 comes from here, and that is how I get 96 hours.*3038

*This company should plan to spend about 96 hours, if they intend to get 3 good people to fill their 3 jobs.*3049

*The variance there, this 3² is 9, that is using this formula right here.*3057

*The 6 just plays no role at all in there because it is like this B, that just disappears.*3063

*We get 9 × 270.*3069

*The standard deviation is always the square root of the variance.*3072

*I do √(9 × 270) and that reduces down to a decimal approximation of 49.295 hours.*3075
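As a sanity check on the 96 hours and 49.295 hours, the whole interviewing process can be simulated. This is a rough Monte Carlo sketch, not something from the lecture: the function name `simulate_total_hours`, the seed, and the number of runs are arbitrary choices, and the results are only approximate.

```python
import random
from statistics import mean, pstdev

def simulate_total_hours(rng, p=0.1, openings=3,
                         hours_unqualified=3, hours_qualified=5):
    """Interview applicants until `openings` qualified ones are found."""
    hours = 0
    hired = 0
    while hired < openings:
        if rng.random() < p:   # this applicant is qualified
            hired += 1
            hours += hours_qualified
        else:                  # this applicant is unqualified
            hours += hours_unqualified
    return hours

rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [simulate_total_hours(rng) for _ in range(20_000)]

# Both values should land near the exact answers, 96 and about 49.3.
print(mean(samples), pstdev(samples))
```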

*That wraps up this lecture on negative binomial distribution.*3087

*This is part of the probability series here on www.educator.com.*3091

*My name is Will Murray, thank you very much for joining us, bye.*3096

3 answers

Last reply by: Dr. William Murray

Mon Mar 9, 2015 9:26 PM

Post by Anhtuan Tran on February 26, 2015

Hi Dr Murray,

I have another question on example 4, part b.

Let X be the number of interviews.

We're trying to calculate P(X>= 10). P(X>=10) also means that we don't get 3 qualified applicants in the first 9 interviews and then it becomes a simple binomial distribution problem. I totally agree with you about that.

But here is my other approach. Assume that I want to find P(X <= 9) and then my P(X>=10) = 1 - P(X<=9).

P(X<=9) means that as long as I get the 3 qualified applicants in the first 9 interviews. Therefore, P(X<=9) = 9C3 . p^3 . q^6. And then I just subtract it from 1.

However, I didn't get the same answer. I have a feeling that something is wrong with P(X<=9), but I couldn't figure out the reason why it didn't work.

Thank you.

3 answers

Last reply by: Dr. William Murray

Fri Feb 27, 2015 1:19 PM

Post by Anhtuan Tran on February 21, 2015

Hi Professor Murray,

I was trying to figure out what kind of distribution of this problem was, but I couldn't. Here is the problem: There are 2n balls with n different colors. Each color has 2 balls. Draw balls until getting a ball with the color that has appeared before. Find the distribution.

There are two cases:

i/ drawing with replacement

ii/ drawing without replacement.

How do you approach this problem?

Thank you.

4 answers

Last reply by: Dr. William Murray

Mon Jan 12, 2015 10:24 AM

Post by Anton Sie on December 26, 2014

Hi, I have a question about example I: how do you calculate the same chance, but without replacement?