Lecture Comments (13)

3 answers

Last reply by: Dr. William Murray
Mon Mar 9, 2015 9:26 PM

Post by Anhtuan Tran on February 26, 2015

Hi Dr Murray,
I have another question on example 4, part b.
Let X be the numbers of the interviews.
We're trying to calculate P(X>= 10). P(X>=10) also means that we don't get 3 qualified applicants in the first 9 interview and then it becomes a simple binomial distribution problem. I totally agree with you about that.

But here is my other approach. Assume that I want to find P(X <= 9) and then my P(X>=10) = 1 - P(X<=9).
P(X<=9) means that as long as I get the 3 qualified applicants in the first 9 interviews. Therefore, P(X<=9) = 9C3 . p^3 . q^6. And then I just subtract it from 1.
However, I didn't get the same answer. I have a feeling that something is wrong with P(X<=9), but I couldn't figure out the reason why it didn't work.

Thank you.

3 answers

Last reply by: Dr. William Murray
Fri Feb 27, 2015 1:19 PM

Post by Anhtuan Tran on February 21, 2015

Hi Professor Murray,
I was trying to figure out what kind of distribution of this problem was, but I couldn't. Here is the problem: There are 2n balls with n different colors. Each color has 2 balls. Draw balls until getting a ball with the color that has appeared before. Find the distribution.
There are two cases:
i/ drawing with replacement
ii/ drawing without replacement.
How do you approach this problem?
Thank you.

4 answers

Last reply by: Dr. William Murray
Mon Jan 12, 2015 10:24 AM

Post by Anton Sie on December 26, 2014

Hi, I have a question about example I: how do you calculate the same chance, but without replacement?

Negative Binomial Distribution


  • Intro 0:00
  • Negative Binomial Distribution 0:11
    • Negative Binomial Distribution: Definition
    • Prototypical Example: Flipping a Coin Until We Get r Successes
    • Negative Binomial Distribution vs. Binomial Distribution
    • Negative Binomial Distribution vs. Geometric Distribution
  • Formula for Negative Binomial Distribution 3:39
    • Fixed Parameters
    • Random Variable
    • Formula for Negative Binomial Distribution
  • Key Properties of Negative Binomial 7:44
    • Mean
    • Variance
    • Standard Deviation
  • Example I: Drawing Cards from a Deck (With Replacement) Until You Get Four Aces 8:32
    • Example I: Question & Solution
  • Example II: Chinchilla Grooming 12:37
    • Example II: Mean
    • Example II: Variance
    • Example II: Standard Deviation
    • Example II: Summary
  • Example III: Rolling a Die Until You Get Four Sixes 18:27
    • Example III: Setting Up
    • Example III: Mean
    • Example III: Variance
    • Example III: Standard Deviation
  • Example IV: Job Applicants 24:00
    • Example IV: Setting Up
    • Example IV: Part A
    • Example IV: Part B
  • Example V: Mean & Standard Deviation of Time to Conduct All the Interviews 40:10
    • Example V: Setting Up
    • Example V: Mean
    • Example V: Variance
    • Example V: Standard Deviation
    • Example V: Summary

Transcription: Negative Binomial Distribution

Hi and welcome back to the probability lectures here on

Today, we are going to talk about the negative binomial distribution.0005

My name is Will Murray, let us jump right on in.0009

The negative binomial distribution describes a sequence of trials, each of which can have two outcomes, success or failure.0013

You want to think about this as like flipping a coin.0020

It is very much like the geometric distribution except that we are going to keep flipping a coin,0023

we are going to keep running these trials until we get R successes.0030

R is some predetermined constant positive number.0034

That R is a constant that we have decided in advance.0039

For example, we might say I'm going to keep flipping this coin until I have seen heads 5 times.0047

I keep flipping a coin and I got tails, heads, tails, tails, heads, tails, tails, heads, heads, heads.0054

I stop as soon as I see that 5th head.0061

It is different from the binomial distribution.0065

You know it sounds like the binomial distribution.0068

With the binomial distribution, we decide ahead of time how many times we are going to flip a coin0072

and then we keep track of the number of heads after it is all over.0078

With the negative binomial distribution, we decide ahead of time0082

how many heads we want to see and we flip for as long as it takes to get that number of heads.0087

The negative binomial distribution is actually more similar to the geometric distribution.0094

If you take R = 1 that means we are going to keep flipping a coin until we see the first head0100

and that is exactly the geometric distribution.0106

In some sense, the negative binomial distribution is a generalization of the geometric distribution.0109

By the way, if you have not watched the lectures on the binomial and geometric distributions, you probably want to watch those first.0116

The geometric distribution is the one just before this video.0122

Just click back, watch the lecture on the geometric distribution.0127

Once you understand that very well, it will be time to come back and look at the negative binomial distribution today.0130

I have been talking about this in terms of flipping a coin.0139

It does not have to be a coin flip, it can be any kind of situation where you have a sort of binary outcome either you have a success or failure.0143

For example, you could be rolling a die, let us say you are rolling a die and trying to get a 6.0153

Let us say you want to get a certain number of 6’s when you roll this die.0160

You keep rolling and rolling and rolling, each time you roll,0164

you either record it as 6 which would be a success, or as not a 6 which would count as a failure.0167

It does not have to be flipping a coin, it could be rolling a die,0176

it could be watching your favorite team compete and try to win the World Series every year.0179

Maybe you like the New York Yankees and you want to see will the Yankees win the World Series this year,0187

will they win the World Series next year, will they win the World Series the year after that?0193

Each year, either they win the World Series which you would consider a success, if you are a fan of the Yankees,0196

or you consider it a failure if they do not win.0202

All of these different situations can essentially be described by the same mathematical process which is the negative binomial distribution.0205

Let us go ahead and look at the formulas for that.0218

There are several fixed parameters before you start.0221

P is the probability of success on each trial.0224

If you are flipping a coin and it is a fair coin, not loaded, then P would be ½.0227

because that is your probability of getting a head each time you flip a coin.0233

If you are rolling a die and you are trying to get a 6 then P would be 1/6.0237

If you are watching the New York Yankees try to win the World Series, I do not know what the exact probability is,0242

but let us say it is 1/10 because on average, every 10 years they win the World Series.0249

Q is the probability of failure.0255

You do not really need to know that ahead of time because Q is always equal to 1 - P.0258

You can always just fill in 1 - P for Q.0263

R is the number of successes that you want to see and that is the number that you have to decide on ahead of time.0267

In the probability class, that is a number that you should figure out from the problem that you have been given somehow.0275

You decide ahead of time, when I'm flipping a coin, I want to see 3 heads.0282

If I'm keeping track of the Yankees winning the World Series,0289

I want to know how long will it be until they have 5 more World Series championships.0292

The random variable that we are watching for the negative binomial distribution is the number of trials,0298

in order to get R successes.0304

We are going to run this experiment until we get R successes and when we get that last one,0307

we stop and we count how many trials it took to get those successes.0313

Finally, we have the formula for the probability distribution.0320

It is (Y - 1) choose (R - 1); that is not a fraction, that is the binomial coefficient.0323

That is the formula for combinations, let me write it out here.0330

(Y - 1)! / [(R - 1)! × ((Y - 1) - (R - 1))!], and (Y - 1) - (R - 1) is just Y - R, so the denominator is (R - 1)! × (Y - R)!.0334

That is what that coefficient means.0348

Then times P^R and Q^(Y - R). Notice I changed the possible range of values Y can take.0350

That is because, if you are looking for R successes, you know it is going to take at least R trials.0361

That is why instead of saying Y greater than or equal to 1, we change that to R.0367

If you are flipping a coin until you get 5 heads, you know it is going to take at least 5 flips.0372

There is no point in even asking, what is the probability of it happening in 4 flips?0377

If you want the Yankees to win the World Series 15 times, there is no point in asking0384

whether that is going to happen in the next 10 years because it cannot.0391

It will take them at least 15 years to win 15 more World Series.0395
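The formula just described can be sketched in a few lines of Python. This is only an illustration: the function name `neg_binomial_pmf` and its argument names are assumptions chosen here, not something from the lecture slides.

```python
from math import comb


def neg_binomial_pmf(y: int, r: int, p: float) -> float:
    """P(Y = y): chance that the r-th success happens on trial number y."""
    if y < r:
        # Fewer than r trials can never produce r successes.
        return 0.0
    q = 1.0 - p
    # (y-1 choose r-1) * p^r * q^(y-r)
    return comb(y - 1, r - 1) * p**r * q ** (y - r)


# Fair coin: chance that the 5th head shows up on exactly the 10th flip.
print(neg_binomial_pmf(10, 5, 0.5))

# With r = 1 the formula reduces to the geometric distribution q^(y-1) * p.
print(neg_binomial_pmf(4, 1, 0.5), 0.5**3 * 0.5)
```

Asking for 5 heads in 4 flips returns 0, which matches the point above that Y starts at R rather than at 1.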

That is the probability distribution and there are some more quantities associated with this that we need to know.0400

By the way, I want to continue to highlight the fact that there are two different P's in this formula and0409

they are representing different things.0417

That is kind of unfortunate.0419

There is that P and then there is that P.0422

This P right here, the one in P of Y, is the probability that it takes Y trials overall to get a certain number of successes.0425

This P right here, the one raised to the R, represents the probability of success on each trial.0442

Make sure you do not mix up those two P’s because that is a good way to fail a probability class.0448

Let us keep moving, let us learn some key properties of the negative binomial distribution.0461

There is the mean which is the expected value.0467

Remember, mean and expected value always are synonymous with each other, those are the same thing.0471

Expected value is just R/P.0477

The variance of the negative binomial distribution is RQ/P².0483

The standard deviation is always the square root of the variance.0490

Usually, the way you calculate it is by calculating the variance first and then taking the square root.0495

Once you have the variance, you just take the square root of that to get the standard deviation, √(RQ)/P.0501
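One standard way to see where these formulas come from, which is not part of this lecture's slides, is to split the wait into R separate geometric waits, one per success. The symbols G_1, ..., G_R below are introduced here only for illustration; each is the number of trials needed for the next success.

```latex
\begin{align*}
Y &= G_1 + G_2 + \cdots + G_R, \qquad E(G_i) = \frac{1}{P}, \quad V(G_i) = \frac{Q}{P^2} \\
E(Y) &= \sum_{i=1}^{R} E(G_i) = \frac{R}{P} \\
V(Y) &= \sum_{i=1}^{R} V(G_i) = \frac{RQ}{P^2} \qquad \text{(the } G_i \text{ are independent)} \\
\sigma &= \sqrt{V(Y)} = \frac{\sqrt{RQ}}{P}
\end{align*}
```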

Let us practice using the negative binomial distribution.0510

In our first example, we are going to draw cards from a deck and we want to see 4 aces.0515

A question you should always ask with this kind of selection problem is whether there is replacement or no replacement.0523

Meaning, after I draw a card out of the deck, do I put it back in the deck or0530

do I just hang onto it and then draw a different card the next time.0534

In this case, we are being given that we are replacing cards back into the deck.0538

We are going to draw a card, put it back, draw a card, put it back.0545

The question is how long it will take to get exactly 4 aces?0549

In particular, what is the chance that it will take exactly 20 draws, in order to get 4 aces?0556

This is a negative binomial distribution problem.0563

Let me identify the parameters that we are dealing with here.0570

The probability of getting an ace on any given draw, there are 4 aces in there out of 52 possible cards, that is just 1/13.0573

Q is always 1- P, that is 1 -1/13 is 12/13.0583

The R in this case is the number of aces that we want to get.0591

We want to get 4 aces here, we are going to stop the experiment after we find that 4th ace.0596

The value of Y that we are interested in, Y is the number of times we are going to have to draw.0603

We are interested in the probability that we will draw exactly 20 times.0611

Let me remind you of the distribution formula.0616

P of Y is equal to (Y - 1) choose (R - 1), this is the negative binomial distribution formula that I gave you a couple slides ago.0619

× P^R × Q^(Y - R).0629

I’m going to fill in all those values that I recorded above.0632

Y - 1 is 19, R - 1 is 4 - 1 = 3, P is 1/13.0634

So we have (1/13)^R where R is 4, and Q is 12/13, raised to Y - R, which is 20 - 4 = 16.0643

This simplifies a bit but it is not going to get much nicer.0655

19 choose 3 × (1/13)⁴, since my R was 4.0658

It looks like I’m going to have 13²⁰ in the denominator.0669

In the numerator, I’m going to have 12¹⁶.0674

I did not bother to find the decimal for that, it would be a very small number because it is not very likely0679

that you will draw exactly 20 times, that you will get your 4th ace on exactly the 20th draw.0686

I would just leave the answer in that form and present that as my answer.0695
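For anyone who does want the decimal, here is a minimal sketch using Python's exact fractions. The variable names are chosen here for illustration. It plugs R = 4, Y = 20 and P = 1/13 into the formula above and confirms the result equals 19 choose 3 × 12¹⁶/13²⁰.

```python
from fractions import Fraction
from math import comb

r, y = 4, 20              # want the 4th ace on exactly the 20th draw
p = Fraction(1, 13)       # chance of an ace on one draw, with replacement
q = 1 - p                 # chance of a non-ace

# Negative binomial formula: (y-1 choose r-1) * p^r * q^(y-r)
prob = comb(y - 1, r - 1) * p**r * q ** (y - r)

# Same value as the fraction left on the slide.
print(prob == Fraction(comb(19, 3) * 12**16, 13**20))  # True
print(float(prob))
```

The decimal comes out a little under 1%, consistent with the remark that this is a very small number.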

Let me show you the steps involved there.0700

First, you realize this is a negative binomial distribution problem0703

because we are running trials over and over again until we get a certain fixed number of successes.0708

We do not know how many trials we are going to run ahead of time, that is why it is not a binomial distribution.0715

But we do know how many successes we are going to have.0720

We want to get 4 aces, that is where my R = 4 comes from, that 4 right there.0724

The probability of getting an ace is 4 out of 52.0730

Because there are 4 aces in a deck out of 52 cards and that is 1/13.0733

Q is always 1 – P.0738

Since we are interested in drawing exactly 20 times, I'm going to use Y = 20.0740

This is my negative binomial distribution formula and I will just drop all the numbers in there and simplify down to a not very pleasant fraction there.0745

In example 2, we have the Akron Aardvarks going out for the chinchilla grooming championship.0760

Apparently, each year they have a 10% chance of winning the championship.0768

They are being rather optimistic at home, they built themselves a trophy case with space for 5 championship trophies.0774

We want to know how long we will have to wait until that trophy case is completely full.0782

In particular, we want to find the mean and the standard deviation of Y there.0789

Once again, this is a negative binomial distribution problem because each year they are going to go out0795

and they are either going to win a trophy or they will not win a trophy.0802

We are interested in how long it will take to get 5 trophies?0807

They have to win 5 times to fill their trophy case.0812

Let me identify the parameters here for the negative binomial distribution.0819

The probability that they will win on any given year is 10%, that is 1/10.0824

That means Q, which is always the probability of failure, 1 - P, is 9/10.0829

The number of times that they want to win, that R, that is 5.0836

We want to find the mean and the standard deviation of the time for them to win.0841

Let me remind you of the formula for the mean.0850

We had this on one of the earlier slides.0853

If you check back a couple slides ago, you will see this.0856

It was R/P, I will go ahead and write down the formula for the variance.0859

V of Y is RQ/P².0865

In this case, our R is 5 and our P is 1/10, so 5 divided by 1/10, flipping the denominator, is 5 × 10 = 50.0871

That is the expected time to fill up their trophy case.0882

That, by the way, is a very intuitive answer.0886

It definitely conforms to your intuition, which is that they have a 1/10 chance of winning each year.0890

On average, they are going to win about once every 10 years.0898

If they want to win 5 times, we expect it to take about 50 years for them to bring home 5 trophies.0903

Let us keep going with the variance here.0909

RQ is 5 × 9/10, P² is 1/100.0912

If I flip, I get 100 × 45/10 which simplifies down to 450, that was the variance, that is not the standard deviation.0924

The way you get the standard deviation is you calculate the variance first and then you take its square root.0936

Let me go ahead and label what I’m doing in each step here.0943

That was the mean, this is the variance, and now I'm about to calculate the standard deviation.0947

The standard deviation is always the square root of the variance, √ 450.0960

That simplifies a little bit, I can take a 9 out of there right away.0968

When it comes outside, it will be a 3, since 450 is 9 × 50.0973

I can take a factor of 25 out of 50; pulling that out of the square root gives me another 5 outside the square root.0977

That gives 15 × √2.0984

That does not simplify anymore, but I did throw that into my calculator,0989

and got a decimal approximation, it was 21.21 years.0995

What that tells me is if the Akron Aardvarks have built this lovely new trophy case with space for 5 trophies1009

and we want to know how long you are going to have to wait until their trophy case is full,1017

on average they are going to have to wait about 50 years.1023

The standard deviation on that estimate is 21.21 years.1025
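The trophy-case arithmetic can be double-checked with a short sketch. The helper name `neg_binomial_summary` is hypothetical, and exact fractions are used so that the mean and variance come out as whole numbers rather than rounded floats.

```python
from fractions import Fraction
from math import sqrt


def neg_binomial_summary(r, p):
    """Return (mean, variance, standard deviation) of the number of trials."""
    q = 1 - p
    mean = r / p               # R / P
    variance = r * q / p**2    # R * Q / P^2
    return mean, variance, sqrt(variance)


# Example 2: R = 5 trophies, P = 1/10 chance of winning each year.
mean, variance, sd = neg_binomial_summary(5, Fraction(1, 10))
print(mean, variance, round(sd, 2))
```

The same helper works for any R and P, so it can be reused to check the die-rolling example that follows.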

Let me go back over those steps.1033

This is a negative binomial problem, we want to identify the probability of winning on any given year.1035

That is 1/10, that comes from the 10% right here.1042

That means the probability of losing the Q is 9/10 and we want to win 5 years total.1047

That is why we put in our R is equal to 5 there.1058

I recalled the formulas for the mean and variance.1061

I got these back off on the slides just a little bit earlier in the lecture.1065

Just flip back a couple of slides in the lecture and you will see these formulas for the mean and the variance.1070

We just drop in the numbers for R, Q, and P.1075

We get the mean of 50 years.1078

The formula for the variance gives me 450 and that is not the answer, that is not the standard deviation.1081

To get the standard deviation, you take the square root of the variance.1088

√450 is approximately 21.21 years, that is the standard deviation of waiting time for their trophy case to be full.1093

In example 3 here, we are going to roll a die until we get four 6's.1109

I just keep rolling and then I just keep track of the number of times I have seen a 6.1114

I want to find the mean and the standard deviation of the number of rolls we will make.1120

Let us first recognize this as a negative binomial distribution because every time we roll a die,1126

either we get a 6 and we call that a success or we do not get a 6 and we call that a failure.1136

We roll and roll: if we get a 6, that is a success; if we do not get a 6, it is a failure.1141

We want to get four 6's total, we want to get 4 successes.1146

Let me record the parameters of this distribution.1152

Our probability of getting a success, our probability of rolling a 6 when we roll a die is 1/6.1158

Q is always 1- P, that is the probability of failure is 5/6.1164

R is the number of successes we want to get here; we want four 6's, so that is 4.1172

Our expected value of Y is always, this is the mean, that is always R/P.1179

I told you that on the third slide of this lecture, the same time1188

that I told you the variance of the negative binomial distribution is V of Y is RQ/P².1193

Let me go ahead and fill in the numbers here.1205

R/P is 4 divided by 1/6, do the flip there that is 4 × 6 is 24, that was our mean.1206

That tells us on average, it will take us 24 rolls.1218

Not at all surprising there because on average, we are going to roll a 6 once every 6 rolls.1221

If I want to get four 6's, it will take 24 rolls on average.1229

The variance is not intuitively obvious.1232

R is 4, Q is 5/6, let us multiply it by 5/6, that is not a mixed number 4 and 5/6.1236

4 × 5/6 and P² is (1/6)², which is 1/36.1247

If I do the flip there, I will get 36 × 4 × 5/6.1255

I can simplify that 36 into a 6, that 6 goes away.1264

6 × 4 × 5: 6 × 4 is 24, and 24 × 5 is 120.1269

That is variance, not standard deviation.1279

To get standard deviation, you just take the square root of the variance, that is always true.1282

You usually compute the variance first and then just take its square root, √ 120.1289

120 has a factor of 4, so I can pull a 2 out of there.1298

It leaves me with 30 under the square root and that does not really simplify anymore1303

but I did put that in my calculator, before I started this.1309

What my calculator told me was 2 √30 is approximately 10.95.1313

My units there are the number of rolls that I'm going to have to do, in order to see four 6's.1320
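Since the variance is the less intuitive number, a quick simulation is one way to sanity-check it. The sketch below is only an illustration; the function name, the seed, and the sample size are arbitrary choices made here. It rolls a simulated die until four 6's appear, repeats that many times, and compares the sample mean and standard deviation with 24 and 10.95.

```python
import random
from statistics import mean, pstdev


def rolls_until_sixes(r, rng):
    """Roll a fair die until r sixes have appeared; return the number of rolls."""
    rolls = successes = 0
    while successes < r:
        rolls += 1
        if rng.randint(1, 6) == 6:   # a 6 counts as a success
            successes += 1
    return rolls


rng = random.Random(0)               # fixed seed so the run is repeatable
samples = [rolls_until_sixes(4, rng) for _ in range(50_000)]

# These should land close to the theoretical 24 and about 10.95.
print(round(mean(samples), 2), round(pstdev(samples), 2))
```

The simulated values will not match exactly, but with this many repetitions they should sit within a few tenths of the formula values.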

That wraps up that one, let me recap the steps there.1331

We identify that as a negative binomial distribution because it is a process we are repeating over and over,1335

until we get success a certain number of times.1342

In this case, success is defined as rolling a 6 and we want to get four 6's, that is, 4 successes.1345

I have identified my parameters there, P is the probability of rolling a 6 on any given roll, that is 1/6.1355

Q is always 1 - P and R is the number of successes that we are looking for, that is 4, we were told.1362

That is why I got R = 4.1371

The mean of the negative binomial distribution and the variance of the negative binomial distribution,1375

those are formulas that I gave you on one of the earlier slides, in terms of R and P, and Q.1382

The mean is R/P, variance is RQ/P².1388

I just have to drop in the numbers that I had already written down above.1393

The P is 1/6, Q is 5/6, R is 4.1397

Drop those in and simplify them down.1402

I got a mean of 24 rolls which is very intuitive because if you get a 6 every 6 rolls on average,1404

and you want to get four 6's, it is going to take 24 rolls on average.1413

The variance is less intuitive, we just kind of follow the formula and these numbers simplify down to 120.1418

To get the standard deviation, you always take the square root of the variance and1429

that simplifies down to an approximation of 10.95 rolls.1433

In example 4 here, we have a company which is interviewing applicants for a job.1442

Apparently, the company has 3 positions to fill.1451

Perhaps, they are looking for 3 programmers, they are all identical positions.1454

They start interviewing people and it turns out that exactly 10% of all the possible applicants1461

actually have the qualifications for the job.1468

They actually know the right programming languages and they have the other skills necessary to do the job.1471

Every time we interview somebody, there is a 10% chance that they will be good enough and we will hire them.1480

We are going to keep interviewing and interviewing and interviewing one person at a time until we get 3 good people.1488

We will hang onto them and at that point, we will close the door, everybody else has to wait.1494

There are two questions here.1499

What is the probability that they will interview exactly 10 applicants and then the probability1500

that they will interview at least 10 applicants?1506

Let us identify this as a negative binomial distribution.1510

We are doing trials, each one has success or failure, meaning we talk to an applicant.1516

If they have the skills, we hire them.1524

If not, we show them the door.1526

Because we are looking for 3 successes, we are looking for 3 good people right now.1529

That kind of plays into the definition of negative binomial distribution, when you are looking for a fixed number of successes1537

and you will just keep interviewing and keep looking and looking and looking, keep running trials until you get those 3 successes.1544

Let me identify the parameters here.1552

First of all, the probability of getting a success on any given trial is 1/10, that is because 10% of the applicants have the skills.1554

Q is always 1- P, in this case 1 -1/10 is 9/10.1562

R is the number of successes you are looking for.1570

In this case, we are looking for 3 people.1572

For part A, we are trying to interview exactly 10 applicants.1577

We want to use Y is equal to 10, let me remind you of the formula for the negative binomial distribution.1589

It is always (Y - 1) choose (R - 1) × P^R × Q^(Y - R).1598

Let me fill in the numbers because I think I have identified all the values of those numbers there.1610

Y was 10 in this case, so let me go ahead and write P of 10: Y - 1 is 9, and R - 1 is 3 - 1 = 2, so that is 9 choose 2.1618

P is 1/10, so (1/10)^R with R = 3, and Q is 9/10, raised to Y - R, which is 10 - 3 = 7.1633

This simplifies a little bit, it is not great but 9 choose 2, I can simplify that as 9 × 8 divided by 2.1649

That is because there is a 7! top and bottom that get canceled there.1660

That is 9 ×, let us see, 9 × 4 is 36.1664

It looks like I have a 9⁷ in the numerator and a 10¹⁰ in the denominator.1677

And that does not really seem like it is going to get any better.1684

I’m just going to leave that the way it is.1688

You could find a decimal for that, I did not bother to put that into our calculator1691

and convert it to decimal because it was not a very revealing answer that I got.1695

It would be a fairly small decimal.1702

You do not expect to interview exactly 10 people and get lucky enough to get 3 good people in those first 10.1705

That is not very likely.1712
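To put a number on "fairly small", here is a brief sketch that evaluates part A exactly. The function name `prob_last_hire_on` is hypothetical, and the variable names are chosen here for illustration.

```python
from fractions import Fraction
from math import comb

p = Fraction(1, 10)   # chance that an applicant is qualified
q = 1 - p
r = 3                 # positions to fill


def prob_last_hire_on(y):
    """P(Y = y): the 3rd qualified applicant is interview number y."""
    return comb(y - 1, r - 1) * p**r * q ** (y - r)


part_a = prob_last_hire_on(10)
print(part_a == Fraction(36 * 9**7, 10**10))  # True: the fraction from the slide
print(float(part_a))
```

The decimal works out to about 0.017, so there is roughly a 1.7% chance that the third hire is exactly the tenth person interviewed.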

It looks like I have run out of space to answer part B.1714

I’m going to jump over to the next slide to answer part B.1717

Let me recap how we answered part A before I skip to the next slide.1721

It is a negative binomial distribution here, we want to keep interviewing people until we get exactly 3 successes.1729

Our probability is 1/10, that is coming from this 10% chance of succeeding on any given applicant.1737

Q is 9/10, it is always 1 – P.1745

R = 3, that comes from 3 positions to fill.1748

That is why we are looking for R is equal to 3 here.1755

And then for part A, we want to interview exactly 10 applicants, that is why we are using Y is equal to 10.1758

I just dropped the Y, R, P, Q, into my generic formula for the negative binomial distribution.1765

I got 9 choose 2 × (1/10)³ × (9/10)⁷.1774

I simplified that a little bit but it did not really simplify into anything very nice.1779

For part B, let me go ahead and jump over to the next slide and we will do that.1788

This is still example 4, we are still trying to interview applicants to get 3 qualified people for these 3 job openings that we have.1795

The question is what is the probability of interviewing at least 10 applicants?1804

The way you want to think about that, let us think about why you would interview at least 10 applicants?1809

That really means that, since you are looking for 3 people, you failed to get 3 people in the first 9 applicants.1815

Let me write the answer to part B as the probability that we do not get 3 qualified applicants in the first 9 people we interview.1824

That is the way to think about that.1846

If you think about that, that means we are kind of looking at those first 9 people and1849

saying what is the chance that among those 9 people, we have fewer than 3 qualified applicants.1854

What that really is, is among those first 9 people, in the first 9 applicants, how many winners are we looking at?1865

If we are not going to get 3 that means we got either 0 winners, or 1 winner, or 2 winners, in the first 9 applicants.1882

This is no longer a negative binomial distribution because now we are looking at a fixed number of people, 9 applicants.1895

What we are really doing here is, we are now using not the negative binomial distribution but the binomial distribution.1905

Binomial distribution not the negative binomial distribution1918

because we are looking at a fixed number of people and asking what the probability is of getting a certain number of successes.1927

Let me remind you of the formula for the binomial distribution.1935

It is different from the formula for the negative binomial distribution.1939

For the binomial distribution, P of Y is equal to N choose Y × P^Y × Q^(N - Y).1942

If you do not remember that, just check back two videos here on

I think it was two videos ago where I had a lecture on the binomial distribution.1960

You can work through the video on the binomial distribution and then you will be ready to tackle the rest of this problem.1967

In this case, we are talking about 9 applicants.1975

Our N is the number of trials, N is 9.1977

The probability of getting a winner on any particular applicant is still 1/10, Q is still 1- P is 9/10.1982

The Y is the number of qualified applicants we get among those 9.1995

This is Y equal to 0, this is Y = 1, Y =2.2000

Let me fill in each one of those according to the formula.2005

That is 9 choose 0 × (1/10)⁰ × (9/10)⁹, plus 9 choose 1 × (1/10)¹ × (9/10)⁸,2008

plus 9 choose 2 × (1/10)² × (9/10)⁷, where the 7 is 9 - 2.2031

These numbers actually do combine in a fairly pleasant way.2051

I have worked out the fractions ahead of time and they were pretty nice.2055

Let me go ahead and then play with this a little bit.2058

9 choose 0, you can work that out from the formula but it is also the number of ways of choosing 0 things,2061

and there is one way to choose 0 things, which is just to take the empty set.2068

That is 1 × 9⁹/10⁹ + 9 choose 1, you can use the formula but that is the number of ways2072

to choose one thing out of 9 and there are definitely 9 ways to do that.2085

9 × 9⁸ over, it now looks like there is going to be, a 10⁹.2089

On the first one, I accidentally wrote 9¹⁰ in the denominator.2097

What I meant was 10⁹, be careful about that.2101

Let me go back and look at 9 choose 2.2106

9 choose 2, we worked this out before, it is 9 × 8/2 which simplifies down to 36.2111

36 × 9⁷, with 10⁹ in the denominator.2121

This simplifies a bit, we can pull out a 10⁹ as a common denominator.2130

It looks like all of these will have a factor of 9⁷.2137

On the first one, it is 9⁹ so there are two 9 left, 81.2145

81 + on the second one there is 9 × 9⁸, we pull out 9⁷ and that will be 81.2150

There is a 36 here and I was trying to be more clever after that and it did not really work out.2158

There was nothing good that happens.2164

I just threw those numbers into my calculator and I got kind of a huge number here.2167

I will just copy it down, 473513931/(5 × 10⁸).2171

What happened was there was a factor of 2 that canceled out of the 10⁹ there.2184

That is not a very revealing number but I wrote it down as a decimal.2189

What I got was 0.947, 94.7%.2197

That is the chance that we will interview at least 10 applicants and that is really not very surprising.2203

You expect that chance to be pretty high.2210

Remember, what is going on here is we are interviewing applicants until we get 3 applicants that have the necessary skills.2214

On average, 1 in 10 people have the necessary skills.2223

We are interviewing people and the question is, what is the chance that we have to interview at least 10 people to find 3 winners?2228

It is pretty highly likely that we will have to interview at least 10 people.2239

It is not very likely that you will get 3 winners out of the first 9 applicants, if we are only looking for 1 in 10, in general.2245

I guess it is about a 5% chance that you will get all your winners in the first 9 applicants.2254

There is almost a 95% chance that you will have to interview at least 10 people.2260
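The part B answer can be cross-checked two ways, sketched below with exact fractions and variable names chosen here for illustration. The first route is the binomial sum used above. The second route computes 1 - P(Y ≤ 9), where P(Y ≤ 9) has to be the sum of the negative binomial probabilities for Y = 3 through 9; a single term such as 9 choose 3 × P³ × Q⁶ only gives the chance of exactly 3 qualified people among 9, so it is not the same thing.

```python
from fractions import Fraction
from math import comb

p = Fraction(1, 10)   # chance that an applicant is qualified
q = 1 - p
r = 3                 # positions to fill

# Route 1: binomial, fewer than 3 qualified people among the first 9 interviews.
binomial_route = sum(comb(9, k) * p**k * q ** (9 - k) for k in range(3))

# Route 2: 1 - P(Y <= 9), adding negative binomial probabilities for y = 3..9.
neg_binomial_route = 1 - sum(
    comb(y - 1, r - 1) * p**r * q ** (y - r) for y in range(r, 10)
)

print(binomial_route == neg_binomial_route)  # True: both routes agree
print(binomial_route, float(binomial_route))
```

Both routes give 473513931/500000000, which is the 0.947 found above.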

Let me recap the steps here.2267

The way to think about this is to realize that interviewing at least 10 applicants means that if you have to talk to 10 people,2271

that means the first 9 people did not give you enough good ones.2283

You do not get 3 good ones out of the first 9.2286

If you think about that, that is really asking what if I interview 9 people, what is the chance of not getting 3 winners?2291

That is a fixed number of trials because there are 9 trials and we want to get fewer than 3 winners.2299

We are interested in getting a certain number of winners.2305

That is a binomial distribution, not a negative binomial distribution anymore.2308

Negative binomial is open ended where you just keep interviewing and interviewing,2313

until you get a certain number of winners.2317

Here, we are asking: with 9 interviews total, what is the chance of not getting 3 winners?2320

If you are not going to get 3 winners that means you got 0, or 1, or 2.2326

We just want to add up those 3 probabilities and we are going to use the formula for the binomial distribution,2332

not the negative binomial distribution.2338

This is the binomial distribution here and I just took this formula, and I plugged in the different values of Y, P, and Q, into this formula.2340

Then I simplified down the fractions.2351

They started out to combine nicely and gave this lovely number so I just converted it into a decimal.2355

That is the chance that you will end up having to talk to at least 10 people, and it is quite likely.2362
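The switch from negative binomial to binomial can also be checked numerically. This is a sketch only, assuming the standard form of the negative binomial formula, P(Y = y) = C(y − 1, r − 1) pʳ qʸ⁻ʳ; the function and variable names are invented for the illustration. It adds up the chance that the third winner shows up on interview 3, 4, ..., 9, subtracts from 1, and compares that with the binomial sum used above.

```python
from fractions import Fraction
from math import comb

def neg_binomial_pmf(y, r, p):
    """P(the r-th success happens exactly on trial y)."""
    return comb(y - 1, r - 1) * p**r * (1 - p)**(y - r)

def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

r, p = 3, Fraction(1, 10)

# Open-ended view: the third winner arrives somewhere in interviews 3..9.
done_by_9 = sum(neg_binomial_pmf(y, r, p) for y in range(r, 10))
tail_via_neg_binomial = 1 - done_by_9

# Fixed-trials view: fewer than 3 winners among the first 9 interviews.
fewer_than_3_in_9 = sum(binomial_pmf(k, 9, p) for k in range(3))

print(tail_via_neg_binomial == fewer_than_3_in_9)
```

Both routes describe the same event, so the two exact fractions agree; the binomial route just needs three terms instead of seven.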

If you are interviewing for 3 jobs, you want to fill all 3 jobs and only 1 in 10 people are good enough.2368

The chances are you are going to talk to at least 10 people, probably significantly more.2375

There is almost a 95% chance that you will talk to at least 10 people.2379

This example, we are going to keep the basic premise of this example for the next problem.2387

You want to hang on to the scenario here, this business about the company interviewing to fill 3 jobs.2392

We are going to hang onto that, we use the same numbers and then2401

we are going to introduce the wrinkle of how long each interview takes, in the next problem.2404

In example 5 here, we are going to be referring back to example 4.2412

If you have not just worked through example 4, what I want you to do is go back and read over the scenario from example 4,2416

because we need that to understand example 5.2423

What this company is doing is they are interviewing applicants for a job and they have 3 openings.2428

They want to keep interviewing until they get 3 people worthy of their openings.2435

Now, they are telling us that it takes them 3 hours to interview an unqualified applicant, 5 hours to interview a qualified applicant.2442

Remember, we are going to keep interviewing until we get 3 total qualified applicants.2451

Let us think about how long that will take and try to set up a formula for that.2458

I’m going to set up, variable T is the time to find 3 qualified applicants.2463

Let us think about how that would break down.2477

If you are going to find 3 qualified applicants, remember Y is the number of applicants overall.2481

The number of applicants that we are going to talk to overall.2491

Some of them are qualified and some of them are not.2495

Y is the total number of applicants.2499

We are going to keep interviewing until we find 3 good ones.2502

What that means is if we talk to Y people total and there are 3 good ones, then there are Y -3 bad ones.2506

Each one of those bad people, each one of the unqualified bombs, is going to take us 3 hours to talk to.2515

3 × (Y - 3), that is how much time we spent talking to people who are unqualified.2522

There is also 3 good people and each one of those is going to cost us 5 hours to check them out,2529

run the background check, and really confirm that those are good people for our job.2535

Let me simplify this expression, 3 × (Y - 3) + 5 × 3.2542

I get 3 Y - 9 + 15 which is 3 Y + 6.2544
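As a quick sanity check on that algebra, here is a small sketch. The function name `total_time` and its parameter names are hypothetical; the 3-hour and 5-hour figures and the 3 openings come from the problem.

```python
def total_time(y, r=3, hours_unqualified=3, hours_qualified=5):
    """Total interview hours when y applicants are seen and r of them are qualified."""
    return hours_unqualified * (y - r) + hours_qualified * r

# The unsimplified expression matches 3Y + 6 for every possible Y (Y is at least 3).
print(all(total_time(y) == 3 * y + 6 for y in range(3, 200)))
```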

What I really want to do is calculate the mean and standard deviation of T, of 3 Y + 6.2552

Let me calculate first the mean and standard deviation of Y itself.2558

To do that, I'm going through the variance because that for me is the easy one to calculate.2570

First, I will calculate the mean of Y, the expected value of Y.2577

Remember, the mean and expected value are the same thing.2582

The formula we have for that is R/P, that is one of our earlier formulas.2586

I think it is on the third slide of this lecture.2590

That is for the negative binomial distribution.2593

The variance V of Y is RQ/P².2596

We already have our values for R, P, and Q.2604

Let me remind you what they are here.2607

The P was the probability that any given applicant has the skills, that is 10%, that is 1/10.2611

Q is always 1- P, that is 9/10.2618

R is the number of successes that we are looking for.2625

In this case, we have 3 job openings, I got that by the way from example 4, the fact that we have 3 job openings.2629

R is 3 in this case, let me go ahead and calculate this mean and variance using those numbers.2636

3 divided by 1/10 is 3 × 10, that is 30.2644

For the variance, R here is 3, Q is 9/10, and P² is (1/10)², which is 1/100, so that is 100 × 27/10.2652

I’m going to flip the 1/100, so that is 10 × 27, which is 270.2667

270, but that is the variance and the mean of Y not of T.2673
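Those two numbers are easy to confirm with exact fractions. A minimal sketch, using the formulas R/P and RQ/P² quoted above; the variable names are chosen just for this illustration.

```python
from fractions import Fraction

r = 3                 # number of qualified applicants we need
p = Fraction(1, 10)   # chance an applicant is qualified
q = 1 - p

mean_y = r / p         # expected number of interviews
var_y = r * q / p**2   # variance of the number of interviews

print(mean_y, var_y)
```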

In order to find the mean and variance of T, we have to remember a couple of rules of probability here.2678

Let me remind you what those were.2687

The expected value of AY + B, expectation is linear, it is A × E of Y + B.2688

Variance is not linear, variance of AY + B.2700

The interesting thing is that the B does not affect it at all.2704

Essentially, variance measures how much a variable wobbles.2707

If you take a variable and just move everything over, that does not change how much it wobbles.2711

There is no B in the answer, it is A² × V of Y.2718
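For anyone who wants to see why the B drops out, here is a short derivation straight from the definition of variance. It uses only the linearity of expectation already stated; the notation E and V follows the lecture.

```latex
\begin{aligned}
E(aY + b) &= a\,E(Y) + b,\\
V(aY + b) &= E\!\left[\bigl(aY + b - E(aY + b)\bigr)^2\right]\\
          &= E\!\left[\bigl(aY + b - a\,E(Y) - b\bigr)^2\right]\\
          &= E\!\left[a^2\bigl(Y - E(Y)\bigr)^2\right]\\
          &= a^2\,V(Y).
\end{aligned}
```

The b cancels in the second line of the variance calculation, which is the "moving everything over does not change the wobble" idea in symbols.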

We are going to use those two values to help us calculate the mean and the variance of T.2724

E of T, where T was 3 Y + 6; we worked that out up above there.2732

E of 3 Y + 6 which using my little formula over there is, the A is 3.2740

That is A, that is B, it is 3 E of Y + 6.2750

The E of Y was 30, that is 3 × 30 + 6 which simplifies down to 96.2759

Our units here are hours, so we get 96 hours as the expected time that2766

it will take this company to interview all these people and get 3 qualified applicants.2777

The variance, which is not really what we are looking for, we are looking for the standard deviation but it is very useful to find the variance2784

because I can just take the square root of that to find standard deviation.2794

The variance of 3 Y + 6, I’m going to use my formula for A² V of Y, that is 9 × V of Y which is 9 × 270.2798

I’m going to leave that factored because the next thing I'm going to do is take its square root.2813

It would be easier to do if it is factored.2819

Let me find the standard deviation now.2821

The standard deviation is always the square root of the variance, V of T.2826

√(9 × 270), and the reason I left it factored is that I can pull out a 3, giving 3 √270.2836

I know I can pull out another 3 because 270 is 9 × 30, so there is still a factor of 9 in there.2844

3 × 3 × √30.2850

9 √ 30, that is not going to get any better until I pull out a calculator.2853

I did throw that into my calculator and got 49.295 hours.2864
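The last few steps can be reproduced in a few lines. This is a sketch under the values already found (mean 30 and variance 270 for Y, and T = 3Y + 6); the variable names are invented for the illustration.

```python
from math import sqrt

mean_y, var_y = 30, 270   # mean and variance of Y found earlier
a, b = 3, 6               # T = a*Y + b

mean_t = a * mean_y + b   # expectation is linear
var_t = a**2 * var_y      # the constant b plays no role in the variance
sd_t = sqrt(var_t)        # same value as 9 * sqrt(30)

print(mean_t, var_t, round(sd_t, 3))
```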

That was of course an approximation; that is my standard deviation of2875

the time that the company should budget to conduct all these interviews.2882

That answers both of the questions that were posed there.2890

Let me show you where those all came from.2894

First of all, the basic parameters of this problem came from example 4.2897

If you are a little mystified as to where these numbers up here came from, just go back and look at example 4.2900

The premise of that problem was that 10% of the people we are interviewing are actually qualified.2907

We have a 10% chance of success every time we invite someone in to the office.2913

That means we have a 9/10 chance of failure.2920

9/10 of the people do not have the right skills for this particular job.2923

We have 3 job openings, that is where the R = 3 comes from.2926

The tricky part here is to set up an expression for the time to find the 3 qualified people that we are looking for.2932

If Y is the total number of people we interview then that means 3 of them are qualified.2943

3 of them are going to get that full 5 hour interview, that is where that 3 came from.2949

All the rest of the people are unqualified, that is Y -3 people are left over and2955

they are going to get the shorter interview, the 3 hour interview, before we realize that they are unqualified.2961

If you just simplify the arithmetic there, it simplifies down to 3 Y + 6.2968

We are going to have to find the expected value and the standard deviation of that.2974

I dredged up a couple of old and very useful rules on expectation and variance.2980

The expected value of AY + B, since expectation is linear, it is just A × expected value of Y + B.2986

With variance, it is not linear.2994

What happens is the B disappears and you get A² coming out.2995

The mean and the variance of Y by themselves, we will need those as a steppingstone to find the mean and variance of the time.3002

These are formulas that I just got off the third slide in this video.3011

Just check that, and you will see R/P for the mean and RQ/P² for the variance.3017

And I'm just dropping in the values of P, Q, R to get the 30 and the 270 here.3021

I have to find the mean and variance of T.3028

Remember, T is 3 Y + 6, that is where I use my old linearity formula.3032

I drop in 3 × expected value of Y + 6, that 30 came from here and that is how I get 96 hours.3038

This company should plan to spend about 96 hours, if they intend to get 3 good people to fill their 3 jobs.3049

The variance there, this 3² is 9, that is using this formula right here.3057

The 6 just plays no role at all in there because it is like this B, that just disappears.3063

We get 9 × 270.3069

The standard deviation is always the square root of the variance.3072

I do √(9 × 270) and that reduced down to a decimal approximation of 49.295 hours.3075
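If the formulas feel abstract, a simulation of the hiring process gives a rough cross-check. This is only a sketch: it assumes each applicant is independently qualified with probability 0.1, and the function name, the seed, and the number of runs are arbitrary choices for the illustration. The results will be close to 96 hours and 49.3 hours but will not match them exactly.

```python
import random
from statistics import mean, pstdev

def simulate_total_hours(rng, r=3, p=0.1, hours_unqualified=3, hours_qualified=5):
    """Interview applicants until r qualified ones are found; return total hours spent."""
    hired = 0
    hours = 0
    while hired < r:
        if rng.random() < p:        # this applicant is qualified
            hired += 1
            hours += hours_qualified
        else:                       # this applicant is not qualified
            hours += hours_unqualified
    return hours

rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [simulate_total_hours(rng) for _ in range(100_000)]

sim_mean = mean(samples)
sim_sd = pstdev(samples)
print(sim_mean, sim_sd)
```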

That wraps up this lecture on negative binomial distribution.3087

This is part of the probability series here on

My name is Will Murray, thank you very much for joining us, bye.3096