For more information, please see full course syllabus of Probability

### Geometric Distribution

- Intro 0:00
- Geometric Distribution 0:22
- Geometric Distribution: Definition
- Prototypical Example: Flipping a Coin Until We Get a Head
- Geometric Distribution vs. Binomial Distribution.
- Formula for the Geometric Distribution 2:13
- Fixed Parameters
- Random Variable
- Formula for the Geometric Distribution
- Key Properties of the Geometric Distribution 6:47
- Mean
- Variance
- Standard Deviation
- Geometric Series 7:46
- Recall from Calculus II: Sum of Infinite Series
- Application to Geometric Distribution
- Example I: Drawing Cards from a Deck (With Replacement) Until You Get an Ace 13:02
- Example I: Question & Solution
- Example II: Mean & Standard Deviation of Winning Pin the Tail on the Donkey 16:32
- Example II: Mean
- Example II: Standard Deviation
- Example III: Rolling a Die 22:09
- Example III: Setting Up
- Example III: Part A
- Example III: Part B
- Example III: Part C
- Example III: Summary
- Example IV: Job Interview 35:16
- Example IV: Setting Up
- Example IV: Part A
- Example IV: Part B
- Example IV: Summary
- Example V: Mean & Standard Deviation of Time to Conduct All the Interviews 41:13
- Example V: Setting Up
- Example V: Mean
- Example V: Variance
- Example V: Standard Deviation
- Example V: Summary

### Introduction to Probability Online Course

### Transcription: Geometric Distribution

*Hi and welcome back to the probability lectures here on www.educator.com, my name is Will Murray.*0000

*Today, we are going to learn about the geometric distribution.*0005

*It looks a lot like the binomial distribution in the initial setup because it is describing a similar situation.*0009

*I will try to make it clear how the geometric distribution is actually different from the binomial distribution.*0016

*The idea of the geometric distribution is that you have a sequence of trials.*0023

*Each one of these trials can have two outcomes, you think of those as being success or failure.*0028

*Very typically, you think of this as being a sequence of coin flips, you are flipping a coin,*0034

*but it can really be anything where there are 2 possible outcomes.*0039

*For example, each year you want to know if the Yankees are going to win the World Series.*0043

*Each year, either they win the World Series or they do not win the World Series.*0048

*There is a yes or no outcome every single time you run the trial.*0053

*The key point about the geometric distribution is that you continue the trials indefinitely until you get the first success.*0057

*For example, if you are flipping a coin, you would keep flipping a coin over and over again*0069

*until you get the first head and then you would stop.*0076

*Or if you are tracking the Yankees winning the World Series, you wait and wait and wait as many years as it takes,*0080

*until the Yankees win the World Series for the first time and then you stop.*0086

*The difference between that and the binomial distribution is that, we do not know the number of trials in advance*0091

*and we stop after we get the first success.*0098

*Remember, in the binomial distribution, we have a fixed number of trials that we decide ahead of time.*0100

*We say, I'm going to flip this coin 10 times and I will count the number of heads, that was the binomial distribution.*0106

*This is the geometric distribution when we say, I'm going to flip this coin as long as it takes until I get the first head.*0112

*That is the difference between those two.*0121
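
The contrast between the two setups can be sketched in a few lines of Python. This is only an illustration: the function names, the fixed seed, and the choice of 10 flips of a fair coin are assumptions made for the example, not part of the lecture.

```python
import random

def heads_in_fixed_flips(n_flips, p, rng):
    """Binomial setup: the number of flips is fixed in advance; count the heads."""
    return sum(1 for _ in range(n_flips) if rng.random() < p)

def flips_until_first_head(p, rng):
    """Geometric setup: keep flipping until the first head; count the flips."""
    flips = 1
    while rng.random() >= p:  # this flip came up tails, so flip again
        flips += 1
    return flips

rng = random.Random(0)  # fixed seed (an arbitrary choice) so the run is repeatable
print("heads in 10 flips:", heads_in_fixed_flips(10, 0.5, rng))
print("flips until first head:", flips_until_first_head(0.5, rng))
```

In the first function the loop length is known before any flip happens; in the second, the loop length is itself the random quantity being measured.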

*You want to be really careful, when you are studying a new situation, about whether you are talking about the geometric distribution or the binomial distribution.*0122

*I still have not given you any formulas for the geometric distribution.*0134

*The way it works is you have a fixed parameter which is the probability of success on each trial.*0137

*If you are flipping a fair coin then P would just be ½.*0144

*If you are tracking the Yankees winning World Series, and you figure that each year,*0147

*they have a 10% chance of winning the World Series then P would be 1/10.*0152

*Q is going to be the probability of failure that is always 1 – P.*0157

*Q is just dependent on P.*0162

*You do not really need to know Q ahead of time because as long as you know P, you can work out what Q is.*0165

*The random variable you are keeping track of is the number of trials that you have to take, in order to get that first success.*0170

*That is another different aspect between the geometric distribution and binomial distribution.*0178

*In the binomial distribution, Y was the number of successes that you get in total.*0182

*In the geometric distribution, Y is the number of trials it takes to get the first success.*0189

*We are ready to actually get the formula for the geometric distribution.*0197

*The probability of getting exactly Y trials is equal to $Q^{Y-1} \times P$.*0201

*Here, Y can be any whole number from 1 on up, and it can be arbitrarily large, that is why I put Y less than infinity there.*0209

*This should be fairly easy to remember because what this really represents is $Q^{Y-1}$ means you have to fail on the first Y - 1 trials.*0216

*If you want to get the first head on your 6th coin flip that means the first 5 coin flips have to come up tails.*0229

*What you are really doing here is you are failing on the first Y - 1 trials.*0238

*Your first 5 coin flips have to be tails and then this 1 power of P at the end means you succeed on the Yth trial.*0249

*If you are flipping a coin that means on the 6th time you flip a coin, you finally get a head.*0262

*There is a P probability of succeeding on the Yth trial.*0267

*Word of warning here, we have the same bad notation here that we had for a lot of our other distributions*0278

*and a lot of our other formulas in probability, which is that we are using P to represent two different things here.*0285

*This P right here and this P right here are 2 different p's.*0295

*That P, remember, is the probability of success on 1 trial, probability of success on any given trial.*0301

*If you are flipping a coin and it is a fair coin, that P is ½.*0316

*If you are rolling a dice and you are trying to get a 6 then that P would be 1/6.*0320

*This P represents the probability of Y, your random variable, having the value of little Y, the probability of Y trials overall.*0325

*That is unfortunate that people use the letter P for many different things.*0348

*It is a curse of the word probability starting with the letter P.*0353

*When you study probability, people tend to overuse letter P.*0356

*Everything is called P.*0360

*Unfortunately, you have to keep track of these, because we use lowercase P for both of them; these are 2 different uses of the letter P.*0362

*Just keep track of that.*0372

*But having kept track of that, it should be pretty easy to remember this formula for the geometric distribution*0374

*because you just remember that you keep flipping a coin until you get your first head,*0380

*which means you have to fail on all the previous tries and then succeed on try number Y.*0385

*You fail Y - 1 times, that is what the $Q^{Y-1}$ gives you, and you succeed on the very last time.*0393

*That is why there is 1 power of P to represent that final success there.*0401
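
As a minimal sketch, the formula can be written as a small Python function. The name `geometric_pmf` and the input checks are assumptions added for illustration; the body is just "fail Y - 1 times, then succeed once".

```python
def geometric_pmf(y: int, p: float) -> float:
    """P(Y = y): fail on the first y - 1 trials, then succeed on trial y."""
    if y < 1:
        raise ValueError("y counts trials, so it must be at least 1")
    if not 0 < p <= 1:
        raise ValueError("p must be a probability in (0, 1]")
    q = 1 - p  # probability of failure on a single trial
    return q ** (y - 1) * p

# Fair coin: first head on exactly the 6th flip means 5 tails, then a head.
print(geometric_pmf(6, 0.5))  # 0.015625, which is 1/64
```

Note that the argument `p` is the per-trial success probability, while the returned value is the other "P", the probability that the random variable equals `y`.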

*Let us keep going with this.*0407

*There are a couple of key properties that we need for any distribution.*0408

*I will just list them out here.*0412

*The mean, remember, that is the same as the expected value.*0414

*Mean and expected value are the same thing.*0416

*The mean or the expected value for the geometric distribution is just 1/P.*0425

*The variance for the geometric distribution is Q/P².*0430

*Remember that Q is 1 – P, so you might see that written as (1 – P)/P², those mean the same thing.*0435

*Standard deviation is always the square root of the variance.*0446

*It is just the square root of Q/P² which simplifies down to (√Q)/P.*0450

*You can also write that as √(1 - P)/P, with only the numerator under the square root; that would also be legit to say that that is the standard deviation.*0456
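
The three formulas can be checked numerically. The sketch below is an illustration, not a derivation: the function names, the test value P = 1/4, and the cutoff of 5000 terms are all assumptions. It compares the closed forms against direct sums of y · P(Y = y) and y² · P(Y = y).

```python
import math

def geometric_stats(p):
    """Return (mean, variance, standard deviation) from the closed-form formulas."""
    q = 1 - p
    return 1 / p, q / p**2, math.sqrt(q) / p

def stats_by_summing(p, n_terms=5000):
    """Approximate the same quantities by truncated sums over y = 1, 2, 3, ..."""
    q = 1 - p
    mean = sum(y * q ** (y - 1) * p for y in range(1, n_terms + 1))
    second_moment = sum(y * y * q ** (y - 1) * p for y in range(1, n_terms + 1))
    variance = second_moment - mean**2
    return mean, variance, math.sqrt(variance)

print(geometric_stats(0.25))   # closed forms: mean 4, variance 12
print(stats_by_summing(0.25))  # should agree up to rounding error
```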

*There is a very useful fact that I want you to remember from calculus*0468

*because it comes up a lot when you are doing geometric distribution problems.*0473

*That fact is the sum of an infinite geometric series.*0477

*Let me remind you of the formula for the sum of an infinite geometric series.*0481

*We covered this in calculus 2, if you do not remember this, you might want to go back and review this section of calculus 2*0485

*because we use it a lot in probability, the sum of the geometric series.*0493

*What we learned back in calculus 2 is, if you have a series A + AR + AR² + ..., that is a geometric series*0497

*because to get from each term to the next, you are multiplying by R each time.*0506

*I did not write this down on the initial slide but this only works if the common ratio has absolute value less than 1.*0514

*In this case, the sum of the geometric series is given by A/(1 – R).*0525

*I think that formula is not very useful to remember.*0534

*I think it is much more useful to remember the sum of a geometric series in words.*0538

*The way I remember it is it is the first term divided by 1 - the common ratio.*0543

*That first term was A and the common ratio is the amount you are multiplying by to get from each term to the next.*0549

*The reason I like that formula better is because it avoids a lot of special cases.*0557

*If you look in your calculus book, you might see different special cases for the sum of $AR^{N}$.*0562

*When N starts at 0, you might see 1 formula there.*0570

*You might see another formula when N starts at 1 of $AR^{N}$, a different formula.*0573

*You end up having to memorize all these different formulas depending on subtle differences in how the sum is presented.*0580

*The formula in words is always the same, no matter how you give the series.*0587

*If you remember that one, you will never go wrong.*0594

*I really like this formula, the first term / (1 - the common ratio), that is what I tend to remember.*0596

*When I apply that to adding up the sum of the geometric series, it always works.*0605
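
Here is a small sketch of the "first term over 1 minus the common ratio" rule in Python. The function names and the sample values (A = 3, R = 1/2) are assumptions chosen for illustration. The same function handles a sum starting at N = 0 or at N = 1; only the first term changes.

```python
def geometric_series_sum(first_term, ratio):
    """Sum of an infinite geometric series: first term / (1 - common ratio)."""
    if abs(ratio) >= 1:
        raise ValueError("the series only converges when |ratio| < 1")
    return first_term / (1 - ratio)

def partial_sum(first_term, ratio, n_terms):
    """Add the first n_terms terms directly, as a sanity check."""
    total, term = 0.0, first_term
    for _ in range(n_terms):
        total += term
        term *= ratio  # multiply by the common ratio to get the next term
    return total

a, r = 3, 0.5
print(geometric_series_sum(a, r))      # sum of a*r^n from n = 0: first term is a, gives 6.0
print(geometric_series_sum(a * r, r))  # sum of a*r^n from n = 1: first term is a*r, gives 3.0
print(partial_sum(a, r, 200))          # direct addition, close to the first value
```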

*Let me give you an example of how that comes up, when we are studying the geometric distribution.*0611

*A lot of times, when you are studying a probability problem,*0617

*you want to say what is the probability that it will take at least a certain number of trials to achieve success?*0621

*If I'm flipping a coin, what is the probability that I will have to flip it at least 3 times before I see my first head?*0628

*If I'm waiting for the Yankees to win the World Series, what is the probability that it will take at least 10 years for them to win the World Series,*0636

*or what is the probability that they will win sometime in the next 10 years?*0644

*The way we want to calculate that is, we want to add up all the values that are bigger than or equal to Y.*0649

*We want to add up if we want to find the probability that it is at least little Y.*0659

*We look at the probability of little Y + the probability of little Y + 1, + the probability of little Y + 2, and so on, we add that up.*0664

*I’m just going to use my formula for the geometric distribution P of Y is equal to $Q^{Y-1} \times P$.*0673

*I fill that in for P of Y and then for P of Y + 1, I just increment the exponent by 1, $Q^{Y} P$ then $Q^{Y+1} P$, and so on.*0684

*That is a geometric series.*0695

*What we are doing each time is we are multiplying each term by Q.*0697

*We got a geometric series.*0703

*I use my first term / (1 - common ratio) formula.*0705

*The first term here is $Q^{Y-1} \times P$, fill that in.*0710

*The common ratio is Q, that is why I get 1 - Q in the denominator.*0716

*1-Q is just P, that was because Q itself was 1 – P, 1 – Q is P.*0722

*That cancels with the P in the numerator and we get just $Q^{Y-1}$.*0733

*That is our probability that we will go at least Y trials until we get our success.*0741

*Another way to think about that is that in order for the experiment to last for at least Y trials,*0748

*say 10 trials, it means that you have to fail for the first 9 times that you run the experiment.*0754

*If you are going to run an experiment for 10 trials or more, that means you have to fail the first 9 times.*0762

*Q was the probability of failure, you have to fail for the first Y - 1 times, that is why you get $Q^{Y-1}$ there.*0771
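
The "at least Y trials" result can be checked by brute force. In this sketch the function names and the 2000-term cutoff are assumptions for illustration; the values P = 1/10 and Y = 10 echo the World Series example of waiting at least 10 years.

```python
def prob_at_least(y, p):
    """P(Y >= y): the first y - 1 trials must all be failures."""
    return (1 - p) ** (y - 1)

def prob_at_least_by_summing(y, p, n_terms=2000):
    """Add P(y) + P(y + 1) + P(y + 2) + ... for n_terms terms."""
    q = 1 - p
    return sum(q ** (k - 1) * p for k in range(y, y + n_terms))

closed_form = prob_at_least(10, 0.1)
brute_force = prob_at_least_by_summing(10, 0.1)
print(closed_form, brute_force)  # both about 0.387
```

The two numbers agree up to floating-point rounding, which is what the geometric series argument predicts.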

*Let us see how this plays out in some examples.*0781

*In the first example, we are going to draw cards from a deck until we get an ace.*0784

*We got a 52 card deck here, we just start pulling out cards until we get an ace.*0789

*A key question you should always ask about selection like this is do we replace the cards back in the deck before we draw the next one?*0794

*What I have told you in the problem is that we do replace them, and that really changes the answer.*0803

*You want to be very sure that you understand in probability questions, are you drawing with replacement or without replacement?*0809

*We are going to draw until we get an ace.*0818

*We are being asked what is the chance that we will draw exactly 3 times here?*0819

*Let us think about that.*0826

*This is the geometric distribution because we are doing a sequence of independent trials.*0827

*We are drawing a card, putting it back, trying another card, or possibly the same card.*0832

*Putting it back, drawing again, putting the card back.*0836

*We want to keep doing this until we get an ace and then we stop.*0839

*We want to find out how many times we are going to draw until we get that first ace?*0844

*Let us first of all figure out what our parameter is.*0850

*Our P is our probability of getting an ace when we draw a card from a 52 card deck.*0853

*There are 4 aces in there and there are 52 cards total, that is 1/13.*0861

*We have a 1/13 chance of success anytime we draw a card.*0867

*Q is always 1 – P, that is 12/13.*0871

*That is the chance we will fail on any particular drawing.*0876

*We have a 12/13 chance of getting something other than an ace.*0880

*Let me write down the formula for the geometric distribution.*0886

*P of Y is $Q^{Y-1} \times P$.*0890

*In this case, our Y is equal to 3 because we want to know the chance that we will have to draw exactly 3 times, P of 3.*0895

*In this case, Q is 12/13, I will fill that in, 12/13.*0905

*Y -1 is 3 -1 that is 2, P is 1/13.*0911

*I think I can simplify that a bit, it is 12²/13³.*0920

*I do not think I’m going to go ahead and multiply those out because I do not think the numbers will get any more illuminating there.*0925

*I will leave that as my answer.*0933

*You could multiply that out and get a decimal, if you wanted to.*0936

*Of course, it will be a number between 0 and 1, a fairly low number.*0939

*I do not think it will be very revealing, the decimal that you get.*0944

*That is our answer for example 1 here.*0949

*Just to recap how we did that, we figured out this is a geometric distribution*0953

*because we are running independent trials until we get a success.*0957

*The probability of success is, there are 4 aces out of 52 cards, it is 1/13.*0961

*Q was 1 – P, that is 12/13.*0968

*I will just drop those into my probability distribution formula.*0971

*The Y here is 3 because we want to draw exactly 3 times until we get an ace.*0975

*I filled in Q, I filled in Y, or Y - 1, and I filled in P, and then I simplified that down to get an answer.*0982
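
For anyone who does want the number, here is a short sketch using Python's `fractions` module so the arithmetic stays exact. The variable names are assumptions made for the example.

```python
from fractions import Fraction

p = Fraction(4, 52)  # 4 aces out of 52 cards, reduces to 1/13
q = 1 - p            # 12/13, chance of drawing something other than an ace
y = 3                # we want the first ace on exactly the 3rd draw

answer = q ** (y - 1) * p
print(answer)         # 144/2197, that is 12²/13³
print(float(answer))  # about 0.0655
```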

*In example 2, we have the Akron Aardvarks and they are competing in the northwestern championship of pin the tail on the donkey.*0993

*It looks like they have a 10% chance of winning the championship each year.*1002

*We want Y to be the number of years until they next win.*1007

*We are sitting there in Akron and we are hoping that the Aardvarks are going to bring home the trophy this year.*1011

*We want to know how many years we might have to wait until we see that trophy.*1018

*We want to find the mean and standard deviation of that.*1023

*This is a geometric distribution once again, because we are waiting for the first success.*1025

*We are waiting for our home team to bring home that first trophy.*1030

*Our P, that is our probability of them winning in any given year, it is 10%.*1036

*I will write that as 1/10.*1042

*Our Q, the probability of failure, of them not bringing home the trophy, that is 1 – P.*1044

*Q is always 1 – P, that is 9/10, 1 - 1/10.*1051

*I gave you the expected value formula which is the same as the mean in one of the earlier slides of this lecture.*1059

*It is always 1/P, that is 1/(1/10) and that is exactly 10.*1068

*We want to express that in years.*1077

*That is not at all surprising, that is one of the more intuitive results in probability.*1079

*The mean of the geometric distribution, if they have a 10% chance of bringing home the trophy in any given year*1087

*then on average, you expect to wait about 10 years until you bring home a trophy.*1097

*That is not surprising, about once every 10 years, they will bring home a trophy.*1104

*If you sit down and wait, on average it will take you about 10 years.*1110

*You will be waiting about 10 years to see them bring home a trophy.*1114

*The sigma² is the variance.*1117

*The first thing you figure out there was the mean.*1121

*Now, we are going to figure out the variance.*1124

*That is not what the problem is asking.*1126

*The problem is actually asking for standard deviation but the standard deviation is the square root of the variance.*1128

*Once you figure out the variance, it is very easy to find the standard deviation, we just take the square root.*1134

*Let us find the variance first.*1139

*The variance, sigma², is Q/P², that is 9/10 divided by P², where P is 1/10.*1141

*P² is 1/100.*1153

*If I do the flip on the fraction then I get 100 × 9/10.*1156

*100/10 cancels to 10, I get 10 × 9 which is 90.*1163

*The standard deviation, once you found the variance that is very easy.*1168

*You just take the square root of the variance, the √ 90.*1177

*I can simplify that a little bit, I can pull a 9 out of the square root and it will turn into a 3.*1183

*I get 3 √10 and I just threw that into my calculator before I started this.*1189

*What I came up with was 3 √10 is 9.487.*1198

*Our units there are years.*1204

*The standard deviation in the waiting time is 9.487 years.*1208

*That just means if you are waiting for Akron to bring home a trophy in pin the tail on the donkey,*1214

*you expect to wait about 10 years on average but the standard deviation is almost another 9 1/2 years on either side.*1220

*You could easily be waiting 20 years, for example, before they bring home their first trophy.*1230

*Let me remind you how we did each step there.*1237

*First identify that this was a geometric distribution because we are waiting for them to bring home their first trophy.*1240

*We are waiting for the first success in a sequence of trials.*1246

*Each year they go out, they play the championships, there is a 10% chance they win everything.*1250

*That 10% is actually the probability of success, that is why I get the P = 1/10, that came from the 10% there.*1258

*The Q is always 1 – P, I got the 9/10.*1269

*I just read off the mean and variance, and standard deviation formulas from one of the earlier slides in this lecture.*1272

*You can scroll back and you will see where those formulas come from.*1281

*The mean is 1/P that is 10 years.*1284

*We expect to wait about 10 years before we are going to bring home a trophy.*1287

*Very intuitive result, by the way.*1291

*The variance is Q/P², I filled in my Q and my P, simplified that down to 90.*1293

*The units on that would be years² which really is not very meaningful.*1300

*I did not write those in.*1305

*In fact, what we are looking for is the standard deviation.*1306

*That is the square root of the variance, you always find the standard deviation by taking the square root of the variance.*1310

*You always can find it that way.*1317

*That simplified down to 3 √10 and I made a decimal of that, 9.487 years is the standard deviation there.*1320
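
The same arithmetic can be reproduced in a few lines of Python. This is a sketch with assumed variable names; `Fraction` keeps the mean and variance exact, and only the square root is a decimal approximation.

```python
import math
from fractions import Fraction

p = Fraction(1, 10)  # 10% chance of winning in any given year
q = 1 - p            # 9/10 chance of not winning

mean_years = 1 / p                 # expected wait, in years
variance = q / p**2                # in years squared
std_years = math.sqrt(variance)    # back in years

print(mean_years, variance, round(std_years, 3))  # 10 90 9.487
```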

*In example 3 here, we are going to make things a little more complicated.*1331

*We are setting up a game between you and your friend and it is a fairly simple game but it will give us some interesting probability.*1334

*You just take turns rolling a dice and you get to roll first and then your friend rolls, and then you roll, then your friend rolls.*1343

*Whoever rolls a 6 wins and I guess the other person has to pay them some money or something.*1350

*You are both trying to roll a 6.*1357

*If you roll a 6 right away, then you win.*1359

*If you fail to roll a 6 and then your friend rolls a 6, then your friend wins.*1362

*And then, you just keep going back and forth until the first person rolls a 6.*1366

*We want to ask several questions about this.*1372

*What is the chance that you will win on your third roll?*1376

*In exactly your third roll, you win the game.*1379

*What is the chance that your friend will get to roll 3 times or more?*1382

*What is the chance that you will win overall and that includes you winning on the first roll, maybe on your second roll, third roll, and so on.*1387

*Several different questions here.*1394

*This is a geometric distribution because if you can think of you and your friend rolling a dice together,*1396

*you are going to keep rolling the dice until the first 6 comes up, whether it is you or your friend rolls the 6,*1404

*you are going to keep rolling the dice until the first 6 comes up.*1409

*And then, the game is over, you make the payoff or somebody has to wash the dishes.*1412

*You do not roll it anymore.*1419

*That is definitely a geometric distribution because you are waiting for the first success.*1420

*This is geometric, probability is the chance of getting a 6 on any given roll, that is 1/6.*1426

*Our Q is 1 – P, that is 5/6.*1436

*Let us go ahead and answer these 3 questions.*1448

*I did not really leave myself space on the slide, so hang on to this because I’m going to go to the next slide and answer the questions.*1450

*The first question is the chance that you will win on your 3rd roll.*1459

*Let me remind you that you get to go first and you roll first and then your friend goes, and then you roll and then your friend goes,*1464

*and then you roll and so on, like that.*1469

*If you are going to win on your third roll, that means that we have to get that 6 on the 5th turn of the game.*1476

*That is exactly what would happen, in order for you to win on your 3rd roll of the game.*1487

*What we are really asking here is the probability that Y is equal to 5.*1495

*Let me remind you of the formula for probability distribution.*1501

*For geometric distribution, P of Y is $Q^{Y-1} \times P$.*1505

*We already said on the previous slide, we already said that our P for this game is 1/6*1514

*because that is probability of getting a 6 on any particular roll.*1520

*Our Q is just 1 – P, that is 5/6.*1524

*In this case, P of 5 is $Q^{Y-1} \times P$; Q is 5/6 and Y is 5 here, so Y - 1 is 4, and that is (5/6)⁴ × P, where P is 1/6.*1527

*I can simplify that down a little bit into 5⁴/(6⁴ × 6), which is 5⁴/6⁵.*1541

*That is the probability that you will win on exactly your 3rd roll of the game.*1549

*I did not find the decimal for that, it will be a pretty small number because*1554

*it is not very likely that you will win exactly on the 3rd roll of the game.*1558

*What is the probability that your friend will roll 3 times or more?*1562

*What that is really asking is, for your friend to roll 3 times or more, the game has to reach the 6th roll.*1568

*In order to get to that turn, you have to have 6 rolls or more.*1579

*What we are really asking here is the probability that Y is greater than or equal to 6.*1587

*We worked that out using a geometric series on one of the earlier slides.*1597

*That is the probability, let me just remind you the probability that Y*1602

*is greater than or equal to any particular value of little Y is $Q^{Y-1}$.*1609

*We worked that out using a geometric series, you can also think about that as you have to fail Y - 1 times,*1617

*in order to get the success on the yth turn or later.*1626

*In this case, our Q is 5/6.*1632

*Our Y -1 is 6 -1, that is 5.*1638

*That does not really simplify so I’m just going to leave that in that form.*1644

*That is our chance that the game will run 6 turns or more in total, which will give your friend a chance to roll 3 times or more.*1648

*What is the probability that you will win this game?*1661

*That means you can win on any turn.*1665

*This is going to be a little more complicated.*1667

*The probability that you will win means that the game ends either on the first turn or that it ends on the third turn*1669

*because if it ends on the third turn that means you rolled last and you win, or it ends on the 5th turn and so on.*1680

*It really means what is the probability that this game is going to go an odd number of turns.*1686

*The probability of 1 + the probability of 3, we will fill in each of these, + the probability of 5, and so on.*1694

*That is the probability that you will win this game and that the game will be won on either the first turn,*1702

*or the 3rd turn, or the 5th turn, or so on.*1709

*Let me fill that in.*1712

*The probability of 1, for each one of those, I'm going to use my geometric distribution formula.*1714

*$Q^{Y-1} \times P$, I’m just going to leave it in terms of P and Q for now.*1721

*I will fill in what P and Q are in a moment.*1728

*$Q^{Y-1} P$ when Y is equal to 1, that is $Q^{0} P$, that is P, + when Y = 3, that is Q²P.*1732

*Y = 5 will give us Q⁴P, Y = 7, I will fill in one more, + Q⁶P, and so on.*1746

*What I see here is a geometric series, this is a geometric series with,*1760

*I want to figure out what the common ratio is between each term.*1777

*I see that to get from the first term to the next one, I’m multiplying by Q².*1780

*To get from that term to the next one, I’m multiplying by Q² and so on.*1786

*It is Q² every time, that is why it is a geometric series.*1791

*My common ratio is Q².*1794

*I can use my formula for the sum of the geometric series.*1798

*Remember my formula, the best way to remember it is in words, first term/ 1 - the common ratio.*1804

*In this case, the first term is P, the common ratio is Q², so we get P/(1 - Q²).*1818

*I think I’m going to fill in the actual numbers for the P and the Q².*1826

*P, where was that, it is up here, is 1/6.*1831

*For 1 - Q², Q was 5/6.*1838

*Q² is 25/36.*1842

*I want to simplify that a bit, we will multiply top and bottom by 36.*1846

*Let me separate those.*1853

*Top and bottom by 36 there, so 36 × the top 36 × the bottom.*1855

*I will simplify things on the bottom.*1860

*That will give me a 6 in the top and in the bottom I will get 36 -25 which simplifies down to 6/11.*1863

*That is your chance of winning the game.*1877

*Your chance of winning the game is 6 /11.*1879

*Notice, by the way, that is a little bit more than ½.*1882

*6/12 would be ½, 6/11 is slightly more than ½.*1885

*That makes complete sense because you have a slight advantage in this game.*1892

*The advantage is that you got to roll first.*1898

*You are a little more likely to get a 6 before your friend is.*1900

*It is not a big advantage because 6s are not all that likely when you roll the dice.*1906

*In the long run, it would not really make a difference who got to roll first.*1910

*But you get to pick up a slight advantage by rolling first and that is why 6/11 is slightly bigger than ½.*1914

*Let me go over the steps again, make sure that everybody was able to follow them.*1923

*The key thing here is we are playing this game where you roll and then your friend rolls, and then you roll and then your friend rolls.*1927

*You want to keep track of the turns of the game that is sort of odd-even, odd-even, between you and your friend.*1934

*If you are going to win on your third roll, your third roll is the 5th roll of the game.*1941

*That is why I put P of 5 here and then I use my formula for the geometric distribution to get $Q^{Y-1} \times P$.*1947

*The Y was 5, and then I just simplified that down to 5⁴/6⁵.*1956

*If your friend is going to roll 3 times or more, that means your friend has to get to his third roll.*1965

*His third roll is the 6th roll of the game that is why we are looking at the probability that Y is greater than or equal to 6.*1972

*We worked out the probability generically for the geometric distribution on one of the earlier slides.*1981

*You can go back and check it out, if you have not watched it recently but that probability is $Q^{Y-1}$.*1986

*Another way to think about that, we worked it out using geometric series before but you can also think about that as,*1994

*in order for the game to last 6 or more turns, it means we have to fail on the first 5 turns.*2001

*Fail means we have to not roll a 6 on the first 5 turns.*2008

*The chance of not rolling a 6 is 5/6 and to last 6 turns, we got to roll something else 5 times in a row.*2014

*That is where that answer comes from.*2024

*Finally, we are asking what is the probability that you will win?*2026

*You get to roll on all odd turns, the first, the third, the fifth, and so on.*2031

*We are asking what is the probability that the game is won on the first turn or on the third turn, or on the fifth turn?*2035

*Adding up those probabilities and each one of those, I use my geometric distribution formula.*2045

*The cool thing is that when I wrote all those out, I noticed that I had a geometric series.*2052

*I had a common ratio of Q² that let me use my geometric series formula.*2058

*By the way, I reminded you of that, the geometric series formula earlier on in this lecture.*2064

*You scroll back a couple of slides, you will see that.*2069

*The first term / (1 - common ratio), the first term is P, the common ratio was Q².*2072

*Then I filled in the numbers for P and Q, that came from up here.*2078

*Filled in the numbers for P and Q down here, did a little bit of simplifying with the fractions and it came down to 6/11.*2083

*I did not convert that into a decimal but one thing I know for sure is that that is a little bit bigger than 50%,*2090

*which makes sense because you get to roll first, you are a little bit more likely than your friend to win this game.*2098

*It is a very plausible answer that it can come out to be a little bit over 50%.*2106
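
All three parts of the dice game can be reproduced with exact fractions. The sketch below uses assumed variable names, and the final check (adding up 200 odd-numbered turns in floating point) is an extra sanity test, not something done in the lecture.

```python
from fractions import Fraction

p = Fraction(1, 6)  # chance of rolling a 6 on any given turn
q = 1 - p           # 5/6, chance of not rolling a 6

# Part A: you win on your 3rd roll, which is turn 5 of the game.
win_on_third_roll = q ** (5 - 1) * p
print(win_on_third_roll)  # 625/7776, that is 5⁴/6⁵

# Part B: your friend rolls 3 times or more, so the game lasts at least 6 turns.
friend_rolls_three_times = q ** (6 - 1)
print(friend_rolls_three_times)  # 3125/7776, that is (5/6)⁵

# Part C: you win overall, a geometric series with first term p and ratio q².
you_win = p / (1 - q ** 2)
print(you_win)  # 6/11

# Sanity check: add P(1) + P(3) + P(5) + ... directly for the first 200 odd turns.
odd_turn_sum = sum(float(q) ** (y - 1) * float(p) for y in range(1, 400, 2))
print(odd_turn_sum, float(you_win))  # the two values should agree closely
```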

*Let us move on to example 4, here we have a company that is interviewing applicants for jobs.*2116

*They have a job opening and they are interviewing their applicants.*2122

*In the general population, 10% of the applicants actually possess the right skills.*2127

*Maybe they have to have knowledge of a certain computer application, for example.*2132

*Or they have to have studied probability.*2137

*Only 10% of applicants for this job actually are qualified for the job.*2139

*The company is just going to interview people over and over again,*2144

*until they find 1 person who is qualified for the job and then they are going to hire that person.*2147

*We are asking here the probability that they will interview exactly 10 applicants*2154

*which essentially means that the first 9 people will be bombs and the 10th person is that qualified person,*2159

*and they are going to hire the 10th person.*2166

*Part B here, we are going to calculate the probability that they will interview at least 10 applicants.*2171

*Maybe, they will have to interview 50 people before they find the perfect person for that job but it is at least 10.*2177

*This is a geometric distribution because we have a sequence of trials, ultimately ending in 1 success.*2184

*As soon as we get 1 success, as soon as we find 1 person who is qualified, we hire that person and then we send everybody else away.*2192

*We stop the interview process right there.*2200

*This is a geometric distribution, let me fill in my parameters here.*2203

*The probability of any given person being a worthy applicant for the job is 10%, that is 1/10.*2208

*The Q is always 1 – P, in this case that is 9/10.*2218

*Let me write down some of the formulas that we had earlier on in the lecture because those would be useful for this.*2224

*P of Y is Q^(Y-1) × P and the probability that Y is greater than or equal to any particular value is just Q^(Y-1).*2231

*Let us go ahead and work that out.*2247

*In this case, for part A, we want to know what is the probability that we will interview exactly 10 applicants?*2250

*I have 9 failures and the 10th person is the perfect person for the job.*2257

*That is the probability of getting exactly 10 and from my formula, my Q^(Y-1) × P formula, that is Q⁹ × P, where our Q was 9/10.*2263

*Let me raise that to the 9th power, multiply it by a single power of P, 1/10, and that is 9⁹/10⁹ with another 10 in the denominator, so 9⁹/10¹⁰.*2279

*I did not try to find the decimal for that, it would again be quite small.*2296

*If you plug that into your calculator, it should be a very small decimal*2299

*because it is quite unlikely that we would have to interview exactly 10 applicants.*2304

*Most likely, we will find somebody earlier than that or probably later than that.*2309

*Let us find the probability that they will interview at least 10 applicants.*2316

*That is the probability that Y is greater than or equal to 10.*2321

*Y is the number of people that we interview.*2327

*Using our formula up here, the Q^(Y-1), that is Q⁹ which is (9/10)⁹.*2330

*I did not simplify that but it will be bigger, 10 times bigger than our previous answer.*2344

*That is our probability that we will interview at least 10 applicants.*2360

*That is our answer for both of these.*2366
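
As a quick check of both parts, here is a minimal sketch that evaluates the two formulas with P = 1/10. The helper names `geometric_pmf` and `geometric_at_least` are made up for illustration; they are not from the lecture.

```python
from fractions import Fraction

def geometric_pmf(y, p):
    """P(Y = y) = q^(y-1) * p: y - 1 failures followed by one success."""
    q = 1 - p
    return q**(y - 1) * p

def geometric_at_least(y, p):
    """P(Y >= y) = q^(y-1): the first y - 1 trials all fail."""
    return (1 - p)**(y - 1)

p = Fraction(1, 10)  # 10% of applicants are qualified

exactly_10 = geometric_pmf(10, p)
at_least_10 = geometric_at_least(10, p)

print(exactly_10, float(exactly_10))    # 9^9 / 10^10, about 0.0387
print(at_least_10, float(at_least_10))  # (9/10)^9, about 0.387
```

The second value is exactly 10 times the first, which matches the remark in the lecture.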

*By the way, we are going to hang onto this example for the next problem.*2368

*Do not let these numbers and the whole situation completely slip your mind.*2373

*In the meantime, let me quickly remind you where everything came from here.*2377

*In part A, we want to find the probability that they will interview exactly 10 applicants.*2382

*It is a geometric distribution, so I am using my geometric distribution formula right here, Q^(Y-1) × P.*2387

*Our Y here is 10 and we have Q⁹ × P.*2397

*The values of P and Q, I got the P from this 10% right here,*2404

*that is the probability that any given applicant will be a success and Q is just 1 – that, that is 9/10.*2409

*Drop those numbers in and simplify it down.*2416

*In part B, we want the probability of interviewing at least 10 applicants.*2419

*At least 10 means the probability that it will be greater than or equal to 10.*2425

*Using the formula we derived back in one of the earlier slides, several slides ago, the beginning of this video, it is Q^(Y-1), that is Q⁹.*2430

*The way to think about that is, in order to interview at least 10 applicants,*2443

*that means the first 9 applicants are failures.*2448

*Each one of those 9 people has a 9/10 chance of being a failure, that means we have to see 9 failures in a row,*2451

*in order to ensure that we end up talking to at least 10 people.*2459
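
The same tail formula also follows from the geometric series. The derivation below is a sketch using lowercase p, q, and y for the quantities called P, Q, and Y in the lecture:

```latex
P(Y \ge y) \;=\; \sum_{k=y}^{\infty} q^{\,k-1} p
          \;=\; p\,q^{\,y-1} \sum_{j=0}^{\infty} q^{\,j}
          \;=\; \frac{p\,q^{\,y-1}}{1-q}
          \;=\; q^{\,y-1},
\qquad\text{since } 1 - q = p .
```

With q = 9/10 and y = 10 this gives (9/10)⁹, in agreement with the "9 failures in a row" argument.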

*Like I said, hang onto these numbers for the next example because*2463

*we are going to use the same scenario for the next example, for example 5.*2468

*In example 5, we are going to look back at the company from example 4.*2475

*If you have not just watched example 4, you really need to go back and read that one before example 5 will make sense.*2479

*Checkout example 4, there was a company interviewing applicants for a job opening and*2489

*each applicant has a 10% chance of being selected.*2495

*We interview and interview until we find a good one and then we keep that person, and we stop interviewing.*2499

*What we are doing in example 5 is we are keeping track of how long this procedure will take.*2507

*Apparently, it takes 3 hours to interview an unqualified applicant and 5 hours to interview a qualified applicant.*2512

*All of these people that do not meet the qualifications are going to take 3 hours each for us to figure out*2531

*that these people are actually bombs and do not deserve to be here.*2538

*And then, we finally get somebody that we think is qualified, we are going to interview them*2543

*for an extra 2 hours just to make sure that they really are the right person for this job.*2547

*We want to calculate the mean and the standard deviation of the time to conduct all the interviews.*2551

*How long do we expect this interview process to take at this company?*2557

*Let me show you how to set that up, we have not really seen a problem like this before.*2563

*This is our first one.*2567

*Let me set up a variable that represents time here, T is going to be the time.*2571

*Remember, Y is the number of applicants that we speak to.*2578

*Remember, the deal here is we are going to keep interviewing until we find somebody good.*2592

*That means if we find somebody good on the 16th try, that means we interviewed 15 people*2600

*who did not measure up and then number 16 was the good one.*2606

*In general T is, we have all the people who do not measure up, there is Y -1 of them.*2610

*Each one of those people cost us 3 hours each.*2618

*They cost us 3 hours to find out that those people did not actually deserve the job.*2622

*The last person, the person who is good that we actually want to give the job to,*2627

*we have to do some extra scrutiny on that person.*2634

*It took us 5 hours to interview her because we wanted to make extra sure that she was really qualified for the job.*2637

*The total time is 3 × (Y - 1) + 5.*2644

*We can simplify that a bit, that is 3 Y - 3 + 5 which is 3 Y + 2.*2648

*That is the total time that it takes and we want to find the expected value, the mean of that, and the standard deviation.*2661

*Let me calculate first the expected value and the variance of Y because those are going to be useful intermediate steps.*2669

*Let me remind you here what our P was.*2680

*Our P was 1/10 for this problem.*2683

*That is because 10% of the applicants have the right qualifications.*2687

*Our Q is always 1 – P, it is 9/10 here.*2693

*Our expected value of Y, what we learned at the beginning of this lecture is that it is always 1/P.*2702

*In this case, it is always 1/P.*2712

*1/(1/10) is just 10.*2718

*Let me go ahead and find variance of Y because that is going to be useful as a steppingstone to finding the standard deviation.*2724

*The variance of Y is always Q/P².*2732

*Again, it is coming from one of the first slides in this lecture.*2736

*You can scroll back and you can find that.*2740

*In this case, the Q is 9/10, P² is 1/100.*2743

*If we do a flip on the denominator, flip that up, we get 900/10, that is 90.*2750

*That is the variance of Y.*2758
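
The formulas 1/P and Q/P² can be sanity-checked numerically. The sketch below compares them with a truncated sum over the geometric probabilities; the cutoff of 2000 terms is an arbitrary choice made here, large enough that the neglected tail is negligible for P = 1/10.

```python
from fractions import Fraction

p = Fraction(1, 10)
q = 1 - p

mean_y = 1 / p    # formula: E(Y) = 1/p
var_y = q / p**2  # formula: V(Y) = q/p^2

# Numerical check: sum y * P(Y = y) and y^2 * P(Y = y) over y = 1..2000.
ys = range(1, 2001)
weights = [float(q)**(y - 1) * float(p) for y in ys]
approx_mean = sum(y * w for y, w in zip(ys, weights))
approx_second_moment = sum(y * y * w for y, w in zip(ys, weights))
approx_var = approx_second_moment - approx_mean**2

print(mean_y, approx_mean)  # both close to 10
print(var_y, approx_var)    # both close to 90
```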

*What we really want is the mean and standard deviation of T.*2761

*Let me go ahead and figure those out for T.*2766

*We want the expected value of T, and that is the expected value of 3Y + 2.*2769

*T was 3 Y + 2 and it is time to remember some properties of expectation.*2774

*In particular, expectation is linear.*2784

*You can write this as 3 × E of Y + 2.*2786

*We can pull the 2 out because expectation is linear.*2792

*This is 3 × E of Y was 10, 3 × 10 + 2 is 32.*2795

*That is the expected amount of time to conduct all these interviews.*2805

*Our unit here is hour, let me go ahead and fill that in.*2809

*32 hours is the expected amount of time to conduct all these interviews.*2812

*The variance is more complicated and I want to remind you of an old rule in probability.*2821

*The variance of AY + B is equal to A² × the variance of Y.*2828

*There is no B in the answer.*2840

*That is an old rule in the probability that is very useful and we are going to invoke it right here, the variance of AY + B.*2842

*The B does not affect it, that is shifting all the data over.*2849

*It does not affect how much they vary.*2854

*You get A² × V of Y.*2856

*The variance of T here is the variance of 3 Y + 2 which is now, if I use my A is equal to 3*2858

*and my B is equal to 2 then I get A² × V of Y.*2872

*Here A² is 9, so 9 × the variance of Y.*2877

*I figured out the variance of Y up here was 90.*2882

*That is 9 × 90 which is 810, that is the variance.*2885

*It is not the standard deviation, we are trying to find the standard deviation.*2892

*This was variance, this is the mean that we found up above here.*2896

*What I really want is the standard deviation.*2903

*The standard deviation is the square root of the variance of T which is √810.*2909

*I can factor 81 out of that, it is a perfect square so I get 9 × √10.*2920

*I did calculate the decimal for that, I calculated 9 √10 is about 28.46.*2928

*Since, this is a standard deviation my units there would be hours.*2939

*If you are a company and you are planning this interview process, you know that on average,*2946

*about 1 in 10 applicants is going to have the right skills.*2951

*On average, it will take about 32 hours to find the right employee for the job.*2954

*The standard deviation on that figure is 28.46 hours.*2962

*I guess that was an approximation, I should be clear that that was a calculator approximation right there.*2969
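
The whole calculation for T can be reproduced in a few lines. This is a sketch under the setup of this example (3 hours per unqualified applicant, 5 hours for the qualified one, P = 1/10); the names `total_time`, `a`, and `b` are chosen here for illustration.

```python
from fractions import Fraction
import math

def total_time(y):
    """Hours spent if the y-th applicant is the first qualified one."""
    return 3 * (y - 1) + 5  # same as 3*y + 2

p = Fraction(1, 10)
q = 1 - p
a, b = 3, 2  # T = a*Y + b

mean_y = 1 / p
var_y = q / p**2

mean_t = a * mean_y + b  # linearity: E(aY + b) = a E(Y) + b
var_t = a**2 * var_y     # V(aY + b) = a^2 V(Y); b drops out
sd_t = math.sqrt(var_t)  # standard deviation, in hours

print(mean_t)            # 32
print(var_t)             # 810
print(round(sd_t, 2))    # 28.46
```

The value 28.46 is a rounded decimal for 9√10, as noted in the lecture.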

*Let me show you how I did that.*2977

*It was one of the trickier problems here.*2979

*What I want to do was write a formula for the amount of time it takes to conduct all the interviews.*2981

*Think of Y as being the number of applicants and T is the time we spend on them.*2989

*What we are doing is all the unqualified people that we speak to, we spent 3 hours each on them.*2995

*Remember, the last person to talk to is the person we hire.*3002

*As soon as we find a good person, we hire her.*3006

*That means all the previous people were unqualified, that is Y -1 people.*3010

*We spend 3 hours on each that is why we have Y - 1 there.*3017

*The last person cost us 5 hours because we want to give that last person some extra scrutiny*3021

*and make sure that that person really is qualified for the job.*3029

*We get 3 × (Y - 1) + 5, which simplifies down to 3 Y + 2.*3032

*We want to find the mean, and the variance, and standard deviation of T, which is 3 Y + 2.*3038

*To find that, we need the mean and variance of Y itself.*3044

*Using those formulas that I gave you on one of the first slides in this lecture, the mean and the variance are 1/P and Q/P².*3050

*The P was 1/10, that was coming from example 4, the previous slide.*3061

*Because 10% of the applicants are qualified, your chance of getting a qualified applicant is 1/10.*3067

*The Q is just 1 - P, which is 9/10.*3076

*We drop those numbers in here and you get the mean and variance of Y as being 10 and 90.*3078

*To find the mean and the variance of T, to find the mean, we are going to use linearity.*3086

*The mean is linear, it is 3 × the mean of Y + 2.*3093

*3 × 10 + 2 is 32 hours.*3096

*The variance is not linear, we have this rule right here which tells us what to do with linear expressions in the variance.*3100

*The variance of 3 Y + 2, the 2 is the B but it turns out that that had no effect on the answer*3108

*because there is no B up here, it is just A² × V of Y.*3118

*It is 9 × the variance of Y, 9 × that is our 90 right here, it is coming in here, we get 810.*3121

*And the standard deviation is always the square root of the variance.*3132

*We take √810 and I simplified that down and found the decimal to be 28.46 hours.*3135

*That is our mean and our standard deviation of the time to conduct all these interviews,*3145

*if you are in this company planning for how long it might take to fill your next job opening.*3151

*That is our last example, that wraps up the geometric distribution.*3158

*You are watching the probability lectures here on www.educator.com.*3163

*My name is Will Murray, thank you very much for watching, bye.*3167

1 answer

Last reply by: Dr. William Murray

Sun Jan 4, 2015 7:30 PM

Post by Carlos Morales on January 3, 2015

If you are dealing one card at a time from a shuffled deck. How many cards would it take before an Ace would come up (no replacement). I tried replacing P(y)=1 in Exercise 1 but my answer makes no sense.

1 answer

Last reply by: Dr. William Murray

Fri Sep 5, 2014 12:44 PM

Post by Ikze Cho on September 3, 2014

how would one figure out the probability of winning in less than y trials?