For more information, please see full course syllabus of Probability

### Covariance, Correlation & Linear Functions

- Intro 0:00
- Definition and Formulas for Covariance 0:38
- Definition of Covariance
- Formulas to Calculate Covariance
- Intuition for Covariance 3:54
- Covariance is a Measure of Dependence
- Dependence Doesn't Necessarily Mean that the Variables Do the Same Thing
- If Variables Move Together
- If Variables Move Against Each Other
- Both Cases Show Dependence!
- Independence Theorem 8:10
- Independence Theorem
- The Converse is Not True
- Correlation Coefficient 9:33
- Correlation Coefficient
- Linear Functions of Random Variables 11:57
- Linear Functions of Random Variables: Expected Value
- Linear Functions of Random Variables: Variance
- Linear Functions of Random Variables, Cont. 14:30
- Linear Functions of Random Variables: Covariance
- Example I: Calculate E (Y₁), E (Y₂), and E (Y₁Y₂) 15:31
- Example II: Are Y₁ and Y₂ Independent? 29:16
- Example III: Calculate V (U₁) and V (U₂) 36:14
- Example IV: Calculate the Covariance Correlation Coefficient 42:12
- Example V: Find the Mean and Variance of the Average 52:19

### Transcription: Covariance, Correlation & Linear Functions

*Hi, welcome back to the probability lectures here on www.educator.com.*0000

*We are working through a chapter on Bivariate distribution functions and density functions, *0004

*which means that there are two variables, there is a Y1 and Y2.*0010

*In this section, we are also going to have sometimes more than two variables; there might be N variables.*0016

*We got a big section to cover today, it is going to cover covariance and correlation coefficient, *0022

*and linear functions of random variables.*0029

*I will be your guide today, my name is Will Murray, let us jump right on in.*0032

*The main idea of this section is covariance, the correlation coefficient is not something that is quite as important.*0040

*Let me jump right on in with the covariance.*0046

*The definition of covariance is not necessarily very enlightening.*0048

*Let me go ahead and show you the definition, but then I’m going to skip quickly to some formulas *0054

*that are probably more useful in dealing with covariance.*0059

*In the next slide, I will try to give you an intuition for what covariance means.*0063

*The definition of the covariance is that, you will have to start with two random variables.*0068

*You always have a Y1 and Y2.*0075

*You always talk about the covariance of two random variables at once.*0078

*By definition, it means the expected value of (Y1 - μ1) × (Y2 - μ2).*0082

*Here, μ1 and μ2 are the means or the expected values of Y1 and Y2.*0091

*I think that, that definition does not offer a lot of intuitive light onto what covariance means.*0098

*I will talk about the intuition, maybe on the next slide.*0105

*In the meantime, I will give you some useful formulas for calculating covariance, *0108

*because that definition is also not very useful for calculating covariance.*0113

*Often, the easiest way to calculate covariance is to use this formula right here, where you calculate the expected value of Y1 × Y2.*0117

*And then, you subtract off expected value of Y1 × the expected value of Y2.*0128

*That is usually the easiest way to calculate it.*0134

*By the way, for each one of these, you are going to have to calculate the expected value of a function of random variables.*0136

*We learned how to do that, in the previous lecture.*0144

*If you are not sure how you would calculate the expected value for example of Y1 × Y2, *0147

*what you want to do is watch the previous lecture here on the probability series on www.educator.com.*0153

*I went through some examples where we practiced calculating things like that.*0161

*You can see how you would calculate that.*0166

*A useful point here is that, if you ever have to calculate the covariance of Y1 with itself, *0169

*it is exactly equal to the variance of Y1.*0176

*If you have to calculate the covariance of any single variable with itself, it is just the same as the variance.*0180

*The way covariance behaves under scaling is that, if you multiply these variables by constants*0186

*then those constants just pop out, and you just get C² coming out to the outside.*0193

*That is very useful if you have to deal with linear functions, which is something*0198

*we are going to talk about later on, in this lecture.*0203

*That is the definition, which I do not really recommend that you use that often.*0208

*The definition for me is not useful, when calculating covariance.*0212

*These formulas are much more useful, especially this first one, it is the one that I use all the time,*0216

*when I’m calculating covariance.*0221

*That is definitely worth committing to memory.*0223
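For reference, the definition and the formulas described above can be written out as follows. This is a sketch of what the slide presumably shows; the constant names a and b are chosen here for illustration. With a = b = c, the last line gives the c² factor mentioned in the lecture.

```latex
\begin{align*}
\operatorname{Cov}(Y_1, Y_2) &= E\big[(Y_1 - \mu_1)(Y_2 - \mu_2)\big] \\
                             &= E(Y_1 Y_2) - E(Y_1)\,E(Y_2) \\
\operatorname{Cov}(Y_1, Y_1) &= V(Y_1) \\
\operatorname{Cov}(a Y_1, b Y_2) &= ab\,\operatorname{Cov}(Y_1, Y_2)
\end{align*}
```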

*I have not really told you yet what covariance is actually measuring.*0227

*Let me spend the next slide talking about that.*0233

*The intuition for covariance is that, it is a measure of dependence between your two variables Y1 and Y2.*0236

*It measures how closely Y1 and Y2 track each other.*0245

*There is an easy mistake that students make, when you are first learning about probability, *0254

*which is to think that dependence means doing the same thing,*0259

*that if two variables are dependent, they must do the same thing.*0261

*That is not quite what dependence means, what dependence means is that knowing about what one variable does,*0265

*gives you more information about what the other variable does.*0273

*It does not necessarily mean that they move together.*0278

*It just means that one variable gives you some kind of guide, as to what the other variable is doing.*0281

*The way that behaves, in terms of covariance, is that if the variables do move together, *0287

*then covariance will be positive.*0294

*It shows that those variables are positively correlated.*0296

*If one is big then you expect the other one to be big.*0300

*If Y1 moves consistently against Y2, which means Y1 is big and the other one is small.*0305

*Or if the first one is small the other one is big.*0312

*They move like this, they move against each other, that is still dependence.*0314

*That would be reflected in the covariance, you will get a negative value for the covariance.*0322

*It means these variables are negatively correlated against each other.*0327

*Let me emphasize here that both of these are still considered to be dependence, to be examples of dependence.*0331

*Both of these show dependence.*0341

*And that is sometimes a little confusing for students, when you are first learning about probability.*0349

*Both of these are examples of dependence.*0354

*You think, wait a second, if two variables are moving against each other, are not those independent, *0357

*that kind of makes your intuition think of them as independent, if they are moving against each other.*0363

*Not so, that is dependent, if they are moving against each other.*0368

*The example that I use with my own students is, to imagine that you are a parent and*0373

*you have 2 twin children, maybe 2 twin boys.*0378

*One of your children is just very well behaved, the boy does everything that you tell him to.*0383

*That is dependence because you can sort of control what that boy does, *0390

*by telling him to do something and he does it.*0394

*Imagine that your other twin boy is very mischievous.*0397

*He always does the opposite of what you tell him to do. *0401

*Whatever you tell him to do, he does the opposite.*0404

*If you tell him to go to bed, then he runs around and plays.*0407

*If you tell him to run around and play, then he goes to bed.*0411

*You might think, that is a very independent child but that is not an independent child*0416

*because you can still control that child by using reverse psychology.*0422

*If you want him to go to bed, tell him to run around and play, and he will go to bed.*0426

*If you want him to run around and play, tell him to go to bed and he will run around and play.*0431

*You can still control that child because that child is still responding to your commands, *0436

*he is just responding in the opposite fashion.*0442

*You can still control that child, just by using reverse psychology.*0444

*That is dependence and that is kind of the situation that you want to think of here,*0448

*when you got two variables that move against each other.*0453

*If you had a child that, if you tell him to go to bed, sometimes he goes to bed and sometimes he runs around and plays,*0457

*that would be an independent child, that would be a child you could not control by any kind of psychology.*0463

*That is really how you want to think of independence.*0470

*If you cannot control the actions then that would be independence.*0472

*But, if two variables move together, that is dependence.*0477

*If two variables move against each other consistently, that is still dependence.*0482
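To make the sign of the covariance concrete, here is a minimal sketch using the shortcut formula E(Y1 Y2) - E(Y1) E(Y2). The toy data sets `together` and `against` are hypothetical and are not from the lecture; each listed pair is assumed to be equally likely.

```python
from fractions import Fraction

def covariance(pairs):
    """Covariance of (Y1, Y2) when each listed pair is equally likely.

    Uses the shortcut formula E(Y1*Y2) - E(Y1)*E(Y2).
    """
    n = len(pairs)
    e1 = sum(Fraction(a) for a, _ in pairs) / n
    e2 = sum(Fraction(b) for _, b in pairs) / n
    e12 = sum(Fraction(a) * Fraction(b) for a, b in pairs) / n
    return e12 - e1 * e2

# Hypothetical toy data: Y2 moves with Y1, then against Y1.
together = [(1, 1), (2, 2), (3, 3)]
against = [(1, 3), (2, 2), (3, 1)]

cov_together = covariance(together)  # positive: the variables move together
cov_against = covariance(against)    # negative: the variables move against each other
```

Both results are nonzero, which matches the point that moving together and moving against each other are both forms of dependence.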

*Finally, let me show you how independence enters this picture.*0492

*The theorem is that if Y1 and Y2 are independent then their covariance is always going to be equal to 0.*0496

*Remember, covariance is the measure of dependence.*0504

*If they are independent then the covariance is 0.*0508

*Unfortunately, the converse of that theorem is not true.*0512

*You would like to say, if the covariance is 0 then the variables are independent.*0516

*That is not true, you can have covariance of two variables being 0 and still have some dependence between the variables.*0521

*That is a rather unfortunate result of the mathematics, there is no way for me to fix that.*0533

*I will give you an example of that and that is coming up in the problems that we are about to do, in example 2.*0540

*If you just scroll down, scroll forward in this lecture, you will find an example*0549

*where the covariance is going to come out to be 0, and the Y1 and Y2 will still be dependent.*0556

*That is kind of unfortunate, it would be very nice if this theorem worked in both directions.*0564

*It does work in one direction but it does not work in the other direction.*0568
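A small discrete illustration of why the converse fails, separate from the lecture's own example 2: assume Y1 takes the values -1, 0, 1 with equal probability and Y2 = Y1². This setup is hypothetical and chosen only because the arithmetic is exact.

```python
from fractions import Fraction

# Hypothetical example: Y1 is -1, 0, or 1 with equal probability, and Y2 = Y1**2.
outcomes = [(-1, 1), (0, 0), (1, 1)]
p = Fraction(1, 3)

e1 = sum(p * y1 for y1, _ in outcomes)          # E(Y1)
e2 = sum(p * y2 for _, y2 in outcomes)          # E(Y2)
e12 = sum(p * y1 * y2 for y1, y2 in outcomes)   # E(Y1*Y2)
cov = e12 - e1 * e2                             # covariance by the shortcut formula

# Independence would require P(Y1=0, Y2=0) == P(Y1=0) * P(Y2=0).
p_joint = p        # only the outcome (0, 0) has Y1 = 0 and Y2 = 0
p_product = p * p  # P(Y1=0) = 1/3 and P(Y2=0) = 1/3
independent = (p_joint == p_product)
```

Here `cov` is zero while `independent` is false: Y2 is completely determined by Y1, yet the covariance does not detect it.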

*One new concept for this lecture is the correlation coefficient.*0575

*This is very closely related to covariance.*0580

*In fact, if you are studying along in your book, it will probably be mentioned in the same section as covariance.*0583

*You start out with random variables, you calculate their covariance.*0590

*Remember, we learned about that a couple of slides ago.*0594

*You can go back and look at the definition of covariance.*0597

*What you do is you just divide by their standard deviations.*0600

*The correlation coefficient is really just a scaled version of the covariance.*0604

*You just take the covariance and you scale it down, by a couple of constants.*0611

*The point of the correlation coefficient is that, if you did multiply each one of your variables by a constant *0615

*then the constant washes out of the definition of the correlation coefficient.*0624

*You end up with the same correlation coefficient, that you would have in the first place.*0630

*By the way, this Greek letter is pronounced ρ.*0636

*This is the Greek letter ρ, ρ of a scaled version of the variables comes out to be the same ρ.*0639

*That is very nice, that means that ρ, the correlation coefficient is independent of scale.*0648

*It is convenient, if you are taking measurements.*0654

*It does not matter, if you are measuring in inches, feet, or meters, or whatever.*0657

*You are still going to get the same correlation coefficient.*0660

*In particular, the correlation coefficient will always be between -1 and 1.*0664

*It is an absolute constant that, you can discuss correlation coefficients between different sets of data.*0670

*You will always know that you are working on a scale between -1 and 1.*0678

*That is not true with covariance, covariance can be as big or as negative as you can imagine.*0682

*But, correlation coefficient is always between -1 and 1, it is a sort of a universal scale.*0689

*There is going to be an example in this lecture where we will calculate the correlation coefficient.*0696

*At the end, we will actually translate it into a decimal and we will just check that it is between -1 and 1.*0701

*If it does come out to be between -1 and 1, then that is a little signal that we have probably done our work right.*0706

*If it comes out to be bigger than 1, we know that we have done something wrong.*0713
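The scale independence of ρ can be checked with a short sketch. The function name `correlation` and the measurement data are hypothetical; each pair is assumed equally likely, and the rescaling uses a positive constant (feet to inches).

```python
import math

def correlation(pairs):
    """Correlation coefficient of equally likely (Y1, Y2) pairs: Cov / (sd1 * sd2)."""
    n = len(pairs)
    m1 = sum(a for a, _ in pairs) / n
    m2 = sum(b for _, b in pairs) / n
    cov = sum((a - m1) * (b - m2) for a, b in pairs) / n
    var1 = sum((a - m1) ** 2 for a, _ in pairs) / n
    var2 = sum((b - m2) ** 2 for _, b in pairs) / n
    return cov / math.sqrt(var1 * var2)

# Hypothetical measurements in feet, then the same data converted to inches.
feet = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0)]
inches = [(12 * a, 12 * b) for a, b in feet]

rho_feet = correlation(feet)
rho_inches = correlation(inches)  # same value up to rounding: the factors of 12 cancel
```

The covariance itself grows by a factor of 144 under this change of units, but dividing by the two standard deviations removes that factor.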

*The next topic that we need to learn about is linear functions of random variables.*0717

*Let us start out with a collection of random variables, Y1 through YN.*0723

*We have means or expected values, remember expected value and mean are the exact same thing.*0727

*Means are μ1 to μn and the variances are σ1² through σ N².*0735

*What we want do is build up this linear combination, this linear function A1 Y1 up to AN YN.*0743

*We are building a linear function out of the Yi.*0751

*We want to find the expected value of that construction.*0756

*It turns out to be just exactly what you would think and hope it would be, *0759

*which is just A1 × expected value of Y1 up to AN × the expected value of YN.*0764

*That is just because expectation is linear, it works very well and gives you what you would expect.*0772

*The variance is not so nice, it is a little bit trickier.*0780

*The variance of A1 Y1 up to AN YN, first of all, these coefficients get squared.*0784

*You have A1² and AN².*0792

*And then, the σ 1², remember that is the variance of Y1 up to σ N² is the variance of YN.*0796

*That is not all, there is another term on this formula which is that, *0809

*you have to look at all the covariances of all the variables with each other.*0813

*You look at all the covariances of the i and j, and for each pair, if you have Y1 and Y3 or Y2 and Y5,*0819

*for each one of those pairs, you take the coefficients of each one, the Ai and Aj, times the covariance of that pair.*0830

*You add them up and you multiply all these by 2.*0836

*The reason you are multiplying by 2 is because you are doing Y1 with Y3.*0839

*And then, later on you will be doing Y3 with Y1.*0844

*That is why we get that factor of 2 in there, it is because you get each pair in each order.*0848

*That is what you would get for the variance of a linear combination of random variables.*0857
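Written out, the two results just described are the following, using the lecture's symbols (coefficients a_i, means μ_i, variances σ_i²). The second sum runs over each pair once, which is where the factor of 2 comes from.

```latex
\begin{align*}
E(a_1 Y_1 + \cdots + a_n Y_n) &= a_1 \mu_1 + \cdots + a_n \mu_n \\
V(a_1 Y_1 + \cdots + a_n Y_n) &= \sum_{i=1}^{n} a_i^2 \sigma_i^2
  + 2 \sum_{1 \le i < j \le n} a_i a_j \operatorname{Cov}(Y_i, Y_j)
\end{align*}
```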

*We will study some examples of that, so you got a chance to practice this.*0864

*There is one more formula which is, when you want to calculate the covariance of two linear combinations.*0868

*The covariance of A1 Y1 up to AN YN and B1 X1 up to BM XM, the covariance of those two things together.*0876

*It actually behaves very nicely, you just take the covariance of all the individual pairs.*0890

*You can factor out the coefficients and then you just add up that sum.*0896

*All the pairs, YI × XJ or covariance of YI with XJ, and then you just put on the coefficients Ai and B sub J.*0901

*The covariance behaves really quite nicely with respect to linear combinations.*0914
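In symbols, the covariance of two linear combinations described above is:

```latex
\[
\operatorname{Cov}\!\left(\sum_{i=1}^{n} a_i Y_i,\ \sum_{j=1}^{m} b_j X_j\right)
  = \sum_{i=1}^{n} \sum_{j=1}^{m} a_i b_j \operatorname{Cov}(Y_i, X_j)
\]
```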

*That is a lot of background material, I hope that I have not lost you yet.*0920

*I want to jump in and we will solve some examples, and we will see how all these formulas play out in practice.*0925

*In example 1, we have got a joint density function.*0933

*This terminology might be a little unfamiliar to people who are just joining me.*0938

*That colon means defined to be.*0942

*We are defining the joint density to be and that = means always equal to.*0949

*It is like equal but it is sort of saying that, no matter what Y1 and Y2 are, this is always equal to 1.*0959

*Let us see, this is over the triangle with corners at (-1, 0), (0, 1), and (1, 0).*0966

*I will go ahead and graph that because we are going to end up calculating some double integrals here.*0975

*Let me use the formulas that we learned in the previous lecture, on expected values of functions of random variables.*0981

*If you did not watch that lecture, you really want to go back and watch that lecture before this example will make sense.*0989

*There is (-1, 0), and there is (0, 1), and there is (1, 0).*0999

*The region we are talking about here, let me go ahead and put my scales on here.*1007

*This is Y1 and this is Y2, there is -1, there is 1, there is 1, and 0.*1014

*This region that we are talking about is this triangular region, that is sort of a triangle here.*1023

*Since, I'm going to be using some double integrals to calculate these expected values,*1031

*I want to describe this region.*1036

*I think the best way to describe it, is by listing Y2 first and then listing Y1 as varying between these two lines.*1039

*Otherwise, I would have to chop this up into two separate pieces.*1049

*It is really more work than what I want to do.*1052

*Let me try to find the equation of those lines.*1055

*The line back here is Y2 = Y1 + 1.*1057

*That is just following slope intercept form, the slope is 1 and Y intercept is 1.*1063

*It is like Y = X + 1.*1070

*If I solve for that in terms of Y1, that is Y1 = Y2 -1.*1072

*This line right here has slope -1, so it is Y2 = -Y1 + 1.*1079

*If I solve for Y1, I get Y1 would be 1 - Y2, that is that line.*1089

*If I want to describe that region, I can describe it in terms of Y2 first.*1097

*Y2 goes from 0 to 1 and Y1 goes from that left hand diagonal line Y2 -1, to the right hand diagonal line which is 1 - Y2.*1101

*I got a description of the region, I need to set up some double integrals.*1121

*Let me set up a double integral for expected value of Y1.*1125

*All these double integrals will have the same bound, that is one small consolation.*1130

*The expected value of Y1 will be the integral as Y2 goes from 0 to 1.*1135

*Y1 goes from Y2 -1 to 1 - Y2, just following those limits there.*1142

*I’m calculating the expected value of Y1.*1155

*I will put Y1 here and I want to put the density function that is just 1.*1158

*DY1, that is the inside one, and DY2.*1165

*Notice here that, the function F = Y1 that is what I’m integrating.*1170

*That is positive on the right hand triangle and negative on the left hand triangle.*1176

*It is symmetric, and the shape of the region we are integrating over is symmetric too.*1181

*This double integral is equal to 0, by symmetry.*1188

*This thing is completely balanced around the Y2 axis.*1193

*I can work out that double integral, it is not a very hard double integral.*1200

*But, I'm feeling a little lazy today and I do not think I want to work that out.*1204

*I’m just going to say by symmetry it is equal to 0, if I did work out that integral.*1209

*It might not be a bad idea, if you are little unsure of this to work out the double integral and*1214

*really make sure that you get 0, so that you believe this, if it seems a little suspicious to you.*1219

*In the meantime, I will go ahead and calculate the expected value of Y2.*1226

*Same double integral, at least the same limits, Y2 = 0, Y2 = 1, and Y1 = Y2 -1, and Y1 = 1 - Y2.*1230

*I'm finding the expected value of Y2, I do Y2 × 1 DY1 DY2.*1248

*Y2 is not symmetric over this region, because it ranges from 0 to 1.*1258

*I can not get away with invoking symmetry, again.*1264

*I have to actually do this double integral.*1268

*I will go ahead and do it, it is not bad.*1273

*I'm integrating Y2 with respect to Y1.*1275

*Y2 is just a constant, that can give me Y2 × Y1.*1278

*I evaluate that from Y1 = Y2 - 1 to Y1 = 1 - Y2.*1285

*I get Y2 × [(1 - Y2) - (Y2 - 1)].*1296

*That is Y2 ×, it looks like -2Y2 -2.*1314

*That is -2Y2² -2Y2².*1322

*That was all just solving the inside integral, I need to integrate that from Y2 = 0 to Y2 = 1.*1331

*This is all integrating DY2.*1349

*I think I screwed up a negative sign in there.*1357

*It is not -2, it is +2.*1361

*That +2, when I multiply it by Y2, I made a couple of mistakes in there.*1365

*That is 2Y2 - 2Y2², I think that is correct now.*1372

*Let me go ahead and integrate that from Y2 = 0 to Y2 = 1.*1379

*That looks like I’m spacing out on that line right there.*1386

*I think I got it right now.*1390

*I want to integrate that, the integral of 2Y2 is Y2².*1392

*The integral of -2Y2² is -2/3 Y2³.*1400

*I need to evaluate this from Y2 = 0 to Y2 = 1.*1410

*If I plug in Y2 = 1, I will get 1 -2/3, plug in Y2 = 0, I just get nothing.*1417

*I get 1 -2/3 is 1/3, that is my expected value for Y2.*1428
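Since the spoken version took a couple of corrections along the way, here is the same computation of E(Y2) written out cleanly, using lowercase y1 and y2 as the variables of integration:

```latex
\begin{align*}
E(Y_2) &= \int_{0}^{1} \int_{y_2 - 1}^{1 - y_2} y_2 \cdot 1 \; dy_1 \, dy_2 \\
       &= \int_{0}^{1} y_2 \big[(1 - y_2) - (y_2 - 1)\big] \, dy_2 \\
       &= \int_{0}^{1} \big(2 y_2 - 2 y_2^2\big) \, dy_2 \\
       &= \Big[\, y_2^2 - \tfrac{2}{3} y_2^3 \Big]_{0}^{1} = 1 - \tfrac{2}{3} = \tfrac{1}{3}
\end{align*}
```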

*I should have boxed in my answer for expected value of Y1, because that was my first answer up there.*1434

*I had a couple of hitches along the way there, but I think it worked out okay.*1441

*I still need to find the expected value of Y1 Y2.*1446

*Again, that is that same double integral Y2 = 0 to Y2 = 1 and Y1 = Y2 -1 to Y1 = 1 - Y2.*1452

*Where is my function, it is Y1 × Y2 × the density function which is just 1 of DY1 DY2.*1470

*If that is getting a little cut off there, then I will just say that that last symbol was DY2.*1483

*In case, you have trouble reading that.*1489

*This is not as bad as it seems, because the function that I’m integrating Y1 Y2, *1493

*it is going to be positive on the right hand triangle and negative on the left hand triangle.*1500

*They are exactly evenly balanced; the function is symmetric on this triangle.*1506

*That means that whole integral will come out to 0, by symmetry.*1521

*That is because, I had a function that is positive on the right hand part and negative on the left part.*1528

*If that feels a little suspicious to you, go ahead and do the integral.*1534

*It will be a little messy, it is not likely to be the most fun integral in the world, but it is possible.*1538

*It is just tedious algebra, there is really nothing too dangerous in there.*1542

*You can do the integral and you should get 0, at the end.*1547

*It should agree with what I got by symmetry.*1550
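If the symmetry arguments feel suspicious, a rough numerical check is another option besides doing the integrals by hand. The sketch below approximates each double integral with a midpoint Riemann sum over the triangle; the function name `expected_value` and the grid size `n` are illustrative choices, not part of the lecture.

```python
def expected_value(g, n=400):
    """Midpoint Riemann sum of g(y1, y2) * 1 over the triangle with
    corners (-1, 0), (0, 1), (1, 0), where the joint density is 1."""
    h1 = 2.0 / n  # step in y1 across [-1, 1]
    h2 = 1.0 / n  # step in y2 across [0, 1]
    total = 0.0
    for i in range(n):
        y1 = -1.0 + (i + 0.5) * h1
        for j in range(n):
            y2 = (j + 0.5) * h2
            # Keep only grid points inside the triangle: 0 <= y2 <= 1 - |y1|.
            if y2 <= 1.0 - abs(y1):
                total += g(y1, y2) * h1 * h2
    return total

area = expected_value(lambda y1, y2: 1.0)       # total probability, close to 1
e_y1 = expected_value(lambda y1, y2: y1)        # close to 0
e_y2 = expected_value(lambda y1, y2: y2)        # close to 1/3
e_y1y2 = expected_value(lambda y1, y2: y1 * y2) # close to 0
```

The first value confirms that the density integrates to 1 over the region; the other three should agree with the answers found above, up to discretization error.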

*That completes that problem, we will be using the same setup and the same values for example 2.*1555

*I want to make sure that you understand everything we did here,*1565

*because I’m going to take these answers and I’m going to use them for example 2.*1568

*Just to make sure that you are very comfortable with all this stuff,*1574

*before we move on and use these answers in example 2, let me recap the steps here.*1577

*I wanted to look at this triangle.*1583

*First of all, I graphed this triangle based on those 3 corner points that were given to me.*1586

*I set up this triangle and colored in the region here.*1592

*I know that I’m going to be doing a double integral, I was trying to describe the limits of this triangle.*1597

*If I list Y2 first then I can do it just by doing one double integral.*1606

*If I list Y1 first, then I have to chop this thing into two.*1612

*That is why I listed Y2 first going from 0 to 1.*1616

*And then, I had to describe Y1 in terms of these two lines.*1619

*The way I got those two lines was, I found the equations of these two diagonal lines on the sides of the triangles.*1624

*Then, I solve each one for Y1.*1632

*That is where those limits right there came from, in terms of Y1.*1635

*That let me set up my limits of integration on each of my three integrals.*1639

*Those sets of limits, I use the same sets of limits on each one of those integrals.*1644

*Each one of those integrals, I integrated something different.*1655

*They all had this 1 in them, and that 1 came from this one right here, the density function.*1658

*That 1 manifested itself there, there, and there.*1664

*I had the 1, then each one I was integrating something different because *1669

*I was trying to find the expected value of something different.*1672

*Expected value of Y1, integrate Y1.*1676

*Expected value of Y2, integrate Y2.*1678

*Expected value of Y1 Y2, integrate Y1 Y2.*1682

*I know I have set up three integrals and two of them, I notice that the function I’m integrating is symmetric,*1687

*in the sense that Y1 is going to be positive over here and negative over on this region.*1694

*The two sides exactly balance each other out, so I know that I’m going to get 0 if I do that integral.*1701

*If you do not believe that, just do the integral, it would not be that hard.*1707

*It should work out to be 0.*1710

*Same thing over here, Y1 Y2 is positive in the first quadrant, negative in the second quadrant.*1711

*It is going to give me 0, by symmetry.*1720

*That is where I got these two 0’s from.*1722

*If you do not like it, just do the integrals and you can work them out yourself.*1725

*The expected value of Y2, Y2 is positive on both of those triangles.*1729

*I cannot use symmetry, I actually have to do the integral.*1734

*I integrated with respect to Y1, I got Y2 Y1.*1738

*Plugged in my limits, I got an integral in terms of Y2.*1741

*Integrated that, plug in my limits and got the expected value of Y2 to be 1/3.*1746

*Hang onto these 3 answers, we are using it again right away in example 2.*1752

*In example 2, we have the same setup from example 1.*1758

*Let me go ahead and draw out that setup.*1762

*It was this triangle with corners at (-1, 0), (0, 1), and (1, 0).*1764

*Let me draw that triangle, it is the same thing we had in example 1 and the same density function.*1773

*A joint density function is always equal to 1 over that region.*1782

*What we want to do here is, we want to calculate the covariance of Y1 and Y2.*1789

*After that, we are going to ask whether Y1 and Y2 are independent.*1795

*I’m going to use that formula that I gave you for covariance of Y1 and Y2.*1799

*Not the original definition, which I think is not very useful; it is often cumbersome if you use the original definition of covariance.*1804

*We have this great formula for covariance.*1813

*The covariance of Y1 and Y2 is equal to the expected value of Y1 × Y2 - the expected value of Y1 × the expected value of Y2.*1815

*You cannot necessarily separate the variables, it is not necessarily true that*1831

*the expected value of Y1 × Y2 is equal to the expected value of Y1 × the expected value of Y2.*1836

*The reason this formula is useful for us right now is,*1845

*we already figured out each one of these quantities in example 1.*1848

*The work here, and it was quite a bit of work, was done in example 1.*1856

*If you did not just watch example 1, if example 1 is not totally fresh in your mind, just go back and watch it right now.*1862

*You will see where we worked out all of these quantities individually.*1869

*The expected value of Y1 × Y2 was 0.*1874

*The expected value of Y1 was also 0.*1878

*The expected value of Y2 was 1/3.*1881

*That is what we did in example 1.*1884

*This all simplifies down to be 0 here.*1886

*That is our answer for the covariance, the covariance is 0.*1890

*Are Y1 and Y2 independent? This is a very subtle issue here, because you see*1896

*the covariance being 0 and you might think of independence, but that is the converse of our theorem.*1902

*Our theorem says that, if they are independent then the covariance is 0.*1909

*It does not work the other way around.*1915

*We cannot say necessarily that they are independent yet.*1918

*In fact, I want to remind you of a theorem that we had back in another lecture on independent variables.*1921

*From the lecture on independent variables, that was several lectures ago.*1934

*You can go back and check this out, if you do not remember it.*1942

*We have a theorem that the region must be a rectangle.*1945

*And then, there was another condition that the joint density function must factor into two functions.*1964

*One just of Y1 and one just of Y2.*1975

*If both of those conditions were satisfied then the variables are independent, that was if and only if.*1981

*In this case, we do not have a rectangle.*1989

*This is not a rectangle, it is a triangle.*1993

*That theorem tells us that Y1 and Y2 are not independent.*2004

*That should kind of agree with your intuition, if you look at the region, if I tell you for example that Y1 is 0.*2018

*Let me put some variables on here.*2029

*This is Y1 on the horizontal axis and this is Y2 on the vertical axis.*2031

*If I tell you that Y1 is 0 that means we are on this vertical axis, then Y2 could be anything from 0 to 1.*2036

*If I tell you that Y1 is ½ then Y2 cannot be anything from 0 to 1, it could be only as big as ½.*2053

*By changing the values of Y1, I’m changing the possible range of Y2, *2064

*which suggests that knowing something about Y1 would give me new information about Y2.*2071

*That is the intuition for dependence, for variables not being independent.*2077

*You should suspect, by looking at that region, that the variables are not independent.*2086

*This theorem from that old lecture confirms it.*2092
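The observation that the possible range of Y2 changes with Y1 can be stated as a tiny function. The name `y2_range` is hypothetical; it simply encodes the triangle's boundary lines Y2 = Y1 + 1 and Y2 = -Y1 + 1 from example 1.

```python
def y2_range(y1):
    """Possible values of Y2 given Y1 = y1 on the triangle with corners
    (-1, 0), (0, 1), (1, 0): Y2 runs from 0 up to 1 - |y1|."""
    if not -1 <= y1 <= 1:
        raise ValueError("y1 must lie between -1 and 1")
    return (0.0, 1.0 - abs(y1))

range_at_zero = y2_range(0)    # Y2 can be anything from 0 to 1
range_at_half = y2_range(0.5)  # Y2 can only go up to 1/2
```

Because the range depends on the value of Y1, knowing Y1 gives information about Y2. For independent variables on a rectangular region, the range would be the same for every value of Y1.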

*Just to recap here, we are asked to find the covariance.*2097

*I'm using the formula that I gave you for covariance.*2101

*I think it is on the first or second slide of this lecture, just scroll back and you will see it,*2105

*the definition and formulas for covariance.*2111

*It expands out into these three expected values.*2113

*We calculated all these in example 1.*2117

*I just grabbed the old values from example 1 and simplify down, just got 0,*2119

*which might make you think independent because we had this theorem that if they are independent, then the covariance is 0.*2125

*The converse of that theorem is not true.*2133

*This is kind of the classical example of that.*2137

*Variables can have a covariance equal to 0 and not be independent.*2141

*This is really what this example is showing, not be independent.*2157

*It is true that if they are independent then the covariance is 0, but they can have covariance 0 and in this case,*2161

*since the region is not a rectangle, they are not independent.*2168

*In example 3, we have independent random variables.*2176

*We have been given their means and their variances.*2179

*We are given a couple of linear functions: U1 is Y1 + 2Y2 and U2 is Y1 - Y2.*2182

*We want to calculate the variance of U1 and U2.*2190

*Let me show you how those work out.*2195

*For the variance of U1: U1, by definition, is Y1 + 2Y2.*2197

*I'm going to use the theorem that we had on linear functions of random variables.*2212

*This was on the introductory slides, you can go back and just read back about that theorem, *2221

*about linear functions of random variables.*2229

*What that told me is that, a linear combination of random variables distributes out *2232

*but you square the coefficients: 1² × σ1², the variance of Y1.*2238

*Let me write that as the variance of Y1, 1² × the variance of Y1 + 2² × the variance of Y2.*2247

*And then, there is this other term which is kind of obnoxious but we have to deal with it.*2257

*It is 2 × the coefficients 1 × 2 × the covariance of Y1 with Y2.*2261

*That is by the theorem from the beginning of this lecture.*2281

*I think the title is linear functions of random variables.*2290

*That was the theorem that we had.*2296

*Let me plug in: 1² is just 1, and the variance of Y1 we are given is 4; then + 2², which is 4,*2299

*× the variance of Y2, which was given to be 9; then + 4*2306

*× the covariance of Y1 with Y2. What we are given is that Y1 and Y2 are independent.*2313

*If they are independent, then the covariance is 0.*2320

*Remember, in example 2, we learned the converse is not true.*2324

*But, it is true that if they are independent then the covariance is 0.*2326

*That right there is by independence and by the theorem that we learned earlier on in this lecture.*2331

*We just simplify, 4 + 4 × 9 is 4 + 36 which is 40.*2341

*That is the variance of U1. For the variance of U2: by definition, U2 is Y1 - Y2.*2349

*I will just do the same thing, I’m going to expand it out using that theorem.*2360

*We get 1² × the variance of Y1 + (-1)² × the variance of Y2; I’m using that theorem,*2365

*you have to square the coefficients.*2375

*+ 2 × 1 × (-1), those are the coefficients, × the covariance of Y1 with Y2.*2378

*We just get 1 × the variance of Y1, which we are given is 4, + (because (-1)² gives us a positive) the variance of Y2, which is 9.*2390

*Again, the covariance of Y1 Y2, by independence is 0.*2403

*This just simplifies down to 13 is the variance of U2, and we are done.*2408
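Written out as equations, the two calculations above look like this. This is just a typeset version of the spoken steps, using V for variance and Cov for covariance, with the given values V(Y₁) = 4 and V(Y₂) = 9:

```latex
\begin{align*}
V(U_1) &= V(Y_1 + 2Y_2)
        = 1^2\,V(Y_1) + 2^2\,V(Y_2) + 2(1)(2)\,\operatorname{Cov}(Y_1, Y_2) \\
       &= 1 \cdot 4 + 4 \cdot 9 + 4 \cdot 0 = 40, \\[4pt]
V(U_2) &= V(Y_1 - Y_2)
        = 1^2\,V(Y_1) + (-1)^2\,V(Y_2) + 2(1)(-1)\,\operatorname{Cov}(Y_1, Y_2) \\
       &= 1 \cdot 4 + 1 \cdot 9 - 2 \cdot 0 = 13.
\end{align*}
```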

*To recap the steps there, the variance of U1, expand that out.*2416

*U1 was defined to be Y1 + 2Y2.*2420

*We had this theorem on linear functions of random variables.*2425

*Go back and check it out, it told us how to find the variance of a linear combination.*2428

*You kind of expand out the variances, you have to square the coefficients.*2435

*That 1² comes from the sort of a hidden 1 right there, and that 2² comes from there.*2439

*This 2 comes from the theorem.*2448

*This 1 and this 2 come from the coefficients, there and there.*2451

*This covariance comes from the theorem as well.*2456

*We are given that Y1 and Y2 are independent, independence tells us that their covariance is equal to 0.*2464

*That was a theorem that we also had earlier on in this lecture.*2471

*The variance of Y1 is 4, that came from here.*2476

*The variance of Y2 is 9, that came from the stem of the problem here.*2481

*We drop those in, simplify down, and we get 4 + 4 × 9 is 40.*2486

*The variance of U2 works exactly the same way, except that the coefficients now,*2493

*instead of being 1 and 2, are 1 and -1.*2498

*The covariance drops out because they are independent.*2501

*We get 4 and 9, remember the -1 squares because you always square the coefficients with variance.*2505

*That makes a positive and 4 + 9 is 13.*2512

*By the way, we are going to use these values again, in example 4.*2517

*Make sure you understand these values very well, before you move on to example 4.*2525

*We will need them again.*2531

*Example 4, we are going to carry on with the same set of data that we had in example 3.*2534

*Y1 Y2 are independent variables, we have been given their means and their variances.*2540

*We are going to let U1 and U2 be these linear combinations.*2547

*I want to find the covariance of U1 and U2.*2551

*And then, I'm also going to find this ρ here, which is the correlation coefficient.*2557

*We learned about that at the beginning of this lecture.*2561

*Let me expand out the covariance of U1 and U2.*2565

*The covariance of U1 and U2 is the covariance of Y1 + 2Y2 and Y1 - Y2, that is just the definition of U1 and U2.*2569

*And then, there was a theorem that we had back at the beginning of this lecture *2584

*on how you can expand covariances of linear combinations of random variables.*2590

*It says the covariance expands out linearly.*2598

*You can almost think of FOIL: first, outer, inner, last.*2604

*It is the covariance of Y1 with Y1 - the covariance of Y1 with Y2 + 2 × the covariance of Y2 with Y1.*2608

*I’m just expanding it out, first, outer, inner, and last: -2 × the covariance of Y2 with Y2.*2629

*It is just expanding out each term with each term.*2640

*We are going to use a couple other facts here.*2645

*Remember that the covariance of Y1 with itself is just the variance of Y1.*2647

*That was something we learned on the very first slide of this lecture, *2653

*it was the one where we introduced covariance.*2660

*I gave you first the definition of covariance and then a couple of useful formulas.*2663

*One useful formula was that the covariance of a variable with itself is just the variance.*2667

*We also had a theorem that said, if Y1 and Y2 are independent, and they are given to be independent here,*2672

*then the covariance is 0.*2679

*This is 0 by independence.*2681

*This covariance is also 0, by independence.*2689

*Covariance of Y2 with itself is the variance of Y2, that same formula that we had on the first slide.*2693

*This is the variance of Y1 - 2 × the variance of Y2.*2703

*Where is the variance of Y1? There it is, it is 4; then - 2 × the variance of Y2, which is 9.*2708

*That is 4 - 18, which is -14; that is our covariance.*2717
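The FOIL-style expansion above can be written out as follows. This is only a typeset version of the spoken steps, with Cov(Y₁, Y₂) = 0 by independence and the given variances 4 and 9:

```latex
\begin{align*}
\operatorname{Cov}(U_1, U_2)
  &= \operatorname{Cov}(Y_1 + 2Y_2,\; Y_1 - Y_2) \\
  &= \operatorname{Cov}(Y_1, Y_1) - \operatorname{Cov}(Y_1, Y_2)
     + 2\operatorname{Cov}(Y_2, Y_1) - 2\operatorname{Cov}(Y_2, Y_2) \\
  &= V(Y_1) - 0 + 2 \cdot 0 - 2\,V(Y_2) \\
  &= 4 - 2 \cdot 9 = -14.
\end{align*}
```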

*We also have to find the correlation coefficient.*2726

*Let me remind you that the correlation coefficient, how that is defined.*2729

*The correlation coefficient, ρ of U1 U2, by definition is the covariance of U1 with U2*2734

*divided by the product of the standard deviation of U1 and the standard deviation of U2.*2750

*I’m going to use σ for the standard deviation of U1 × the standard deviation of U2.*2758

*That σ is the σ for U1, it is not the σ for Y1.*2764

*It is not the σ 1 right here, that σ U2 is not the σ 2 there.*2769

*That was a mistake that I accidentally made, when I was making a rough draft of these notes.*2776

*Do not make the same mistake I did.*2781

*The covariance of U1 and U2 is -14, we just figured that out.*2785

*The standard deviation of U1, I’m going to invoke what I figured out on the previous example.*2790

*The standard deviation of U1 is just the square root of the variance of U1, and the same for U2, *2801

*the square root of the variance.*2810

*The standard deviation is always the square root of the variance, that is the definition.*2811

*I figured out the variances of U1 and U2 in the previous example, example 3.*2817

*I’m going to look up those values right there.*2826

*We worked this out in example 3.*2831

*We have -14 in the numerator.*2839

*In example 3, we figured out that the variance of U1 was 40, so we have √40.*2841

*The variance of U2 was 13, that is what we figured out in example 3.*2848

*If you did not just watch example 3 and you think those numbers appeared magically,*2856

*go back and watch example 3, and you will see where they came from.*2860

*This does not get much better: this is -14 over √40 × √13; from √40 we can pull out a 4 and make that 2√10.*2864

*With √13 in the denominator, if we cancel a 2, we get -7/√(10 × 13), which is -7/√130.*2874

*Not a particularly nice number; I could not figure out a way to rig these numbers to behave nicely.*2886

*I did throw this into a calculator, and what did I get there?*2892

*This was about 0.614, that is an approximation, and it is negative.*2902

*That is what I got when I plug that number into a calculator, nothing very revealing there.*2911

*Let me mention that one thing we knew about the correlation coefficient, *2917

*I gave you this way back on the third slide of this lecture.*2923

*When I talked about the correlation coefficient, I said that the whole point of it is that it is scale independent.*2929

*It is always between -1 and 1, and this is between -1 and 1 because it is -0.6.*2940

*That is slightly reassuring, if it had been outside of that range then*2949

*I would have known that I have made a mistake, somewhere through here.*2955

*The fact that I got a number in between -1 and 1 does not guarantee that I am right, *2958

*but it makes me feel a little more confident in my work here.*2962
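To double-check the numbers from examples 3 and 4 together, here is a minimal sketch that treats variance and covariance of linear combinations as one double sum over coefficients. The helper name `bilinear_cov` and the variable names are assumptions made for this sketch; the only inputs taken from the lecture are the variances 4 and 9, the independence of Y1 and Y2, and the definitions of U1 and U2.

```python
import math

# Covariance matrix of (Y1, Y2): variances 4 and 9 are given,
# and the off-diagonal entries are 0 because Y1 and Y2 are independent.
cov_y = [[4.0, 0.0],
         [0.0, 9.0]]

a = [1.0, 2.0]    # coefficients of U1 = Y1 + 2*Y2
b = [1.0, -1.0]   # coefficients of U2 = Y1 - Y2

def bilinear_cov(u, v, cov):
    """Cov(sum_i u_i Y_i, sum_j v_j Y_j) = sum_i sum_j u_i v_j Cov(Y_i, Y_j)."""
    return sum(u[i] * v[j] * cov[i][j]
               for i in range(len(u))
               for j in range(len(v)))

var_u1 = bilinear_cov(a, a, cov_y)      # variance is covariance with itself
var_u2 = bilinear_cov(b, b, cov_y)
cov_u1_u2 = bilinear_cov(a, b, cov_y)

# Correlation coefficient: covariance over the product of standard deviations.
rho = cov_u1_u2 / (math.sqrt(var_u1) * math.sqrt(var_u2))

print(var_u1, var_u2, cov_u1_u2, round(rho, 3))
```

Note that the denominator uses the standard deviations of U1 and U2 (square roots of 40 and 13), not those of Y1 and Y2.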

*We got answers for both of those, let me show you how I got those.*2966

*The covariance of U1 and U2, I expanded out the definition of U1 and U2 which is Y1 + 2Y2 and Y1 -Y2.*2970

*We had this theorem on one of the introductory slides to this lecture which said, *2982

*how you can expand out covariances of linear combinations.*2987

*It is very well behaved, it just expands out the same way you would multiply together binomials.*2991

*You can think of foiling things together.*2996

*We did the first Y1 and Y1, the outer Y1 and Y2, subtracted because of the coefficient there.*2999

*Let me write out FOIL here, if you remember your high school algebra: first, outer, inner, last.*3008

*The inner term is 2 × the covariance of Y2 with Y1, and the last term is -2 × the covariance of Y2 with Y2.*3013

*These mixed terms, the Y1 and Y2, remember we were given that the variables are independent.*3023

*Independent variables have covariance 0.*3030

*That is not true in the other direction, just because they have covariance 0, does not mean they are independent.*3033

*If they are independent, then the covariance is definitely 0, which is great, those two terms drop out.*3039

*The other thing we learned by an early formula, when I first taught you about covariance*3045

*is that the covariance of Y1 with itself is just the variance of Y1 and same thing for Y2.*3052

*I can just drop in my value for the variance, there it is right there.*3061

*The variance of Y1 is 4, variance of Y2 is 9, and drop those in and simplifies down to -14.*3064

*The correlation coefficient ρ, by definition, is the covariance of those two variables divided by the product of their standard deviations.*3074

*I just figured out the covariance, that is the -14.*3083

*The standard deviations are always the square root of their variances, that is the definition of standard deviation.*3086

*Take their variance and take the square root.*3093

*The variance of U1 and U2, that is what I calculated back in example 3.*3096

*Just go back and watch example 3, you will see where these numbers 40 and 13 are coming from.*3103

*It is not these numbers right here, the σ 1² and σ 2² because those were the variances for Y1 and Y2.*3109

*Here, we want the variances of U1 and U2.*3118

*Once I got those numbers in there, I reduced the square roots a little bit, but it did not end up being a very nice number.*3122

*The reassuring thing was that when I found the decimal, it was between -1 and 1, *3129

*which is the range that a correlation coefficient should always land in.*3135

*Last example here, we got independent variables but they all have the same mean and the same variance.*3141

*We want to find the mean and the variance of the average of those variables.*3148

*The average just means, you add them up and divide by the number of variables you have.*3152

*It is 1/N Y1 + ... up to 1/N YN.*3158

*I wrote it that way to really suggest that, that is a linear combination of the original random variables.*3162

*This is a linear function and we can use our theorem on how you calculate means and variances of linear combinations.*3169

*That is what I'm going to use.*3183

*The expected value is the same as the mean.*3185

*Remember, we want the mean and variance of the average, Y bar.*3188

*The expected value of Y bar is just the expected value of 1/N Y1 up to 1/N YN.*3195

*I can distribute by linearity of expectation, that was a theorem that we had.*3208

*I can distribute and pull out those coefficients 1/N × E of Y1 up to 1/N × E of YN.*3217

*That is 1/N × μ up to 1/N × μ.*3231

*They all have the same mean μ.*3236

*If you add up N copies of 1/N × μ, you just get a single copy of μ.*3239

*That is my expected value of the average.*3249

*The variance of the average, variance of Y bar, again is the variance of 1/N Y1 + up to 1/N YN.*3252

*Variance is not linear, expectation is linear.*3265

*There is a nastier theorem that tells you what to do with variance.*3270

*I gave you that theorem in one of the introductory slides.*3277

*I said linear functions of random variables.*3280

*The way this works is, you pull out the coefficients but you square them.*3284

*1/N² × the variance of Y1 up to 1/N² × the variance of YN.*3290

*Then there are the cross terms, which are 2 × the sum, as i is bigger than j, *3300

*of the coefficients 1/N × 1/N × the covariance of Yi with Yj.*3310

*That looks pretty dangerous there but, let us remember that we are given that we have independent variables.*3321

*For any Yi and Yj, if you take their covariance, since they are independent, this covariance will be 0.*3331

*That is really nice, that means I can just focus on the first few terms here, 1/N² ×*3342

*the variance of Y1 is σ² + up to 1/N² × σ².*3350

*Let me write that a little more clearly, that is 1/N² in the denominator there.*3360

*What I have is N terms here of 1/N² × σ².*3368

*That simplifies down to σ²/N.*3374

*By the way, this is a very fundamental result in statistics.*3379

*This is something that you use very often, as you get into statistics.*3385

*The variance of the mean is equal to σ² divided by N.*3390

*This is where it comes from, this is where the magic starts to happen is right here with this example.*3395
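For reference, the two derivations above can be written out as equations. This is a typeset version of the spoken steps, using the same cross-term convention (a sum over i > j) and the given facts that each Yᵢ has mean μ and variance σ² and that the Yᵢ are independent:

```latex
\begin{align*}
E(\bar{Y}) &= E\!\left(\tfrac{1}{N}Y_1 + \cdots + \tfrac{1}{N}Y_N\right)
            = \tfrac{1}{N}E(Y_1) + \cdots + \tfrac{1}{N}E(Y_N)
            = N \cdot \tfrac{1}{N}\,\mu = \mu, \\[6pt]
V(\bar{Y}) &= \tfrac{1}{N^2}V(Y_1) + \cdots + \tfrac{1}{N^2}V(Y_N)
              + 2\sum_{i > j} \tfrac{1}{N}\cdot\tfrac{1}{N}\,\operatorname{Cov}(Y_i, Y_j) \\
           &= N \cdot \tfrac{1}{N^2}\,\sigma^2 + 0
            = \frac{\sigma^2}{N}.
\end{align*}
```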

*Let me make sure that you understand every step here.*3402

*We want to find the expected value of Y bar.*3405

*Remember that Y bar, the average, can be written as a linear combination of these variables.*3408

*Expectation is linear, that is what we learned in that theorem.*3415

*You can just separate it out into the expected values of Y1 up to YN, then just pull out the constants.*3419

*And then, each one of those E of Yi is μ because we are given that in the problem right there.*3426

*We are adding up N copies of 1/N × μ.*3437

*At the end, we just get a whole μ.*3441

*The variance is a little bit messier, the variance of a linear combination.*3443

*Again, you can split it up into all the separate variances, but when you pull out the coefficients, they get squared.*3450

*That is why we get 1/N² on each of these coefficients.*3457

*There is this cross term, 2 × the sum of the coefficients.*3462

*That is a little messy there but that is 1/N × 1/N.*3467

*That is coming from these coefficients right here, × the covariance of Yi with Yj.*3471

*The fortunate thing is that, we have a theorem that says when two variables are independent, their covariance is 0.*3480

*The converse of that is not true, the covariance could be 0 without independence.*3488

*But if they are independent, their covariance is definitely 0.*3493

*We are given that they are independent here.*3497

*All those cross terms drop out and we are just left with N copies of 1/N² × σ².*3500

*We are given that the variance of each individual variable is σ².*3507

*N × 1/N² is just 1/N, we still have that σ².*3512

*The variance of the mean is σ²/N.*3518

*Essentially, this means if you take the average of many things, *3522

*it is not going to vary as much as individual members of the population will,*3525

*because the variance shrinks down, as you take a larger and larger sample.*3531

*That is a very classic result in statistics, now you know where it comes from.*3537

*Now, you know where this classic formula comes from.*3542
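To see the σ²/N formula in action, here is a minimal sketch with a hypothetical concrete case that is not from the lecture: N independent rolls of a fair six-sided die. It enumerates every possible outcome and computes the mean and variance of the average exactly with fractions. The function names `mean_and_variance` and `average_stats` are made up for this sketch.

```python
from fractions import Fraction
from itertools import product

faces = [1, 2, 3, 4, 5, 6]  # hypothetical example: a fair six-sided die

def mean_and_variance(values):
    """Exact mean and variance of a list of equally likely values."""
    n = len(values)
    mean = sum(values, Fraction(0)) / n
    variance = sum((x - mean) ** 2 for x in values) / n
    return mean, variance

# Mean and variance of a single roll: mu = 7/2, sigma^2 = 35/12.
mu, sigma_sq = mean_and_variance([Fraction(f) for f in faces])

def average_stats(n):
    """Mean and variance of the average of n independent rolls, by enumeration."""
    averages = [Fraction(sum(rolls), n) for rolls in product(faces, repeat=n)]
    return mean_and_variance(averages)

results = {n: average_stats(n) for n in (1, 2, 3, 4)}

for n, (mean, variance) in results.items():
    print(f"N={n}: E(Ybar)={mean}, V(Ybar)={variance}, sigma^2/N={sigma_sq / n}")
```

In each case the mean of the average stays at μ while the variance shrinks to exactly σ²/N, matching the formula derived in this example.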

*That wraps up our lecture, kind of a big one today, on correlation and covariance, and linear functions of random variables.*3546

*This is all part of the chapter on Bivariate distribution functions and Bivariate density functions.*3554

*In fact, this wraps up our chapter on Bivariate density functions and distribution functions.*3562

*We will come back later and talk about distributions of random variables.*3568

*We still have one more chapter to go.*3573

*In the meantime, it is nice to finish our chapter on Bivariate density and distribution functions.*3575

*This is all part of the probability lecture series here on www.educator.com.*3581

*I'm your host and guide, my name is Will Murray, thank you for joining me today, bye now.*3586

1 answer

Last reply by: Dr. William Murray

Tue Jun 17, 2014 12:33 PM

Post by Carl Scaglione on June 13, 2014

Professor Murray, On page 5, referring to the last equation, the summation terms are i>j. Why not show i not equal to j?

Respectfully,

Carl