Lecture Comments (2)

1 answer

Last reply by: Dr. William Murray
Tue Jun 17, 2014 12:33 PM

Post by Carl Scaglione on June 13, 2014

Professor Murray, On page 5, referring to the last equation, the summation terms are i>j.  Why not show i not equal to j?


Covariance, Correlation & Linear Functions


  • Intro 0:00
  • Definition and Formulas for Covariance 0:38
    • Definition of Covariance
    • Formulas to Calculate Covariance
  • Intuition for Covariance 3:54
    • Covariance is a Measure of Dependence
    • Dependence Doesn't Necessarily Mean that the Variables Do the Same Thing
    • If Variables Move Together
    • If Variables Move Against Each Other
    • Both Cases Show Dependence!
  • Independence Theorem 8:10
    • Independence Theorem
    • The Converse is Not True
  • Correlation Coefficient 9:33
    • Correlation Coefficient
  • Linear Functions of Random Variables 11:57
    • Linear Functions of Random Variables: Expected Value
    • Linear Functions of Random Variables: Variance
  • Linear Functions of Random Variables, Cont. 14:30
    • Linear Functions of Random Variables: Covariance
  • Example I: Calculate E (Y₁), E (Y₂), and E (Y₁Y₂) 15:31
  • Example II: Are Y₁ and Y₂ Independent? 29:16
  • Example III: Calculate V (U₁) and V (U₂) 36:14
  • Example IV: Calculate the Covariance and Correlation Coefficient 42:12
  • Example V: Find the Mean and Variance of the Average 52:19

Transcription: Covariance, Correlation & Linear Functions

Hi, welcome back to the probability lectures here on

We are working through a chapter on Bivariate distribution functions and density functions, 0004

which means that there are two variables, there is a Y1 and Y2.0010

In this section, we are also going to have sometimes more than two variables that might be N variables.0016

We got a big section to cover today, it is going to cover covariance and correlation coefficient, 0022

and linear functions of random variables.0029

I will be your guide today, my name is Will Murray, let us jump right on in.0032

The main idea of this section is covariance, the correlation coefficient is not something that is quite as important.0040

Let me jump right on in with the covariance.0046

The definition of covariance is not necessarily very enlightening.0048

Let me go ahead and show you the definition, but then I’m going to skip quickly to some formulas 0054

that are probably more useful in dealing with covariance.0059

In the next slide, I will try to give you an intuition for what covariance means.0063

The definition of the covariance is that, you will have to start with two random variables.0068

You always have a Y1 and Y2.0075

You always talk about the covariance of two random variables at once.0078

By definition, it means the expected value of (Y1 - μ1) × (Y2 - μ2).0082

Here, μ1 and μ2 are the means or the expected values of Y1 and Y2.0091

I think that, that definition does not offer a lot of intuitive light onto what covariance means.0098

I will talk about the intuition, maybe on the next slide.0105

In the meantime, I will give you some useful formulas for calculating covariance, 0108

because that definition is also not very useful for calculating covariance.0113

Often, the easiest way to calculate covariance is to use this formula right here, where you calculate the expected value of Y1 × Y2.0117

And then, you subtract off expected value of Y1 × the expected value of Y2.0128

That is usually the easiest way to calculate it.0134
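For reference, the definition and the shortcut formula described above can be written out as follows. This is a reconstruction from the spoken description, not a copy of the slide; the middle step uses only the linearity of expectation, with μ1 = E[Y1] and μ2 = E[Y2].

```latex
\begin{align*}
\operatorname{Cov}(Y_1, Y_2) &= E\big[(Y_1 - \mu_1)(Y_2 - \mu_2)\big] \\
&= E[Y_1 Y_2] - \mu_1 E[Y_2] - \mu_2 E[Y_1] + \mu_1 \mu_2 \\
&= E[Y_1 Y_2] - E[Y_1]\,E[Y_2]
\end{align*}
```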

By the way, for each one of these, you are going to have to calculate the expected value of a function of random variables.0136

We learned how to do that, in the previous lecture.0144

If you are not sure how you would calculate the expected value for example of Y1 × Y2, 0147

what you want to do is watch the previous lecture here on the probability series on

I went through some examples where we practiced calculating things like that.0161

You can see how you would calculate that.0166

A useful point here is that, if you ever have to calculate the covariance of Y1 with itself, 0169

it is exactly equal to the variance of Y1.0176

If you have to calculate the covariance of any single variable with itself, it is just the same as the variance.0180

The way covariance behaves under scaling is that, if you multiply these variables by constants0186

then those constants just pop out, and you just get their product coming out to the outside.0193

That is very useful if you have to deal with linear functions, which is something 0198

we are going to talk about later on, in this lecture.0203
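In symbols, the two facts just mentioned would presumably read as below. The names c and d for the two constants are an assumption made for illustration; when both variables are scaled by the same constant c, the factor that comes out is c².

```latex
\begin{align*}
\operatorname{Cov}(Y_1, Y_1) &= E[Y_1^2] - (E[Y_1])^2 = V(Y_1) \\
\operatorname{Cov}(c\,Y_1,\; d\,Y_2) &= c\,d\,\operatorname{Cov}(Y_1, Y_2)
\end{align*}
```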

That is the definition, which I do not really recommend that you use that often.0208

The definition for me is not useful, when calculating covariance.0212

These formulas are much more useful, especially this first one; it is the one that I use all the time,0216

when I’m calculating covariance.0221

That is definitely worth committing to memory.0223

I have not really told you yet what covariance actually measures.0227

Let me spend the next slide talking about that.0233

The intuition for covariance is that it is a measure of dependence between your two variables Y1 and Y2.0236

It measures how closely Y1 and Y2 track each other.0245

There is an easy mistake that students make, when you are first learning about probability, 0254

which is to think that,0259

if two variables are dependent, that means that they do the same thing.0261

That is not quite what dependence means, what dependence means is that knowing about what one variable does,0265

gives you more information about what the other variable does.0273

It does not necessarily mean that they move together.0278

It just means that one variable gives you some kind of guide, as to what the other variable is doing.0281

The way that behaves, in terms of covariance, is that if the variables do move together, 0287

then covariance will be positive.0294

It shows that those variables are positively correlated.0296

If one is big then you expect the other one to be big.0300

If Y1 moves consistently against Y2, that means if Y1 is big then the other one is small.0305

Or if the first one is small the other one is big.0312

They move like this, they move against each other, that is still dependence.0314

That would be reflected in the covariance, you will get a negative value for the covariance.0322

It means these variables are negatively correlated against each other.0327

Let me emphasize here that both of these are still considered to be dependence, to be examples of dependence.0331

Both of these show dependence.0341

And that is sometimes a little confusing for students, when you are first learning about probability.0349

Both of these are examples of dependence.0354

You think, wait a second, if two variables are moving against each other, are those not independent? 0357

Your intuition kind of makes you think of them as being independent, if they are moving against each other.0363

Not so, that is dependent, if they are moving against each other.0368

The example that I use with my own students is, to imagine that you are a parent and0373

you have twin children, maybe twin boys.0378

One of your children is just very well behaved, the boy does everything that you tell him to.0383

That is dependence because you can sort of control what that boy does, 0390

by telling him to do something and he does it.0394

Imagine that your other twin boy is very mischievous.0397

He always does the opposite of what you tell him to do. 0401

Whatever you tell him to do, he does the opposite.0404

If you tell him to go to bed, then he runs around and plays.0407

If you tell him to run around and play, then he goes to bed.0411

You might think, that is a very independent child but that is not an independent child0416

because you can still control that child by using reverse psychology.0422

If you want him to go to bed, tell him to run around and play, and he will go to bed.0426

If you want him to run around and play, tell him to go to bed and he will run around and play.0431

You can still control that child because that child is still responding to your commands, 0436

he is just responding in the opposite fashion.0442

You can still control that child, just by using reverse psychology.0444

That is dependence and that is kind of the situation that you want to think of here,0448

when you got two variables that move against each other.0453

If you had a child that, if you tell him to go to bed, sometimes he goes to bed and sometimes he runs around and plays,0457

that would be an independent child, that would be a child you could not control by any kind of psychology.0463

That is really what you want to think of as independence.0470

If you cannot control the actions then that would be independence.0472

But, if two variables move together, that is dependence.0477

If two variables move against each other consistently, that is still dependence.0482
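The three children in this story can be imitated with a small simulation. The sketch below is an illustration only, not something from the lecture: the variable names, the noise level, and the uniform distributions are all assumptions, and it uses only the Python standard library. It estimates covariance with the shortcut formula E(Y1 Y2) - E(Y1) E(Y2).

```python
import random

def sample_covariance(xs, ys):
    """Estimate Cov(X, Y) as mean(x*y) - mean(x)*mean(y)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    mean_xy = sum(x * y for x, y in zip(xs, ys)) / n
    return mean_xy - mean_x * mean_y

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 20000
y1 = [rng.uniform(-1, 1) for _ in range(n)]
noise = [rng.uniform(-0.1, 0.1) for _ in range(n)]

obedient = [a + e for a, e in zip(y1, noise)]        # moves with y1
contrary = [-a + e for a, e in zip(y1, noise)]       # moves against y1
unrelated = [rng.uniform(-1, 1) for _ in range(n)]   # ignores y1

cov_together = sample_covariance(y1, obedient)
cov_against = sample_covariance(y1, contrary)
cov_unrelated = sample_covariance(y1, unrelated)
print(cov_together, cov_against, cov_unrelated)
```

Under these assumptions the first estimate should come out clearly positive, the second clearly negative, and the third close to 0, which matches the picture above: moving together and moving against each other are both dependence.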

Finally, let me show you how independence enters this picture.0492

The theorem is that if Y1 and Y2 are independent then their covariance is always going to be equal to 0.0496

Remember, covariance is the measure of dependence.0504

If they are independent then the covariance is 0.0508

Unfortunately, the converse of that theorem is not true.0512

You would like to say, if the covariance is 0 then the variables are independent.0516

That is not true, you can have the covariance of two variables be 0 and still have some dependence between the variables.0521

That is a rather unfortunate result of the mathematics, there is no way for me to fix that.0533

I will give you an example of that and that is coming up in the problems that we are about to do, in example 2.0540

If you just scroll down, scroll forward in this lecture, you will find an example0549

where the covariance is going to come out to be 0, and the Y1 and Y2 will still be dependent.0556

That is kind of unfortunate, it would be very nice if this theorem worked in both directions.0564

It does work in one direction but it does not work in the other direction.0568

One new concept for this lecture is the correlation coefficient.0575

This is very closely related to covariance.0580

In fact, if you are studying along in your book, it will probably be mentioned in the same section as covariance.0583

You start out with random variables, you calculate their covariance.0590

Remember, we learned about that a couple of slides ago.0594

You can go back and look at the definition of covariance.0597

What you do is you just divide by their standard deviations.0600

The correlation coefficient is really just a scaled version of the covariance.0604

You just take the covariance and you scale it down, by a couple of constants.0611

The point of the correlation coefficient is that, if you did multiply each one of your variables by a positive constant 0615

then the constant washes out of the definition of the correlation coefficient.0624

You end up with the same correlation coefficient, that you would have in the first place.0630

By the way, this Greek letter is pronounced "rho".0636

This is the Greek letter ρ; ρ of a scaled version of the variables comes out to be the same ρ.0639

That is very nice, that means that ρ, the correlation coefficient is independent of scale.0648

It is convenient, if you are taking measurements.0654

It does not matter, if you are measuring in inches, feet, or meters, or whatever.0657

You are still going to get the same correlation coefficient.0660

In particular, the correlation coefficient will always be between -1 and 1.0664

It is an absolute scale, so you can compare correlation coefficients between different sets of data.0670

You will always know that you are working on a scale between -1 and 1.0678

That is not true with covariance, covariance can be as big or as negative as you can imagine.0682

But, correlation coefficient is always between -1 and 1, it is a sort of a universal scale.0689

There is going to be an example in this lecture where we will calculate the correlation coefficient.0696

At the end, we will actually translate it into a decimal and we will just check that it is between -1 and 1.0701

If it does come out to be between -1 and 1, then that is a little signal that we have probably done our work right.0706

If it comes out to be bigger than 1, we know that we have done something wrong.0713
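Written out, the correlation coefficient and its scale invariance would look roughly like this. It is a sketch from the spoken description; σ1 and σ2 are the standard deviations of Y1 and Y2, and the constants c and d are assumed names. The invariance as stated needs both constants to be positive; if exactly one of them is negative, the sign of ρ flips.

```latex
\begin{align*}
\rho(Y_1, Y_2) &= \frac{\operatorname{Cov}(Y_1, Y_2)}{\sigma_1 \sigma_2}, \qquad -1 \le \rho \le 1 \\
\rho(c\,Y_1,\; d\,Y_2) &= \frac{c\,d\,\operatorname{Cov}(Y_1, Y_2)}{(c\,\sigma_1)(d\,\sigma_2)} = \rho(Y_1, Y_2) \qquad (c, d > 0)
\end{align*}
```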

The next topic that we need to learn about is linear functions of random variables.0717

Let us start out with a collection of random variables, Y1 through YN.0723

We have means or expected values, remember expected value and mean are the exact same thing.0727

Means are μ1 to μn and the variances are σ1² through σ N².0735

What we want to do is build up this linear combination, this linear function A1 Y1 up to AN YN.0743

We are building a linear function out of the Yi.0751

We want to find the expected value of that construction.0756

It turns out to be just exactly what you would think and hope it would be, 0759

which is just A1 × expected value of Y1 up to AN × the expected value of YN.0764

That is just because expectation is linear, it works very well and gives you what you would expect.0772

The variance is not so nice, it is a little bit trickier.0780

The variance of A1 Y1 up to AN YN: first of all, these coefficients get squared.0784

You have A1² and AN².0792

And then, the σ 1², remember that is the variance of Y1 up to σ N² is the variance of YN.0796

That is not all, there is another term on this formula which is that, 0809

you have to look at all the covariances of all the variables with each other.0813

You look at all the covariances of Yi and Yj, and for each pair, if you have Y1 and Y3 or Y2 and Y5,0819

for each one of those pairs, you multiply the covariance by the coefficients of each one, the Ai and Aj.0830

You add them up and you multiply all these by 2.0836

The reason you are multiplying by 2 is that each pair shows up in both orders: you have Y1 with Y3,0839

and you also have Y3 with Y1, which has the same covariance.0844

That is why we get that factor of 2 in there, it is because you get each pair in each order.0848

That is what you would get for the variance of a linear combination of random variables.0857
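The two formulas for a linear combination can be summarized as below. This is a reconstruction from the description above rather than the slide itself. The sum over i < j counts each unordered pair once, and the factor of 2 accounts for both orders; summing over all i ≠ j without the 2 gives the same total.

```latex
\begin{align*}
E\Big[\sum_{i=1}^{n} a_i Y_i\Big] &= \sum_{i=1}^{n} a_i \mu_i \\
V\Big(\sum_{i=1}^{n} a_i Y_i\Big) &= \sum_{i=1}^{n} a_i^2 \sigma_i^2 + 2 \sum_{i<j} a_i a_j \operatorname{Cov}(Y_i, Y_j)
\end{align*}
```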

We will study some examples of that, so you get a chance to practice this.0864

There is one more formula which is, when you want to calculate the covariance of two linear combinations.0868

The covariance of A1 Y1 up to AN YN and B1 X1 up to BM XM, the covariance of those two things together.0876

It actually behaves very nicely, you just take the covariance of all the individual pairs.0890

You can factor out the coefficients and then you just add up that sum.0896

All the pairs, YI × XJ or covariance of YI with XJ, and then you just put on the coefficients Ai and B sub J.0901

The covariance behaves really quite nicely with respect to linear combinations.0914
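In symbols, the covariance of two linear combinations described above would presumably be written as a double sum over all pairs (a sketch based on the spoken description):

```latex
\[
\operatorname{Cov}\Big(\sum_{i=1}^{n} a_i Y_i,\; \sum_{j=1}^{m} b_j X_j\Big)
= \sum_{i=1}^{n} \sum_{j=1}^{m} a_i b_j \operatorname{Cov}(Y_i, X_j)
\]
```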

That is a lot of background material, I hope that I have not lost you yet.0920

I want to jump in and we will solve some examples, and we will see how all these formulas play out in practice.0925

In example 1, we have got a joint density function.0933

This terminology might be a little unfamiliar to people who are just joining me.0938

That colon means defined to be.0942

We are defining the joint density to be, and that ≡ means always equal to.0949

It is like equal but it is sort of saying that, no matter what Y1 and Y2 are, this is always equal to 1.0959

Let us see, this is over the triangle with corners at (-1, 0), (0, 1), and (1, 0).0966

I will go ahead and graph that because we are going to end up calculating some double integrals here.0975

Let me use the formulas that we learned in the previous lecture, on expected values of functions of random variables.0981

If you did not watch that lecture, you really want to go back and watch that lecture before this example will make sense.0989

There is (-1, 0), and there is (0, 1), and there is (1, 0).0999

The region we are talking about here, let me go ahead and put my scales on here.1007

This is Y1 and this is Y2, there is -1, there is 1, there is 1, and 0.1014

This region that we are talking about is this triangular region, that is sort of a triangle here.1023

Since, I'm going to be using some double integrals to calculate these expected values,1031

I want to describe this region.1036

I think the best way to describe it, is by listing Y2 first and then listing Y1 as varying between these two lines.1039

Otherwise, I would have to chop this up into two separate pieces.1049

It is really more work than what I want to do.1052

Let me try to find the equation of those lines.1055

The side back here is Y2 = Y1 + 1.1057

That is just following slope intercept form, the slope is 1 and Y intercept is 1.1063

It is like Y = X + 1.1070

If I solve for that in terms of Y1, that is Y1 = Y2 -1.1072

This line right here has slope -1, so it is Y2 = -Y1 + 1.1079

If I solve for Y1, I get Y1 would be 1 - Y2, that is that line.1089

If I want to describe that region, I can describe it in terms of Y2 first.1097

Y2 goes from 0 to 1 and Y1 goes from that left hand diagonal line Y2 -1, to the right hand diagonal line which is 1 - Y2.1101
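Collected in one place, the region and density just described are as follows. The lowercase y1, y2 notation is an assumption; the bounds are the ones derived above from the two diagonal lines.

```latex
\begin{align*}
\text{left side: } & y_2 = y_1 + 1 \;\Longleftrightarrow\; y_1 = y_2 - 1 \\
\text{right side: } & y_2 = -y_1 + 1 \;\Longleftrightarrow\; y_1 = 1 - y_2 \\
f(y_1, y_2) &= 1 \quad \text{for } 0 \le y_2 \le 1,\;\; y_2 - 1 \le y_1 \le 1 - y_2
\end{align*}
```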

I got a description of the region, I need to set up some double integrals.1121

Let me set up a double integral for expected value of Y1.1125

All these double integrals will have the same bound, that is one small consolation.1130

The expected value of Y1 will be the integral as Y2 goes from 0 to 1.1135

Y1 goes from Y2 -1 to 1 - Y2, just following those limits there.1142

I’m calculating the expected value of Y1.1155

I will put Y1 here and I want to put the density function that is just 1.1158

DY1, that is the inside one, and DY2.1165

Notice here that, the function F = Y1 that is what I’m integrating.1170

That is positive on the right hand triangle and negative on the left hand triangle.1176

The positive and negative parts mirror each other, and the shape of the region we are integrating over is symmetric too.1181

This double integral is equal to 0, by symmetry.1188

This thing is completely balanced around the Y2 axis.1193

I can work out that double integral, it is not a very hard double integral.1200

But, I'm feeling a little lazy today and I do not think I want to work that out.1204

I’m just going to say by symmetry it is equal to 0, if I did work out that integral.1209

It might not be a bad idea, if you are little unsure of this to work out the double integral and1214

really make sure that you get 0, so that you believe this, if it seems a little suspicious to you.1219
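For anyone who does want to check it, the double integral works out in two lines (a worked sketch using the limits set up above):

```latex
\begin{align*}
E(Y_1) &= \int_{0}^{1} \int_{y_2 - 1}^{1 - y_2} y_1 \, dy_1 \, dy_2
= \int_{0}^{1} \left[ \frac{y_1^2}{2} \right]_{y_1 = y_2 - 1}^{y_1 = 1 - y_2} dy_2 \\
&= \int_{0}^{1} \frac{(1 - y_2)^2 - (y_2 - 1)^2}{2} \, dy_2 = \int_{0}^{1} 0 \, dy_2 = 0
\end{align*}
```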

In the meantime, I will go ahead and calculate the expected value of Y2.1226

Same double integral, at least the same limits, Y2 = 0, Y2 = 1, and Y1 = Y2 -1, and Y1 = 1 - Y2.1230

I'm finding the expected value of Y2, I do Y2 × 1 DY1 DY2.1248

Y2 is not symmetric over this region, because it ranges from 0 to 1.1258

I can not get away with invoking symmetry, again.1264

I have to actually do this double integral.1268

I will go ahead and do it, it is not bad.1273

I'm integrating Y2 with respect to Y1.1275

Y2 is just a constant, that can give me Y2 × Y1.1278

I’m integrating that, or evaluating that, from Y1 = Y2 - 1 to Y1 = 1 - Y2.1285

I get Y2 × [(1 - Y2) - (Y2 - 1)].1296

That is Y2 ×, it looks like -2Y2 -2.1314

That is -2Y2² -2Y2².1322

That was all just solving the inside integral, I need to integrate that from Y2 = 0 to Y2 = 1.1331

This is all integrating DY2.1349

I think I screwed up a negative sign in there.1357

It is not -2, it is +2.1361

That +2, when I multiply it by Y2, I made a couple of mistakes in there.1365

That is 2Y2 - 2Y2², I think that is correct now.1372

Let me go ahead and integrate that from Y2 = 0 to Y2 = 1.1379

That looks like I’m spacing out on that line right there.1386

I think I got it right now.1390

I want to integrate that, the integral of 2Y2 is Y2².1392

The integral of -2Y2² is -2/3 Y2³.1400

I need to evaluate this from Y2 = 0 to Y2 = 1.1410

If I plug in Y2 = 1, I will get 1 -2/3, plug in Y2 = 0, I just get nothing.1417

I get 1 -2/3 is 1/3, that is my expected value for Y2.1428
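With the sign corrections folded in, the whole calculation for the expected value of Y2 reads as follows (a cleaned-up sketch of the steps above):

```latex
\begin{align*}
E(Y_2) &= \int_{0}^{1} \int_{y_2 - 1}^{1 - y_2} y_2 \, dy_1 \, dy_2
= \int_{0}^{1} y_2 \big[ (1 - y_2) - (y_2 - 1) \big] \, dy_2 \\
&= \int_{0}^{1} \big( 2 y_2 - 2 y_2^2 \big) \, dy_2
= \left[ y_2^2 - \tfrac{2}{3} y_2^3 \right]_{0}^{1} = 1 - \tfrac{2}{3} = \tfrac{1}{3}
\end{align*}
```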

I should have boxed in my answer for expected value of Y1, because that was my first answer up there.1434

I had a couple of hitches along the way there, but I think it worked out okay.1441

I still need to find the expected value of Y1 Y2.1446

Again, that is that same double integral Y2 = 0 to Y2 = 1 and Y1 = Y2 -1 to Y1 = 1 - Y2.1452

Where is my function, it is Y1 × Y2 × the density function which is just 1 of DY1 DY2.1470

If that is getting a little cut off there, then I will just say that that last symbol was DY2.1483

In case, you have trouble reading that.1489

This is not as bad as it seems, because the function that I’m integrating Y1 Y2, 1493

it is going to be positive on the right hand triangle and negative on the left hand triangle.1500

They are exactly evenly balanced, since everything is symmetric on this triangle.1506

That means that whole integral will come out to 0, by symmetry.1521

That is because, I had a function that is positive on the right hand part and negative on the left part.1528

If that feels a little suspicious to you, go ahead and do the integral.1534

It will be a little messy, it is not likely the most fun integral in the world, but it is possible.1538

It is just tedious algebra, there is really nothing too dangerous in there.1542

You can do the integral and you should get 0, at the end.1547

It should agree with what I got by symmetry.1550
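Done in this order, the integral is actually short, because the inner integral of Y1 alone already vanishes (a worked sketch using the same limits):

```latex
\begin{align*}
E(Y_1 Y_2) &= \int_{0}^{1} y_2 \left( \int_{y_2 - 1}^{1 - y_2} y_1 \, dy_1 \right) dy_2
= \int_{0}^{1} y_2 \cdot \frac{(1 - y_2)^2 - (y_2 - 1)^2}{2} \, dy_2 \\
&= \int_{0}^{1} y_2 \cdot 0 \, dy_2 = 0
\end{align*}
```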

That completes that problem, we will be using the same setup and the same values for example 2.1555

I want to make sure that you understand everything we did here,1565

because I’m going to take these answers and I’m going to use them for example 2.1568

Just to make sure that you are very comfortable with all this stuff,1574

before we move on and use these answers in example 2, let me recap the steps here.1577

I wanted to look at this triangle.1583

First of all, I graphed this triangle based on those 3 corner points that were given to me.1586

I set up this triangle and colored in the region here.1592

I know that I’m going to be doing a double integral, I was trying to describe the limits of this triangle.1597

If I list Y2 first then I can do it just by doing one double integral.1606

If I list Y1 first, then I have to chop this thing into two.1612

That is why I listed Y2 first going from 0 to 1.1616

And then, I had to describe Y1 in terms of these two lines.1619

The way I got those two lines was, I found the equations of these two diagonal lines on the sides of the triangles.1624

Then, I solve each one for Y1.1632

That is where those limits right there came from, in terms of Y1.1635

That let me set up my limits of integration on each of my three integrals.1639

Those sets of limits, I use the same sets of limits on each one of those integrals.1644

Each one of those integrals, I integrated something different.1655

They all had this 1 in them, and that 1 came from this one right here, the density functions.1658

That 1 manifested itself there, there, and there.1664

I had the 1, then each one I was integrating something different because 1669

I was trying to find the expected value of something different.1672

Expected value of Y1, integrate Y1.1676

Expected value of Y2, integrate Y2.1678

Expected value of Y1 Y2, integrate Y1 Y2.1682

I know I have set up three integrals and two of them, I notice that the function I’m integrating is symmetric,1687

in the sense that Y1 is going to be positive over here and negative over on this region.1694

They exactly balance each other out, so I know that I’m going to get 0, if I do that integral.1701

If you do not believe that, just do the integral, it would not be that hard.1707

It should work out to be 0.1710

Same thing over here, Y1 Y2 is positive in the first quadrant, negative in the second quadrant.1711

It is going to give me 0, by symmetry.1720

That is where I got these two 0’s from.1722

If you do not like it, just do the integrals and you can work them out yourself.1725

The expected value of Y2, Y2 is positive on both of those triangles.1729

I cannot use symmetry, I actually have to do the integral.1734

I integrated with respect to Y1, I got Y2 Y1.1738

Plugged in my limits, I got an integral in terms of Y2.1741

Integrated that, plug in my limits and got the expected value of Y2 to be 1/3.1746
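All three answers can also be sanity-checked numerically. The sketch below is not part of the lecture: it is a plain midpoint-rule approximation of the double integrals over the triangle, the function name `expected_value` and the step count are illustrative choices, and it uses only the Python standard library.

```python
def expected_value(g, steps=200):
    """Midpoint-rule estimate of E[g(Y1, Y2)] for density 1 on the triangle
    0 <= y2 <= 1, y2 - 1 <= y1 <= 1 - y2."""
    total = 0.0
    dy2 = 1.0 / steps
    for i in range(steps):
        y2 = (i + 0.5) * dy2
        lo, hi = y2 - 1.0, 1.0 - y2      # left and right diagonal lines
        dy1 = (hi - lo) / steps
        for j in range(steps):
            y1 = lo + (j + 0.5) * dy1
            total += g(y1, y2) * dy1 * dy2   # density is 1 on the region
    return total

e_y1 = expected_value(lambda y1, y2: y1)
e_y2 = expected_value(lambda y1, y2: y2)
e_y1y2 = expected_value(lambda y1, y2: y1 * y2)
total_mass = expected_value(lambda y1, y2: 1.0)
print(e_y1, e_y2, e_y1y2, total_mass)
```

The estimates for E(Y1) and E(Y1 Y2) should be 0 up to rounding error, the estimate for E(Y2) should be close to 1/3, and the total mass should be close to 1, which confirms that the constant 1 is a valid density on this triangle.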

Hang onto these 3 answers, we are using them again right away in example 2.1752

In example 2, we have the same setup from example 1.1758

Let me go ahead and draw out that setup.1762

It was this triangle with corners at (-1, 0), (0, 1), and (1, 0).1764

Let me draw that triangle, it is the same thing we had in example 1 and the same density function.1773

A joint density function is always equal to 1 over that region.1782

What we want to do here is, we want to calculate the covariance of Y1 and Y2.1789

After that, we are going to ask whether Y1 and Y2 are independent.1795

I’m going to use that formula that I gave you for covariance of Y1 and Y2.1799

Not the original definition, which I think is not very useful; it is often cumbersome if you use the original definition of covariance.1804

We have this great formula for covariance.1813

The covariance of Y1 and Y2 is equal to the expected value of Y1 × Y2 - the expected value of Y1 × the expected value of Y2.1815

You cannot necessarily separate the variables, it is not necessarily true that1831

the expected value of Y1 × Y2 is equal to the expected value of Y1 × the expected value of Y2.1836

The reason this formula is useful for us right now is,1845

we already figured out each one of these quantities in example 1.1848

The work here, it was quite a bit of work was done here in example 1.1856

If you did not just watch example 1, if example 1 is not totally fresh in your mind, just go back and watch it right now.1862

You will see where we work out all of these quantities individually.1869

The expected value of Y1 × Y2 was 0.1874

The expected value of Y1 was also 0.1878

The expected value of Y2 was 1/3.1881

That is what we did in example 1.1884

This all simplifies down to be 0 here.1886

That is our answer for the covariance, the covariance is 0.1890
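Plugging the three values from example 1 into the formula, the arithmetic is:

```latex
\[
\operatorname{Cov}(Y_1, Y_2) = E(Y_1 Y_2) - E(Y_1)\,E(Y_2) = 0 - 0 \cdot \tfrac{1}{3} = 0
\]
```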

Are Y1 and Y2 independent? This is a very subtle issue here because you see1896

the covariance being 0 and you might think of independence, but that is the converse of our theorem. 1902

Our theorem says that, if they are independent then the covariance is 0.1909

It does not work the other way around.1915

We cannot say necessarily that they are independent yet.1918

In fact, I want to remind you of a theorem that we had back in another lecture on independent variables.1921

From the lecture on independent variables, that was several lectures ago.1934

You can go back and check this out, if you do not remember it.1942

We have a theorem that the region must be a rectangle.1945

And then, there was another condition that the joint density function must factor into two functions.1964

One just of Y1 and one just of Y2.1975

If both of those conditions were satisfied then the variables are independent, that was if and only if.1981

In this case, we do not have a rectangle.1989

This is not a rectangle, it is a triangle.1993

That theorem tells us that Y1 and Y2 are not independent.2004

That should kind of agree with your intuition, if you look at the region, if I tell you for example that Y1 is 0.2018

Let me put some variables on here.2029

This is Y1 on the horizontal axis and this is Y2 on the vertical axis.2031

If I tell you that Y1 is 0 that means we are on this vertical axis, then Y2 could be anything from 0 to 1.2036

If I tell you that Y1 is ½ then Y2 cannot be anything from 0 to 1, it could be only as big as ½.2053

By changing the values of Y1, I’m changing the possible range of Y2, 2064

which suggests that knowing something about Y1 would give me new information about Y2.2071

That is the intuition for dependence, for variables not being independent.2077

You should suspect, by looking at that region, that the variables are not independent.2086

This theorem from that old lecture confirms it.2092
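As a supplementary check that is not part of the lecture, the dependence can also be seen directly from probabilities. With density 1, probabilities are just areas inside the triangle. The events Y1 > 1/2 and Y2 > 1/2 each have positive probability, but they can never happen together, because Y1 > 1/2 forces Y2 ≤ 1 - Y1 < 1/2. If the variables were independent, the joint probability would have to equal the product.

```latex
\begin{align*}
P(Y_1 > \tfrac12) &= \int_{1/2}^{1} (1 - y_1) \, dy_1 = \tfrac18, \qquad
P(Y_2 > \tfrac12) = \int_{1/2}^{1} (2 - 2 y_2) \, dy_2 = \tfrac14 \\
P(Y_1 > \tfrac12,\; Y_2 > \tfrac12) &= 0 \;\ne\; \tfrac18 \cdot \tfrac14 = \tfrac{1}{32}
\end{align*}
```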

Just to recap here, we are asked to find the covariance.2097

I'm using the formula that I gave you for covariance.2101

I think it is on the first or second slide of this lecture, just scroll back and you will see it,2105

the definition and formulas for covariance.2111

It expands out into these three expected values.2113

We calculated all these in example 1.2117

I just grabbed the old values from example 1 and simplify down, just got 0,2119

which might make you think independent because we had this theorem that if they are independent, then the covariance is 0.2125

The converse of that theorem is not true.2133

This is kind of the classical example of that.2137

Variables can have a covariance equal to 0 and not be independent.2141

This is really what this example is showing, not be independent.2157

It is true that if they are independent then the covariance is 0, but they can have covariance 0 and in this case,2161

since the region is not a rectangle, they are not independent.2168

In example 3, we have independent random variables.2176

We have been given their means and their variances.2179

We are given a couple of linear functions: U1 is Y1 + 2Y2 and U2 is Y1 - Y2.2182

We want to calculate the variance of U1 and U2.2190

Let me show you how those work out.2195

The variance of U1: U1, by definition, is Y1 + 2Y2.2197

I'm going to use the theorem that we had on linear functions of random variables.2212

This was on the introductory slides, you can go back and just read back about that theorem, 2221

about linear functions of random variables.2229

What that told me is that the variance of a linear combination of random variables distributes out, 2232

but you square the coefficients: 1² × σ1², the variance of Y1.2238

Let me write that as the variance of Y1, 1² × the variance of Y1 + 2² × the variance of Y2.2247

And then, there is this other term which is kind of obnoxious but we have to deal with it.2257

It is 2 × the coefficients 1 × 2 × the covariance of Y1 with Y2.2261

That is by the theorem from the beginning of this.2281

I think the title is linear functions of random variables.2290

That was the theorem that we had.2296

Let me plug in, 1² is just 1, the variance of Y1 we are given is 4, + 2² is 4,2299

× the variance of Y2, which was given to be 9, + 4 ×2306

the covariance of Y1 with Y2; what we are given is that Y1 and Y2 are independent.2313

If they are independent, then the covariance is 0.2320

Remember, in example 2, we learned the converse is not true.2324

But, it is true that if they are independent then the covariance is 0.2326

That right there is by independence and by the theorem that we learned earlier on in this lecture.2331

We just simplify, 4 + 4 × 9 is 4 + 36 which is 40.2341

That is the variance of U1. For the variance of U2: by definition, U2 is Y1 - Y2.2349

I will just do the same thing, I’m going to expand it out using that theorem.2360

We got 1Y1, 1² × the variance of Y1 + (-1)² × the variance of Y2, I’m using that theorem, 2365

you have to square the coefficients.2375

+ 2 × 1 × (-1), those are the coefficients, × the covariance of Y1 with Y2.2378

We just get 1 × the variance of Y1, which we are given is 4, plus, because (-1)² gives us a positive, the variance of Y2, which is 9.2390

Again, the covariance of Y1 Y2, by independence is 0.2403

This just simplifies down to 13, which is the variance of U2, and we are done.2408
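
As a quick check on those two variances, here is a minimal Python sketch of the theorem on linear functions of random variables. The function name `variance_of_linear_combination` and the dictionary layout for the covariances are illustrative assumptions, not something taken from the lecture slides. Any covariance that is not supplied is treated as 0, which matches the independence assumption in this example.

```python
def variance_of_linear_combination(coeffs, variances, covariances=None):
    """Var(a1*Y1 + ... + an*Yn)
       = sum of a_i^2 * Var(Y_i) + 2 * sum over i < j of a_i * a_j * Cov(Y_i, Y_j).

    covariances is a hypothetical dict mapping index pairs (i, j) with i < j
    to Cov(Y_i, Y_j); missing pairs are taken to be 0 (e.g. independent variables).
    """
    covariances = covariances or {}
    n = len(coeffs)
    # Squared coefficients times the individual variances.
    total = sum(a * a * v for a, v in zip(coeffs, variances))
    # Cross terms: 2 * a_i * a_j * Cov(Y_i, Y_j) for each pair i < j.
    for i in range(n):
        for j in range(i + 1, n):
            total += 2 * coeffs[i] * coeffs[j] * covariances.get((i, j), 0)
    return total

var_y = [4, 9]  # Var(Y1) = 4, Var(Y2) = 9, as given in the example
var_u1 = variance_of_linear_combination([1, 2], var_y)   # U1 = Y1 + 2Y2
var_u2 = variance_of_linear_combination([1, -1], var_y)  # U2 = Y1 - Y2
print(var_u1, var_u2)  # 40 13
```

Note that the coefficient -1 in U2 is squared, so the variances add even though the variables are subtracted.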

To recap the steps there, the variance of U1, expand that out.2416

U1 was defined to be Y1 + 2Y2.2420

We had this theorem on linear functions of random variables.2425

Go back and check it out, it told us how to find the variance of a linear combination.2428

You kind of expand out the variances, you have to square the coefficients.2435

That 1² comes from the sort of a hidden 1 right there, and that 2² comes from there.2439

This 2 comes from the theorem.2448

This 1 and this 2 come from the coefficients, there and there.2451

This covariance comes from the theorem as well.2456

We are given that Y1 and Y2 are independent, independence tells us that their covariance is equal to 0.2464

That was a theorem that we also had earlier on in this lecture.2471

The variance of Y1 is 4, that came from here.2476

The variance of Y2 is 9, that came from the stem of the problem here.2481

We drop those in, simplify down, and we get 4 + 4 × 9 is 40.2486

The variance of U2 works exactly the same way, except that the coefficients now,2493

instead of being 1 and 2, are 1 and -1.2498

The covariance drops out because they are independent.2501

We get 4 and 9, remember the -1 squares because you always square the coefficients with variance.2505

That makes a positive and 4 + 9 is 13.2512

By the way, we are going to use these values again, in example 4.2517

Make sure you understand these values very well, before you move on to example 4.2525

We will need them again.2531

Example 4, we are going to carry on with the same set of data that we had in example 3.2534

Y1 Y2 are independent variables, we have been given their means and their variances.2540

We are going to let U1 and U2 be these linear combinations.2547

I want to find the covariance of U1 and U2.2551

And then, I'm also going to find this ρ here, is the correlation coefficient.2557

We learned about that at the beginning of this lecture.2561

Let me expand out the covariance of U1 and U2.2565

The covariance of U1 and U2 is the covariance of Y1 + 2Y2 and Y1 - Y2, that is just the definition of U1 and U2.2569

And then, there was a theorem that we had back at the beginning of this lecture 2584

on how you can expand covariances of linear combinations of random variables.2590

It says, it expands out linearly so it kind of expands out.2598

You can almost think of FOIL: first, outer, inner, last.2604

It is the covariance of Y1 with Y1 - the covariance of Y1 with Y2 + 2 × the covariance of Y2 with Y1.2608

I’m just expanding it out, first, outer, inner, and last, -2 × the covariance of Y2 with Y2.2629

It is just expanding out each term with each term.2640

We are going to use a couple other facts here.2645

Remember that the covariance of Y1 with itself is just the variance of Y1.2647

That was something we learned on the very first slide of this lecture, 2653

it was the one where we introduced covariance.2660

I gave you, first the definition of covariance and a couple useful formulas.2663

One useful formula was that the covariance of a variable with itself is just the variance.2667

We also have a theorem that said, if Y1 and Y2 are independent, and they are given to be independent here,2672

then the covariance is 0.2679

This is 0 by independence.2681

This covariance is also 0, by independence.2689

Covariance of Y2 with itself is the variance of Y2, that same formula that we had on the first slide.2693

This is the variance of Y1 -2 × the variance of Y2.2703

Where is the variance of Y1, there it is, it is 4 -2 × the variance of Y2 is 9.2708

That is 4 -18 which is -14, that is our covariance.2717
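
The FOIL-style expansion above can be sketched in a few lines of Python. This is only an illustration: the function name `covariance_of_linear_combinations` and the nested-list covariance table are assumptions made for this sketch. The table has the variances 4 and 9 on the diagonal (covariance of a variable with itself) and 0 off the diagonal (independence).

```python
def covariance_of_linear_combinations(a, b, cov):
    """Cov(sum_i a_i*Y_i, sum_j b_j*Y_j) = sum over all i, j of a_i * b_j * Cov(Y_i, Y_j).

    cov[i][j] holds Cov(Y_i, Y_j); cov[i][i] is Var(Y_i).
    """
    return sum(
        a[i] * b[j] * cov[i][j]
        for i in range(len(a))
        for j in range(len(b))
    )

# Diagonal: Var(Y1) = 4, Var(Y2) = 9. Off-diagonal: 0 by independence.
cov_y = [[4, 0],
         [0, 9]]

# U1 = Y1 + 2Y2 has coefficients [1, 2]; U2 = Y1 - Y2 has coefficients [1, -1].
cov_u1_u2 = covariance_of_linear_combinations([1, 2], [1, -1], cov_y)
print(cov_u1_u2)  # -14
```

The four products in the double sum are exactly the first, outer, inner, and last terms: 1·1·4, 1·(-1)·0, 2·1·0, and 2·(-1)·9.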

We also have to find the correlation coefficient.2726

Let me remind you that the correlation coefficient, how that is defined.2729

The correlation coefficient, ρ of U1 U2, by definition is the covariance of U1 with U22734

divided by the product of the standard deviation of U1 and the standard deviation of U2.2750

I’m going to use σ for those: the σ of U1 × the σ of U2.2758

That σ is the σ for U1, it is not the σ for Y1.2764

It is not the σ 1 right here, that σ U2 is not the σ 2 there.2769

That was a mistake that I accidentally made, when I was making a rough draft of these notes.2776

Do not make the same mistake I did.2781

The covariance of U1 and U2 is -14, we just figured that out.2785

The standard deviation of U1, I’m going to invoke what I figured out on the previous example.2790

The standard deviation of U1 is just the square root of the variance of U1, and the same for U2, 2801

the square root of the variance.2810

The standard deviation is always the square root of the variance, that is the definition.2811

I figured out the variances of U1 and U2 in the previous example, in example 3.2817

I’m trying to look up those values right there.2826

We worked this out in example 3.2831

We have -14 in the numerator.2839

In example 3, we figured out that the variance of U1 was 40, so we have √ 40.2841

The variance of U2 was 13, that is what we figured out in example 3.2848

If you did not just watch example 3 and you think those numbers appeared magically,2856

go back and watch example 3, and you will see where they came from.2860

This does not get much better: this is -14 over √ 40, where we can pull out a 4 and make that 2 √ 10,2864

× √ 13 in the denominator, and that turns into, if we cancel the 2, -7/√(10 × 13), which is -7/√130.2874

Not a particularly nice number, I could not figure out a way to rig up these numbers to behave nicely.2886

I did throw this into a calculator and what did I get there, when I plugged it into a calculator.2892

This was -0.614, actually that was an approximation, and it is negative.2902

That is what I got when I plug that number into a calculator, nothing very revealing there.2911
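
For anyone who wants to reproduce that calculator step, here is a small Python sketch. It simply plugs in the covariance from this example and the two variances from example 3; the variable names are illustrative choices.

```python
import math

cov_u1_u2 = -14          # covariance of U1 and U2, from this example
var_u1, var_u2 = 40, 13  # variances of U1 and U2, from example 3 (not the variances of Y1 and Y2)

# rho = Cov(U1, U2) / (sigma_U1 * sigma_U2), where each sigma is the square root of the variance.
rho = cov_u1_u2 / (math.sqrt(var_u1) * math.sqrt(var_u2))

print(round(rho, 3))  # -0.614

# The simplified form -7 / sqrt(130) gives the same value.
print(math.isclose(rho, -7 / math.sqrt(130)))  # True
```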

Let me mention that one thing we knew about the correlation coefficient, 2917

I gave you this way back on the third slide of this lecture.2923

When I talked about the correlation coefficient, the whole point of the correlation coefficient is that it is scale independent.2929

It is always between -1 and 1, and this is between -1 and 1 because it is -0.6.2940

That is slightly reassuring, if it had been outside of that range then2949

I would have known that I have made a mistake, somewhere through here.2955

The fact that I got a number in between -1 and 1 does not guarantee that I am right, 2958

but it makes me feel a little more confident in my work here.2962

We got answers for both of those, let me show you how I got those.2966

The covariance of U1 and U2, I expanded out the definition of U1 and U2 which is Y1 + 2Y2 and Y1 -Y2.2970

We had this theorem on one of the introductory slides to this lecture which said, 2982

how you can expand out covariances of linear combinations.2987

It is very well behaved, it just expands out the same way you would multiply together binomials.2991

You can think of foiling things together.2996

We did the first Y1 and Y1, the outer Y1 and Y2, subtracted because of the coefficient there.2999

Let me write out FOIL here, if you remember your high school algebra: first, outer, inner, last.3008

The inner term is 2 × the covariance of Y2 with Y1, and the last term is -2 × the covariance of Y2 with Y2.3013

For these mixed terms, the Y1 and Y2, remember we were given that the variables are independent.3023

Independent variables have covariance 0.3030

That is not true in the other direction, just because they have covariance 0, does not mean they are independent.3033

If they are independent, then the covariance is definitely 0, which is great, those two terms drop out.3039
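
To make the warning about the converse concrete, here is a small Python sketch of a standard textbook counterexample. It is not the example used earlier in this lecture; the distribution below is an assumption chosen for illustration. Take Y1 equally likely to be -1, 0, or 1, and let Y2 = Y1². The covariance comes out to 0, yet the variables are clearly dependent, since Y2 is completely determined by Y1.

```python
from fractions import Fraction

# Hypothetical joint distribution: Y1 is uniform on {-1, 0, 1} and Y2 = Y1**2.
outcomes = [(-1, 1), (0, 0), (1, 1)]  # (y1, y2) pairs, each with probability 1/3
p = Fraction(1, 3)

e_y1 = sum(p * y1 for y1, _ in outcomes)
e_y2 = sum(p * y2 for _, y2 in outcomes)
e_y1y2 = sum(p * y1 * y2 for y1, y2 in outcomes)

# Covariance formula: Cov(Y1, Y2) = E(Y1 Y2) - E(Y1) E(Y2).
cov = e_y1y2 - e_y1 * e_y2
print(cov)  # 0

# Independence would require P(Y1 = 0, Y2 = 0) = P(Y1 = 0) * P(Y2 = 0).
p_joint = sum(p for y1, y2 in outcomes if y1 == 0 and y2 == 0)
p_y1_zero = sum(p for y1, _ in outcomes if y1 == 0)
p_y2_zero = sum(p for _, y2 in outcomes if y2 == 0)
print(p_joint == p_y1_zero * p_y2_zero)  # False
```

Here the joint probability is 1/3 while the product of the marginals is 1/9, so the variables are not independent even though their covariance is 0.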

The other thing we learned by an early formula, when I first taught you about covariance3045

is that the covariance of Y1 with itself is just the variance of Y1 and same thing for Y2.3052

I can just drop in my value for the variance, there it is right there.3061

The variance of Y1 is 4, the variance of Y2 is 9, and I drop those in and it simplifies down to -14.3064

The correlation coefficient ρ, by definition, is the covariance of those two variables divided by the product of their standard deviations.3074

I just figured out the covariance, that is the -14.3083

The standard deviations are always the square root of their variances, that is the definition of standard deviation.3086

Take their variance and take the square root.3093

The variance of U1 and U2, that is what I calculated back in example 3.3096

Just go back and watch example 3, you will see where these numbers 40 and 13 are coming from.3103

It is not these numbers right here, the σ 1² and σ 2² because those were the variances for Y1 and Y2.3109

Here, we want the variances of U1 and U2.3118

Once I got those numbers in there, I reduced the square roots a little bit, but it did not end up being a very nice number.3122

The reassuring thing was that when I found the decimal, it was between -1 and 1, 3129

which is the range that a correlation coefficient should always land in.3135

Last example here, we got independent variables but they all have the same mean and the same variance.3141

We want to find the mean and the variance of the average of those variables.3148

The average just means, you add them up and divide by the number of variables you have.3152

It is 1/N × Y1 + up to 1/N × YN.3158

I wrote it that way to really suggest that, that is a linear combination of the original random variables.3162

This is a linear function and we can use our theorem on how you calculate means and variances of linear combinations.3169

That is what I'm going to use.3183

The expected value is the same as the mean.3185

Remember, we want the mean and variance of the average, Y bar.3188

The expected value of Y bar is just the expected value of 1/N Y1 up to 1/N YN.3195

I can distribute by linearity of expectation, that was a theorem that we had.3208

I can distribute and pull out those coefficients 1/N × E of Y1 up to 1/N × E of YN.3217

That is 1/N × μ up to 1/N × μ.3231

They all have the same mean μ.3236

If you add up N copies of 1/N × μ, you just get a single copy of μ.3239

That is my expected value of the average.3249

The variance of the average, variance of Y bar, again is the variance of 1/N Y1 + up to 1/N YN.3252

Variance is not linear, expectation is linear.3265

There is a nastier theorem that tells you what to do with variance.3270

I gave you that theorem in one of the introductory slides.3277

It was titled linear functions of random variables.3280

The way this works is, you pull out the coefficients but you square them.3284

1/N² × the variance of Y1 up to 1/N² × the variance of YN.3290

There are also the cross terms, which are 2 × the sum, over i bigger than j, 3300

of the coefficients 1/N × 1/N × the covariance of Yi with Yj.3310

That looks pretty dangerous there but, let us remember that we are given that we have independent variables.3321

For any Yi and Yj, if you take their covariance, since they are independent, this covariance will be 0, by independence.3331

That is really nice, that means I can just focus on the first few terms here, 1/N²3342

the variance of Y1 is σ² + up to 1/N² × σ².3350

Let me write that a little more clearly, that is 1/N² in the denominator there.3360

What I have is N terms here of 1/N² × σ².3368

That simplifies down to σ²/N.3374

By the way, this is a very fundamental result in statistics.3379

This is something that you use very often, as you get into statistics.3385

The variance of the mean is equal to σ² divided by N.3390
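
Here is a minimal Python sketch of that result, under stated assumptions. The specific values μ = 10, σ² = 4, and N = 25 are hypothetical numbers picked for illustration; the example itself works with general μ, σ², and N. The first part applies the linear-combination rules exactly, and the second part is a rough simulation that assumes normally distributed variables, which the example does not require.

```python
import random
from fractions import Fraction

def mean_and_variance_of_average(mu, sigma2, n):
    """Mean and variance of Y-bar = (1/n)Y1 + ... + (1/n)Yn for independent Yi,
    each with mean mu and variance sigma2 (all covariance cross terms are 0)."""
    w = Fraction(1, n)                                # every coefficient is 1/n
    mean = sum(w * mu for _ in range(n))              # linearity of expectation
    variance = sum(w * w * sigma2 for _ in range(n))  # coefficients get squared
    return mean, variance

# Hypothetical values for illustration only.
mean, variance = mean_and_variance_of_average(mu=Fraction(10), sigma2=Fraction(4), n=25)
print(mean, variance)  # 10 4/25

# Rough simulation check: the averages of many samples of size 25 should have
# variance near sigma^2 / N = 4 / 25 = 0.16.
random.seed(0)
averages = [sum(random.gauss(10, 2) for _ in range(25)) / 25 for _ in range(20000)]
grand_mean = sum(averages) / len(averages)
sim_var = sum((a - grand_mean) ** 2 for a in averages) / len(averages)
print(abs(sim_var - 0.16) < 0.02)
```

The exact part returns σ²/N as a fraction, and the simulated variance of the averages should land close to the same value.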

This is where it comes from, this is where the magic starts to happen is right here with this example.3395

Let me make sure that you understand every step here.3402

We want to find the expected value of Y bar.3405

Remember that Y bar, the average, can be written as a linear combination of these variables.3408

Expectation is linear, that is what we learned in that theorem.3415

You can just separate it out into the expected values of Y1 up to YN, then just pull out the constants.3419

And then, each one of those E of Yi is μ because we are given that in the problem right there.3426

We are adding up N copies of 1/N × μ.3437

At the end, we just get a whole μ.3441

The variance is a little bit messier, the variance of a linear combination.3443

Again, you can split it up into all the separate variances, but when you pull out the coefficients, they get squared.3450

That is why we get 1/N² on each of these coefficients.3457

There is this cross term, 2 × the sum of the coefficients.3462

That is a little messy there but that is 1/N × 1/N.3467

That is coming from these coefficients right here, × the covariance of Yi with Yj.3471

The fortunate thing is that, we have a theorem that says when two variables are independent, their covariance is 0.3480

The converse of that is not true, the covariance could be 0 without independence.3488

But if they are independent, their covariance is definitely 0.3493

We are given that they are independent here.3497

All those cross terms drop out and we are just left with N copies of 1/N² × σ².3500

We are given that the variance of each individual variable is σ².3507

N × 1/N² is just 1/N, we still have that σ².3512

The variance of the mean is σ²/N.3518

Essentially, this means if you take the average of many things, 3522

it is not going to vary as much as individual members of the population will,3525

because the variance shrinks down, as you take a larger and larger sample.3531

That is a very classic result in statistics, now you know where it comes from.3537

Now, you know where this classic formula comes from.3542

That wraps up our lecture, kind of a big one today, on correlation and covariance, and linear functions of random variables.3546

This is all part of the chapter on Bivariate distribution functions and Bivariate density functions.3554

In fact, this wraps up our chapter on Bivariate density functions and distribution functions.3562

We will come back later and talk about distributions of random variables.3568

We still have one more chapter to go.3573

In the meantime, it is nice to finish our chapter on Bivariate density and distribution functions.3575

This is all part of the probability lecture series here on

I'm your host and guide, my name is Will Murray, thank you for joining me today, bye now.3586