WEBVTT mathematics/linear-algebra/hovasapian
00:00:00.000 --> 00:00:04.000
Welcome back to Educator.com, welcome back to linear algebra.
00:00:04.000 --> 00:00:26.000
In our previous lesson, we discussed Eigenvalues, Eigenvectors, and we talked about that diagonalization process where once we find the specific Eigenvalues from the characteristic polynomial that we get from the determinant, setting it equal to 0, once we find the Eigenvalues we put those Eigenvalues back into the arrangement of the matrix.
00:00:26.000 --> 00:00:42.000
Then we solve that matrix in order to find the particular Eigenvectors for that Eigenvalue, and the space that is spanned by the Eigenvectors happens to be called an Eigenspace.
00:00:42.000 --> 00:00:48.000
In the previous lessons, we dealt with some random matrices... they were not particularly special in any sense.
00:00:48.000 --> 00:00:57.000
Today, we are going to tighten up just a little bit, we are going to continue to talk about Eigenvalues and Eigenvectors, but we are going to talk about the diagonalization of symmetric matrices.
00:00:57.000 --> 00:01:05.000
As it turns out, symmetric matrices turn up all over the place in science and mathematics, so, let us jump in.
00:01:05.000 --> 00:01:14.000
We will start with a - you know - recollection of what it is that symmetric matrices are. Then we will start with our definitions and theorems and continue on like we always do.
00:01:14.000 --> 00:01:29.000
Let us see here. Okay. Let us try a blue ink today. So, recall that a matrix is symmetric if a = a transpose.
00:01:29.000 --> 00:01:48.000
So, a symmetric matrix... is when a is equal to a transpose, or when a transpose is equal to a.
00:01:48.000 --> 00:01:58.000
So, it essentially means that everything that is on the off diagonals is reflected along the main diagonal as if that is a mirror.
00:01:58.000 --> 00:02:25.000
Just a quick little example, something like (1,2), (2,3)... so let us say this is matrix a. If I were to transpose it, which means flip it across its main diagonal, well, (1,2), (2,3)... this is equal to a transpose... it is the same thing. (1,2), (2,3) both times, so this is a symmetric matrix.
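As a small illustration of that definition, here is a minimal Python sketch that tests whether a matrix equals its own transpose. The helper names `transpose` and `is_symmetric`, and the two sample matrices, are made up for this illustration and are not from the lesson.

```python
def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def is_symmetric(m):
    """A matrix is symmetric when it equals its own transpose."""
    return m == transpose(m)


# Two hypothetical 2 by 2 matrices, chosen only for illustration.
mirror = [[2, 7],
          [7, 5]]      # off-diagonal entries match across the diagonal
lopsided = [[2, 7],
            [1, 5]]    # off-diagonal entries differ

print(is_symmetric(mirror))    # True
print(is_symmetric(lopsided))  # False
```

For a 2 by 2 matrix the check comes down to the two off-diagonal entries being equal.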
00:02:25.000 --> 00:02:44.000
Okay. Now, we will start off with a very, very interesting theorem. So, you recall, you know, you can take this matrix, we can set up that equation and where we took the Eigenvalue equation where you have λs and the characteristic polynomial, and we solve the polynomial for its roots.
00:02:44.000 --> 00:03:15.000
The real roots of that equation are going to be the Eigenvalues of this particular matrix. Well, as it turns out, all the roots of what we call f(λ), which is the characteristic polynomial of a symmetric matrix a, are real numbers.
00:03:15.000 --> 00:03:25.000
So, as it turns out. If our matrix happens to be symmetric, we know automatically from this theorem that all of the roots are going to be real.
00:03:25.000 --> 00:03:42.000
So, every root is going to give us a real Eigenvalue. Now, we will throw out another theorem, which will help us.
00:03:42.000 --> 00:04:10.000
If a is a symmetric matrix, then Eigenvectors belonging to distinct Eigenvalues, and I say distinct because you know sometimes Eigenvalues can repeat...
00:04:10.000 --> 00:04:23.000
...those Eigenvectors are orthogonal. That is interesting... and orthogonal, as you remember, means the dot product is equal to 0, or perpendicular.
00:04:23.000 --> 00:04:30.000
Okay. Once again, if a is a symmetric matrix, then the Eigenvectors belonging to distinct Eigenvalues are orthogonal.
00:04:30.000 --> 00:04:48.000
Let us say we have a particular matrix, a 2 by 2 and let us say the Eigenvalues that I get are 3 and -4. Well, when I calculate the Eigenvectors for 3 and -4, as it turns out, those vectors that I get will be orthogonal. Their dot product will always equal 0.
00:04:48.000 --> 00:05:10.000
So, let us do a quick example of this. We will let a equal (1,0,0), (0,1,1), (0,1,1), and if you take a quick look at it, you will realize that this is a symmetric matrix. Look along the main diagonal.
00:05:10.000 --> 00:05:19.000
If I flip it along the main diagonal, as if that is a mirror, (0,0), (0,0), (1,1).
00:05:19.000 --> 00:05:30.000
When I subject this to mathematical software, again, when you first are dealing with Eigenvectors, Eigenvalues I imagine your professor or teacher is going to have you work by hand, simply to get you used to working with the equation.
00:05:30.000 --> 00:05:44.000
Just to give you an idea of what it is that you are working with, some mathematical object. But, once you are reasonably familiar, you are going to be using mathematical software to extract these Eigenvalues and Eigenvectors, because sometimes the process just takes too long otherwise.
00:05:44.000 --> 00:05:57.000
So, what we get is... well, λ1, -- let me start over here -- the first Eigenvalue is equal to 1, and that yields the Eigenvector (1,0,0).
00:05:57.000 --> 00:06:12.000
λ2, the second Eigenvalue is 0, 0 is a real value, and it yields the Eigenvector, -- tuh, Eigenvalue, Eigenvector, Eigenspace, yeah... I know -- Okay.
00:06:12.000 --> 00:06:25.000
That gives me the vector (0,-1,1)... λ3, the third Eigenvalue is 2 for this matrix, and it yields the Eigenvector (0,1,1).
00:06:25.000 --> 00:06:36.000
If you were to check the dot product of this and this, this and this, this and this, the mutual dot products, they all equal 0. So, as it turns out, this theorem is confirmed.
00:06:36.000 --> 00:06:45.000
The Eigenvectors corresponding to distinct Eigenvalues are mutually orthogonal. Okay.
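The check described above can be sketched in a few lines of Python. This sketch assumes a has rows (1,0,0), (0,1,1), (0,1,1), which is the symmetric matrix consistent with the Eigenvalues 1, 0, 2 and the Eigenvectors quoted in the example. The helper names `mat_vec` and `dot` are made up for this illustration.

```python
def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]


def dot(u, v):
    """Dot product of two vectors of the same length."""
    return sum(x * y for x, y in zip(u, v))


# Assumed matrix for the example (see the note above).
a = [[1, 0, 0],
     [0, 1, 1],
     [0, 1, 1]]

# (Eigenvalue, Eigenvector) pairs as stated in the lesson.
pairs = [(1, [1, 0, 0]), (0, [0, -1, 1]), (2, [0, 1, 1])]

# Each pair should satisfy a v = lambda v.
for lam, v in pairs:
    assert mat_vec(a, v) == [lam * x for x in v]

# The three mutual dot products: first-second, first-third, second-third.
vectors = [v for _, v in pairs]
dots = [dot(vectors[i], vectors[j])
        for i in range(3) for j in range(i + 1, 3)]
print(dots)  # [0, 0, 0]
```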
00:06:45.000 --> 00:07:15.000
Now, let us move onto another definition. Okay. A non-singular -- excuse me -- matrix a, remember non-singular means invertible, so it has an inverse... is called orthogonal if a inverse is equal to a transpose.
00:07:15.000 --> 00:07:32.000
We use the word orthogonal in 2 different ways. We apply it to two vectors when their dot product is 0, but in this case we call a matrix orthogonal if the inverse of the matrix happens to equal the transpose of the matrix.
00:07:32.000 --> 00:07:45.000
An equivalent statement to that... I will put equivalent, is a transpose a is equal to the identity matrix.
00:07:45.000 --> 00:08:01.000
Well, just look at what happens here. If I take a, this says a inverse is equal to a transpose. Well, if I multiply by the matrix a on both sides on the right, a transpose a, that is this one, a inverse a is just the identity matrix, so these are two equivalent statements.
00:08:01.000 --> 00:08:16.000
I personally prefer this definition right here. So, a non-singular matrix is called orthogonal, so it is an orthogonal matrix if the inverse and the transpose happen to be the same thing. That is a very, very special kind of matrix.
00:08:16.000 --> 00:08:44.000
So, let us do a quick example of this. If I take the Eigenvectors that I got from the example that I just did, so the Eigenvectors that I just got were (1,0,0), (0,-1,1), and (0,1,1), okay? These are for the respective Eigenvalues 1, 0, 2.
00:08:44.000 --> 00:08:56.000
First thing I am going to do, I am actually going to normalize these. So normalization, it just means taking them and dividing by the length of the vector. So, this vector actually... let me use red here for normalization.
00:08:56.000 --> 00:09:21.000
This one stays (1,0,0), so let me put... normalized... this one is just, well, (-1)² + 1² = 2, so the length of this vector is sqrt(2), and it becomes (0,-1/sqrt(2),1/sqrt(2)).
00:09:21.000 --> 00:09:38.000
This one is the same thing. We have (0,1/sqrt(2),1/sqrt(2)), now if I take these vectors and set them up as columns in a matrix, and this is just something random that I did. I happened to have these available, so let us call this p.
00:09:38.000 --> 00:09:53.000
p is equal to the matrix with columns (1,0,0), (0,-1/sqrt(2),1/sqrt(2)), (0,1/sqrt(2),1/sqrt(2)).
00:09:53.000 --> 00:10:02.000
This matrix p, if I were to calculate its inverse, and if I were to calculate its transpose, they are the same.
00:10:02.000 --> 00:10:10.000
p inverse equals p transpose. This is an orthogonal matrix.
00:10:10.000 --> 00:10:29.000
So again, we are using orthogonal in two different ways. They are related, but they are not the same thing. We call vectors mutually orthogonal if their dot products are 0, and we call a matrix orthogonal if the inverse and the transpose are the same thing.
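One way to confirm that p is orthogonal without computing an inverse is to use the equivalent statement p transpose × p = identity. Here is a minimal Python sketch of that check; the helper names `transpose`, `mat_mul`, and `is_orthogonal`, and the tolerance value, are assumptions made for this illustration.

```python
import math


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def mat_mul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]


def is_orthogonal(m, tol=1e-12):
    """Check m^T m = I up to a small floating-point tolerance."""
    product = mat_mul(transpose(m), m)
    n = len(m)
    return all(abs(product[i][j] - (1.0 if i == j else 0.0)) <= tol
               for i in range(n) for j in range(n))


s = 1 / math.sqrt(2)
# Columns are the normalized Eigenvectors (1,0,0), (0,-s,s), (0,s,s).
p = [[1, 0, 0],
     [0, -s, s],
     [0, s, s]]

print(is_orthogonal(p))                 # True
print(is_orthogonal([[1, 1], [0, 1]]))  # False: columns are not orthonormal
```

The tolerance is there because 1/sqrt(2) is not exactly representable in floating point, so the products only come out equal to 0 and 1 up to rounding.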
00:10:29.000 --> 00:10:40.000
Now, let us go back to blue ink here, and state another theorem.
00:10:40.000 --> 00:11:19.000
An n by n matrix is orthogonal if, and only if, the columns or rows, so I will put rows in parentheses, form an ortho-normal set of vectors in R^n.
00:11:19.000 --> 00:11:27.000
Okay. An n by n matrix is orthogonal if and only if the columns form an orthonormal set of vectors in R^n.
00:11:27.000 --> 00:11:52.000
So, if I have a matrix, and let us just take the columns... if the columns form an ortho-normal set, meaning that... column 1 is a vector, column 2 is a vector, column 3 is a vector... if the length of each of those three is 1, that is the normal part, and if they are mutually orthogonal, well, that is this thing that we did right here... we normalized these columns.
00:11:52.000 --> 00:12:03.000
So, by normalizing them, we made each length 1, and these are mutually orthogonal, so this is an orthogonal matrix, even if we did not know it already from finding the inverse and the transpose.
00:12:03.000 --> 00:12:17.000
If I just happen to look at this and realize that, whoa, these are all normalized and these are mutually orthogonal. Then, I can automatically say that this is an orthogonal matrix, and I would not have to calculate anything. That is what this theorem is used for.
00:12:17.000 --> 00:12:25.000
Okay, so now let us talk about a very, very, very important theorem. Certainly one of the top 5 in this entire course.
00:12:25.000 --> 00:12:31.000
It is quite an extraordinary theorem when you see the statement of it and when we talk about it a little bit. Let me do it in red here.
00:12:31.000 --> 00:13:54.000
So -- excuse me -- if a is a symmetric n by n matrix. Then there exists an orthogonal matrix p, such that p inverse × a × p is equal to some diagonal matrix d, a diagonal matrix, with the Eigenvalues of a along the main diagonal.
00:13:54.000 --> 00:14:11.000
Okay, so not only is a symmetric matrix always diagonalizable, but I can actually diagonalize it with a matrix that is orthogonal, where the columns and the rows are of length 1 and they are mutually orthogonal. Their dot products equal 0.
00:14:11.000 --> 00:14:26.000
That is really, really extraordinary, so let us state this again. If a is a symmetric n by n matrix, then there exists an orthogonal matrix p such that p inverse × a × p gives me some diagonal matrix.
00:14:26.000 --> 00:14:35.000
The entries along the main diagonal are precisely the Eigenvalues of a. That is what this equation tells me, that there is this relationship.
00:14:35.000 --> 00:15:00.000
If I have a matrix a, I can actually take the Eigenvalues of a, I will bring them along the main diagonal and I can find a matrix p, such that when I take p inverse, when I sandwich a between p inverse and p, I actually produce that diagonal by composing the multiplication of this matrix and this matrix and this matrix. That is extraordinary, absolutely extraordinary.
00:15:00.000 --> 00:15:05.000
So, let us see what happens when we are faced with an Eigenvalue which is repeated.
00:15:05.000 --> 00:15:20.000
Remember, sometimes you can have an Eigenvalue, your characteristic polynomial can have repeated roots... so that will be a, let us say you have a 3 by 3, and you have Eigenvalues (1,1,2), well the 1 has a multiplicity of 2, because it shows up twice.
00:15:20.000 --> 00:15:36.000
Okay, let us see how we deal with that. Let us go back to a blue ink here... oops.
00:15:36.000 --> 00:16:52.000
If we are faced with an Eigenvalue of multiplicity k, then we first find a basis for the null space associated with this Eigenvalue, in other words finding a basis for the Eigenspace, finding the Eigenvectors, that is all this means because that is what you are doing... you put the Eigenvalue back in that equation, you solve the homogeneous equation and you get a basis for the null space, which is the Eigenvectors associated with this Eigenvalue.
00:16:52.000 --> 00:17:29.000
We use the Gram Schmidt ortho-normalization process to create an orthonormal basis for that Eigenspace.
00:17:29.000 --> 00:17:48.000
So if I have an Eigenvalue which repeats itself, and once I find a basis for that Eigenspace, for that particular Eigenvalue, I can ortho-normalize and actually create vectors that are, well, orthonormal, and that will be my one set. Then I move on to my next Eigenvalue.
00:17:48.000 --> 00:18:00.000
If my matrix is symmetric, I am guaranteed that Eigenvectors from distinct Eigenvalues are already mutually orthogonal, so once each one is normalized, the whole set is orthonormal.
00:18:00.000 --> 00:18:07.000
Let us do a problem, and I think everything will fall into place very, very nicely.
00:18:07.000 --> 00:18:23.000
So, example... we will let a = (0,2,2), (2,0,2), and (2,2,0).
00:18:23.000 --> 00:18:34.000
Let us confirm that this is symmetric. Yes. 2, 2, 2, 2, 2, 2, absolutely. Main diagonal is the mirror. If you flip it you end up with the same thing.
00:18:34.000 --> 00:18:59.000
Okay. Let us do the characteristic polynomial. Let us actually do this one a little bit in detail. It equals the determinant of the matrix with rows (λ - 0, -2, -2), (-2, λ - 0, -2), (-2, -2, λ - 0)... λ's along the diagonal and negatives everywhere else.
00:18:59.000 --> 00:19:09.000
We want the determinant of this. When we take the determinant of this, we actually end up with the following in factored form... (λ + 2)² × (λ - 4).
00:19:09.000 --> 00:19:20.000
So, I have taken this polynomial and I have turned it into something factored. So, I get -- let me put it over here -- λ1 = -2.
00:19:20.000 --> 00:19:30.000
λ2 is also equal to -2, that is what this 2 here means. Okay. That means this Eigenvalue λ = -2 has a multiplicity of 2, it shows up twice.
00:19:30.000 --> 00:19:42.000
Of course, our third λ, third Eigenvalue is going to equal 4. So, now let us go ahead and solve this homogeneous system.
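The characteristic polynomial for this example can be double-checked numerically. The sketch below expands the 3 by 3 determinant of λ × identity minus a by cofactors and compares it with the factored form (λ + 2)² × (λ - 4), the repeated root being the one with multiplicity 2. The helper names `det3` and `char_poly` are made up for this illustration.

```python
def det3(m):
    """Determinant of a 3 by 3 matrix by cofactor expansion along row 1."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def char_poly(m, lam):
    """Evaluate det(lam * I - m) for a 3 by 3 matrix m."""
    shifted = [[(lam if r == c else 0) - m[r][c] for c in range(3)]
               for r in range(3)]
    return det3(shifted)


a = [[0, 2, 2],
     [2, 0, 2],
     [2, 2, 0]]

# Two cubics that agree at more than three points are the same polynomial.
for lam in range(-5, 6):
    assert char_poly(a, lam) == (lam + 2) ** 2 * (lam - 4)

print(char_poly(a, -2), char_poly(a, 4))  # 0 0
```

Integer inputs keep the arithmetic exact, so the comparison does not need a tolerance.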
00:19:42.000 --> 00:19:53.000
Well, I take -2, I stick it into here, and I solve the homogeneous system. So, I end up with the following.
00:19:53.000 --> 00:20:13.000
I end up -- let me actually write... let me do this... no, it is okay -- so for λ = -2, we get the following system... we get -2, -2, -2, 0.
00:20:13.000 --> 00:20:25.000
It is this thing, and then the 0's over here, -2, -2, -2, 0. -2, -2, -2, 0.
00:20:25.000 --> 00:20:39.000
Well, when we subject that to reduced row echelon form, we end up with 1, 1, 1, 0 in the first row, and 0's everywhere else.
00:20:39.000 --> 00:20:52.000
So, this column, this column... so, we get -- let me do it this way -- x3, let us set it equal to s, this does not have a leading entry, so it is a free parameter.
00:20:52.000 --> 00:21:00.000
x2 also does not have a leading entry. Remember this does not have to be in diagonal form, so this is the only one that has to be a leading entry.
00:21:00.000 --> 00:21:11.000
So, set that equal to r, and x1 is equal to, well, -r, -s.
00:21:11.000 --> 00:21:31.000
This is equivalent to the following... r × -1, 1, 0 + s × -1,0,1.
00:21:31.000 --> 00:21:45.000
Okay. So, these 2 vectors right here form a basis for our Eigenspace. They are our Eigenvectors for this, for these Eigenvalues.
00:21:45.000 --> 00:21:56.000
Well, what is the next step? We found the basis, so now we want to go ahead and we want to ortho-normalize them.
00:21:56.000 --> 00:22:03.000
We want to make them orthogonal, and then we want to normalize them so they are orthonormal. So, we go through the Gram Schmidt process.
00:22:03.000 --> 00:22:35.000
So, let me rewrite the vectors. I have (-1,1,0) -- so that we have them in front of us -- and (-1,0,1)... this is a basis for the Eigenspace associated with λ = -2. Okay.
00:22:35.000 --> 00:22:48.000
So, we know that our first v1, this is going to be the first vector... we can actually take this one. So, I am going to let v1 = -1, 1, 0.
00:22:48.000 --> 00:22:53.000
That is going to be our standard. We are going to orthogonalize everything with respect to that one.
00:22:53.000 --> 00:23:19.000
Well, v2 is equal to... this is u1, this is u2... v2 = u2 - ((u2 ⋅ v1) / (v1 ⋅ v1)) × v1.
00:23:19.000 --> 00:23:29.000
This is the definition of the ortho-normalization process, the Gram Schmidt process. You take the second vector, and you subtract... you work forward.
00:23:29.000 --> 00:23:36.000
I will not recall the entire formula here, but you can go back and take a look at it where we did a couple of examples of that orthogonalization.
00:23:36.000 --> 00:23:53.000
When you put all of these in, u2 is this one, v1 is this one, and you do the multiplication, u2 ⋅ v1 is 1 and v1 ⋅ v1 is 2, so you end up with the following... (-1/2, -1/2, 1).
00:23:53.000 --> 00:24:07.000
Okay. Now, you remember I do not need the fractions here because a vector in this direction is... well, it is in the same direction, so the length of these individual values does not really matter.
00:24:07.000 --> 00:24:33.000
So, I am just going to take (-1, -1, 2). Okay. So, now I have (-1, 1, 0)... and (-1, -1, 2). I am not keeping the fractions here, what I am doing is I am actually multiplying everything by 2.
00:24:33.000 --> 00:24:44.000
I can multiply a vector by anything nonzero because all it does is extend the vector or shorten the vector, it still lies along the same line, and it is the direction that I am interested in.
00:24:44.000 --> 00:24:53.000
So, when I multiply by 2, -1/2 becomes -1, -1/2 becomes -1, and the 1 ends up being 2 here. Okay.
00:24:53.000 --> 00:25:04.000
Now, these two are orthogonal. I want to normalize them.
00:25:04.000 --> 00:25:37.000
When I normalize them, I get the following -- nope, we are not going to have these random lines everywhere --... (-1/sqrt(2), 1/sqrt(2), 0)... and for the second one, 2 × 2 is 4, plus 1, plus 1, is 6, so the length is sqrt(6), and this is going to be... (-1/sqrt(6), -1/sqrt(6), 2/sqrt(6)).
00:25:37.000 --> 00:26:03.000
This is orthonormal. So, with respect to that Eigenvalue -2, we have created an orthonormal basis for its Eigenspace. So this is going to be one column, this is going to be a second column, now let us go ahead and do the next Eigenvalue -- where are we... here we are.
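The Gram Schmidt steps used for the λ = -2 Eigenspace can be sketched in Python as follows. This is a minimal illustration that assumes the input vectors are linearly independent. The function name `gram_schmidt` and the helper `dot` are made up here and are not from the lesson.

```python
import math


def dot(u, v):
    """Dot product of two vectors of the same length."""
    return sum(x * y for x, y in zip(u, v))


def gram_schmidt(basis):
    """Orthonormalize linearly independent vectors, working forward in order."""
    orthogonal = []
    for u in basis:
        v = list(u)
        # Subtract the projection of u onto each vector already built.
        for w in orthogonal:
            coeff = dot(u, w) / dot(w, w)
            v = [vi - coeff * wi for vi, wi in zip(v, w)]
        orthogonal.append(v)
    # Normalize: divide each vector by its length.
    return [[x / math.sqrt(dot(v, v)) for x in v] for v in orthogonal]


# Basis of the Eigenspace for lambda = -2 from the example.
q1, q2 = gram_schmidt([[-1, 1, 0], [-1, 0, 1]])
print(q1)  # approximately (-1/sqrt(2), 1/sqrt(2), 0)
print(q2)  # approximately (-1/sqrt(6), -1/sqrt(6), 2/sqrt(6))
```

Scaling the intermediate vector by 2 to clear the fractions, as done by hand above, is not needed in code because the normalization at the end gives the same unit vector either way.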
00:26:03.000 --> 00:26:27.000
Our other Eigenvalue was λ = 4, so for λ = 4, we put it back into that, remember, λ times the identity minus a equation... we end up with the following. We get 4, -2, -2, 0... -2, 4, -2, 0... -2, -2, 4, 0.
00:26:27.000 --> 00:26:43.000
When we subject this to reduced row echelon form we get 1, 0, -1, 0 in the first row. We get 0, 1, -1, 0 in the second row... and 0's everywhere else.
00:26:43.000 --> 00:26:56.000
Okay. That is a leading entry. That is a leading entry. The third column does not have a leading entry, so we can let that one be x3 = r. Any parameter.
00:26:56.000 --> 00:27:10.000
Well, that means x2 - r = 0, so x2 = r, as well... and here it is x1 - r = 0, so x1 also equals r.
00:27:10.000 --> 00:27:19.000
Therefore, this is equivalent to r × 1, 1, 1. Okay.
00:27:19.000 --> 00:27:33.000
So, this right here is an Eigenvector for λ = 4. It is one vector, it is a one dimensional Eigenspace. It spans the Eigenspace.
00:27:33.000 --> 00:28:00.000
Now, we want to normalize this. So, when we normalize this, the length is sqrt(3)... I will put normalize -- let me make some more room here, I am going to use up a lot of room for not a lot of... let me go this way -- normalize.
00:28:00.000 --> 00:28:28.000
We end up with 1/sqrt(3), 1/sqrt(3), and 1/sqrt(3). So, now, we are almost there. Our matrix p that we were looking for. It is going to be precisely the vectors that we found. This, and the other two normalized vectors which we just created.
00:28:28.000 --> 00:28:54.000
So, we get p, with columns (-1/sqrt(2), 1/sqrt(2), 0), (-1/sqrt(6), -1/sqrt(6), 2/sqrt(6)), and (1/sqrt(3), 1/sqrt(3), 1/sqrt(3)).
00:28:54.000 --> 00:29:18.000
This matrix with these three columns is... if I did my calculations... if I took the inverse of this matrix and if I multiplied by my original matrix, and then I multiplied by this matrix, I end up with this d, which is -2, -2, 4.
00:29:18.000 --> 00:29:30.000
The Eigenvalue's along the main diagonal, 0's everywhere else, and if you actually check this out, it will confirm that this is the case.
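If you want to check this out numerically, here is a minimal Python sketch. It uses p transpose in place of p inverse, which is valid only because p is orthogonal. The helper names `transpose` and `mat_mul` are made up for this illustration.

```python
import math


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def mat_mul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]


a = [[0, 2, 2],
     [2, 0, 2],
     [2, 2, 0]]

r2, r6, r3 = math.sqrt(2), math.sqrt(6), math.sqrt(3)
columns = [[-1 / r2, 1 / r2, 0],
           [-1 / r6, -1 / r6, 2 / r6],
           [1 / r3, 1 / r3, 1 / r3]]
p = transpose(columns)   # the orthonormal Eigenvectors become the columns of p

# p is orthogonal, so p inverse equals p transpose.
d = mat_mul(mat_mul(transpose(p), a), p)

for row in d:
    print([round(x, 9) for x in row])
```

Up to floating-point rounding, the printed matrix has -2, -2, 4 on the main diagonal and 0's everywhere else.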
00:29:30.000 --> 00:29:50.000
When I have a symmetric n by n matrix, I run through the process of diagonalization, but not only do I diagonalize it, I can orthogonally diagonalize it by using this orthogonal matrix p... which means its columns all have length 1 and they are mutually orthogonal to each other. Their dot products = 0.
00:29:50.000 --> 00:30:10.000
I multiply p inverse × a × p, I get my diagonal matrix which has the Eigenvalues along the main diagonal. Notice the repeats... -2, -2, 4, so I have an Eigenspace of 2 dimensions, I have an Eigenspace of 1 dimension, which matches perfectly because my original matrix was 3 by 3, acting on R^3.
00:30:10.000 --> 00:30:14.000
Thank you for joining us for the diagonalization of symmetric matrices, we will see you next time. Bye-bye.