For more information, please see full course syllabus of Multivariable Calculus

### The Chain Rule

- The composition F(C(t)) is formed by taking the components of C and substituting them for x and y.

[…]$^{2}$.

[…]$^{2} + y^{2}$.

- The composition F(C(t)) is formed by taking the components of C and substituting them for x and y.

[…]$^{2} + (\cos(t))^{2} = 1$.

Let C(t) = $\left(e^{t}, t, \frac{1}{t}\right)$. Find the composition F(C(t)) given F(x,y,z) = $2x^{2} - xyz + y^{2}$.

- The composition F(C(t)) is formed by taking the components of C and substituting them for x, y and z.

Since x = $e^{t}$, y = t and z = $\frac{1}{t}$, we have F(C(t)) = $2(e^{t})^{2} - (e^{t})(t)\left(\frac{1}{t}\right) + (t)^{2} = 2e^{2t} - e^{t} + t^{2}$.

[…]$^{2} + e^{3t} - 1$.

- Since our function is of one variable, we find F′(C(t)) by taking the derivative with respect to t.

[…]$^{3t}$.

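Evaluate F′(C(π)).

- First we find F(C(t)): let x = $\sqrt{t}$ and y = $-\sqrt{t}$.
- Then F(C(t)) = $\cos(\sqrt{t}(-\sqrt{t})) = \cos(-t) = \cos(t)$.
- Differentiating with respect to t gives F′(C(t)) = $-\sin(t)$, so F′(C(π)) = $-\sin(\pi) = 0$.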

Let F(x,y,z) = $x^{2}y + yz$ and C(t) = $(t, \cos(t), e^{t}\sin(t))$.

Evaluate F′(C(t))

- Since our functions are complicated, we use F′(C(t)) = ∇F(C(t)) · C′(t) to find the derivative.
- Computing ∇F(x,y,z) yields ∇F(x,y,z) = $(2xy, x^{2} + z, y)$, so ∇F(C(t)) = $(2t\cos(t), t^{2} + e^{t}\sin(t), \cos(t))$.
- Computing C′(t) yields C′(t) = $(1, -\sin(t), e^{t}\cos(t) + e^{t}\sin(t))$. Note that to compute $[e^{t}\sin(t)]'$ we apply the product rule.
- Thus F′(C(t)) = ∇F(C(t)) · C′(t) = $(2t\cos(t), t^{2} + e^{t}\sin(t), \cos(t)) \cdot (1, -\sin(t), e^{t}\cos(t) + e^{t}\sin(t)) = 2t\cos(t) - t^{2}\sin(t) - e^{t}\sin^{2}(t) + e^{t}\cos^{2}(t) + e^{t}\cos(t)\sin(t)$.

So F′(C(t)) = $2t\cos(t) - t^{2}\sin(t) + e^{t}\cos(t)\sin(t) + e^{t}(\cos^{2}(t) - \sin^{2}(t))$.
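As a sanity check on this kind of computation, the following minimal Python sketch compares the chain-rule value ∇F(C(t)) · C′(t) with a central-difference estimate of the derivative of the composite. It assumes F(x,y,z) = x²y + yz and C(t) = (t, cos(t), eᵗ sin(t)); the function names, the sample point `t0` and the step size `h` are illustrative choices, not part of the original exercise.

```python
import math

def F(x, y, z):
    # Assumed function: F(x, y, z) = x^2*y + y*z
    return x**2 * y + y * z

def C(t):
    # Assumed curve: C(t) = (t, cos t, e^t sin t)
    return (t, math.cos(t), math.exp(t) * math.sin(t))

def grad_F(x, y, z):
    # Partial derivatives of F with respect to x, y and z
    return (2 * x * y, x**2 + z, y)

def C_prime(t):
    # Componentwise derivative of C; the last entry uses the product rule
    return (1.0, -math.sin(t), math.exp(t) * (math.cos(t) + math.sin(t)))

def chain_rule_derivative(t):
    # Dot product of grad F evaluated at C(t) with C'(t)
    g = grad_F(*C(t))
    v = C_prime(t)
    return sum(gi * vi for gi, vi in zip(g, v))

def numeric_derivative(t, h=1e-6):
    # Central difference applied directly to the composite F(C(t))
    return (F(*C(t + h)) - F(*C(t - h))) / (2 * h)

t0 = 0.7  # arbitrary sample point
print(chain_rule_derivative(t0), numeric_derivative(t0))
```

The two printed values should agree to several decimal places; a sign slip in any partial derivative shows up as a visible mismatch.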

Let F(x,y) = $x^{2}\ln(2x + y)$ and C(t) = $(\sin(t), \cos(t))$.

Evaluate F′(C($\frac{\pi}{2}$))

- First we find F′(C(t)), using F′(C(t)) = ∇F(C(t)) · C′(t) to find the derivative.
- Computing ∇F(x,y) yields ∇F(x,y) = $\left(\frac{2x^{2}}{2x + y} + 2x\ln(2x + y), \frac{x^{2}}{2x + y}\right)$, so ∇F(C(t)) = $\left(\frac{2\sin^{2}(t)}{2\sin(t) + \cos(t)} + 2\sin(t)\ln(2\sin(t) + \cos(t)), \frac{\sin^{2}(t)}{2\sin(t) + \cos(t)}\right)$. Note that to compute the partial derivative of $x^{2}\ln(2x + y)$ with respect to x we use the product rule.
- Computing C′(t) yields C′(t) = $(\cos(t), -\sin(t))$.
- Thus F′(C(t)) = ∇F(C(t)) · C′(t) = $\left(\frac{2\sin^{2}(t)}{2\sin(t) + \cos(t)} + 2\sin(t)\ln(2\sin(t) + \cos(t)), \frac{\sin^{2}(t)}{2\sin(t) + \cos(t)}\right) \cdot (\cos(t), -\sin(t))$.
- We can simplify our scalar product by substituting t = $\frac{\pi}{2}$, so F′(C($\frac{\pi}{2}$)) = $\left(\frac{2\sin^{2}(\frac{\pi}{2})}{2\sin(\frac{\pi}{2}) + \cos(\frac{\pi}{2})} + 2\sin(\frac{\pi}{2})\ln(2\sin(\frac{\pi}{2}) + \cos(\frac{\pi}{2})), \frac{\sin^{2}(\frac{\pi}{2})}{2\sin(\frac{\pi}{2}) + \cos(\frac{\pi}{2})}\right) \cdot (\cos(\frac{\pi}{2}), -\sin(\frac{\pi}{2})) = \left(1 + 2\ln(2), \frac{1}{2}\right) \cdot (0, -1) = -\frac{1}{2}$.

[…] C(t) = $(\pi t^{2}, \sqrt{t}, t)$. Evaluate F′(C(t))

- We let x = $\pi t^{2}$, y = $\sqrt{t}$ and z = t and find F(C(t)).
- So F(C(t)) = $t\cos(\pi t^{2})\sin(\sqrt{t})$. To find F′(C(t)) we apply the product rule twice.
- So F′(C(t)) = $[t\cos(\pi t^{2})]\left(\cos(\sqrt{t})\frac{1}{2\sqrt{t}}\right) + [t(-\sin(\pi t^{2})(2\pi t)) + \cos(\pi t^{2})]\sin(\sqrt{t})$.

Hence F′(C(t)) = $\frac{\sqrt{t}\cos(\pi t^{2})\cos(\sqrt{t})}{2} - 2\pi t^{2}\sin(\pi t^{2})\sin(\sqrt{t}) + \cos(\pi t^{2})\sin(\sqrt{t})$.

Let F(x,y) = $\sqrt{x^{2} + y^{2}}$ and C(t) = $(e^{2t}, e^{-2t})$. Evaluate F′(C(0))

- We let x = $e^{2t}$, y = $e^{-2t}$ and find F(C(t)).
- So F(C(t)) = $\sqrt{(e^{2t})^{2} + (e^{-2t})^{2}} = \sqrt{e^{4t} + e^{-4t}}$. We can now compute F′(C(t)). Note that $\sqrt{e^{4t} + e^{-4t}} = (e^{4t} + e^{-4t})^{1/2}$.
- Thus F′(C(t)) = $\frac{1}{2}(e^{4t} + e^{-4t})^{-1/2}(4e^{4t} - 4e^{-4t})$.

Hence F′(C(0)) = $\frac{1}{2}(e^{4(0)} + e^{-4(0)})^{-1/2}(4e^{4(0)} - 4e^{-4(0)}) = 0$.

Let F(x,y) = $\sqrt{x^{2} + 1} + \sqrt{y^{2} - 1}$ and C(t) = (t,t). Evaluate F′(C(2))

- We let x = t, y = t and find F(C(t)).
- So F(C(t)) = $\sqrt{t^{2} + 1} + \sqrt{t^{2} - 1}$. We can now compute F′(C(t)). Note that $\sqrt{t^{2} + 1} + \sqrt{t^{2} - 1} = (t^{2} + 1)^{1/2} + (t^{2} - 1)^{1/2}$.
- Thus F′(C(t)) = $\frac{2t}{2}(t^{2} + 1)^{-1/2} + \frac{2t}{2}(t^{2} - 1)^{-1/2} = \frac{t}{\sqrt{t^{2} + 1}} + \frac{t}{\sqrt{t^{2} - 1}}$.

Hence F′(C(2)) = $\frac{2}{\sqrt{2^{2} + 1}} + \frac{2}{\sqrt{2^{2} - 1}} = \frac{2}{\sqrt{5}} + \frac{2}{\sqrt{3}}$.

### The Chain Rule

- Intro 0:00
- The Chain Rule 0:45
- Conceptual Example
- Example 1
- The Chain Rule
- Example 2: Part 1
- Example 2: Part 2 - Solving Directly

### Transcription: The Chain Rule

*Hello and welcome back to educator.com and multi-variable calculus.*0000

*Today we are going to talk about the chain rule, and you remember from single-variable calculus the chain rule just allowed you to differentiate functions that were composite functions.*0005

*Composite functions were something like sin(x^{3}): you took the derivative of the sine, then you took the derivative of what was inside the argument, the x^{3}, and that became 3x^{2}.*0013

*Now that we are dealing with these vector-valued functions, these functions of several variables, and we have introduced the gradient, we can actually bring those tools to bear on differentiating a composite function that involves functions of several variables.*0027

*So let us just jump right on in, and let me give a quick description of what it is that is going on.*0045

*Again, what we want to do is not just jump right into the mathematics, we do not just want to write symbols on a page, we want to be able to understand what is happening.*0052

*When there is understanding, when you can see what is going on, you can use the intuition that you have already developed to decide what goes next and where to go. That is the whole idea.*0060

*We want you to understand what is happening mathematically before you actually do the mathematics.*0071

*That is the easy part, the mathematics will come, that is just symbolic manipulation but we want to see what is going on.*0076

*Let us start in R2, let us start in the plane, again we are using our geometric intuition to help us guide our mathematics.*0082

*So, let us say we have something like this. Now let us say we have some region in... you know, here... so now, let us suppose the following things.*0090

*So suppose f is a function from R2 to R, so a function of 2 variables, defined on the open set u. This is u by the way. Open set u.*0104

*Now, also suppose that C, which is a function from R to R2 -- we are dealing with R2, okay -- R to R2, is a curve in 2-space, that passes through u.*0125

*Let us say this is just something like that. So passes through u. Now, here is what is great.*0163

*The points along the curve, the function itself is defined on this open set.*0172

*In other words the points in this set can be used for this function f.*0177

*Well, the curve happens to pass through u, so the point on the curve can be used in f.*0182

*That is what we are doing, so what we can do is we can actually form the composite function, what we have is some curve that has nothing to do with u and yet it happens to pass through u.*0191

*We also have a function that is defined on u because there is this overlap we can use the... in the function f, we can use the points along the curve.*0200

*That is what is really, really great here.*0211

*Let us go ahead and write this down, and then the points along c(t) can be used by the function f.*0215

*In other words, we can form the composite function of c, which is f(c).*0240

*Let me write it with the actual t... oh this is a capital F by the way, sorry about that... small f's big F's, hm.*0270

*F(c(t)), that is it, it is a composite function. Except now, instead of a composite function of single variables, it is a composite function of multi-variables.*0278

*In this case, we are dealing with 2. Let us just do a quick example, and then we will discuss it just a little bit more. Just to make sure we understand this concept.*0289

*This is a profoundly important concept, so like the gradient we definitely want to have a good sense of what is going on here.*0298

*We will take the time to make sure that that is the case.*0307

*Let us just do a quick example. Just so that we can see... so example 1.*0311

*Now, let c(t), let the curve equal the following: let us go t^{3}, then t^{2}, so c(t) = (t^{3}, t^{2}).*0319

*Now we will let our F, capital F of (x,y), of two variables, be ln(y) × cos(xy). How is that?*0332

*Just a... you know... nice random function. Now, f(c(t)). Notice. F is a function of two variables x and y, so its argument contains 2 things.*0345

*c(t) has two things: for x we put in the t^{3}, for y we put in the t^{2}, and we form the function F as a function of t.*0366

*Watch what happens. F(c(t)) = F(t^{3},t^{2}), right?*0380

*f(x,y) that just means these 2 things, so wherever I see a y I put in whatever is in here, wherever I see an x I put in what is in here.*0392

*Well, c(t) is this, it actually spits out 2 values, those 2 values, this is x, this is y.*0400

*We end up getting the logarithm of t^{2} × cos... natural logarithm × the cosine of (t^{3}t^{2}), which equals ln(t^{2}) × cos(t^{5}).*0409

*That is it, that is all I have done here. I have formed this composite function now, with 2 different functions that mapped to different spaces, but the space where one maps to is exactly the space that the next function needs as its domain in order to take the next step.*0432

*Let me actually write down this whole idea of the functions again, so C(t) is a map from R to R2.*0450

*In other words, it takes numbers, real numbers, and it spits out a 2-vector, a point in 2-space.*0466

*F(x,y) it takes a vector, a point in 2-space, and it spits out a number.*0474

*So what I have really done here, the composite function actually ends up being a map from R to R. That is what is happening here.*0487

*That is what is important to see. You can jump around from space to space, that is what makes multi-variable calculus so unbelievably powerful: you can actually jump around from space to space like this, with well-defined functions.*0499

*This is going to be the x,y plane, this is R2. I will write you another copy of the real number line.*0513

*So, this is the real number line, this is R2, 2-space, and this is the real number line.*0520

*C(t) maps from here to here.*0527

*A point to a vector... a vector in 2-space.*0535

*F goes from here, takes a point in 2-space, and maps to here, so this is c(t) and this is f.*0540

*When we find the composite function, f(c(t)), what I have now is a map from R to R. That is what this example is.*0549

*I have this that goes to a point in 2-space. F takes a point in 2-space and spits out some number.*0566

*Notice I ended up with some function of t, ln(t^{2}) × cos(t^{5}). At a specific value of t, this is just a single number.*0572

*This is what is going on here. You are just forming a composite function with curves and functions of several variables.*0583

*Hopefully this is reasonably clear. This is what we want to understand. This is what is happening mathematically.*0593

*You are mapping from one space to another, and then you are moving from that space to another space. In this case the space you end up with happens to be the space you started off with, which is the real number line.*0596
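The composition in Example 1 can be sketched in a few lines of Python. This is only an illustration of the R → R2 → R picture, assuming c(t) = (t³, t²) and F(x,y) = ln(y) × cos(xy) as in the example; the names `composite` and `closed_form` and the sample value `t0` are made up for the sketch.

```python
import math

def c(t):
    # Curve from R to R^2: c(t) = (t^3, t^2)
    return (t**3, t**2)

def F(x, y):
    # Function from R^2 to R: F(x, y) = ln(y) * cos(x*y)
    return math.log(y) * math.cos(x * y)

def composite(t):
    # F(c(t)): feed the two components of c(t) into F, giving a map from R to R
    x, y = c(t)
    return F(x, y)

def closed_form(t):
    # ln(t^2) * cos(t^5), the expression obtained by substituting by hand
    return math.log(t**2) * math.cos(t**5)

t0 = 1.3  # any nonzero t works; t = 0 is excluded because ln(0) is undefined
print(composite(t0), closed_form(t0))
```

Both functions return a single real number for a given t, which is the point of the example: the intermediate stop in 2-space disappears once the composite is formed.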

*So now the chain rule allows us to differentiate something like this. So now, let us go ahead and explicitly write down what the chain rule is.*0607

*I want you to see this simply because I want you also to start becoming accustomed to the expression of theorems, formal things, but again, it has to be based on understanding.*0618

*It is a little long, but there is nothing here that is strange, so let f be a function defined and differentiable on an open set u.*0630

*Let c be a differentiable curve, all that means is a nice smooth curve with no whacky bumps or corners... be a differentiable curve such that the values of c(t) lie in the open set u.*0666

*What we were doing in the beginning of the lesson, we have an open set, we have a curve that happens to pass through that open set, therefore the points along the curve can be used for our function.*0713

*Then, the composite function, f(c(t)) is differentiable.*0725

*It is differentiable itself... as a function of t, and the derivative of f(c(t)) with respect to t is equal to the gradient of f evaluated at c(t), the dot product of that vector with the vector c'(t). Now you remember the gradient is a vector.*0738

*If I have some function, like f(x,y), the first component of the gradient is the partial derivative df/dx, and the second component is df/dy. I just differentiate with respect to each of the variables, and that is my gradient vector.*0777

*So let us stop and think about what this says. If I have some function that is defined and differentiable on some open set, and c happens to be a differentiable curve that passes through that open set, in other words takes some values in that open set, then the composite function f(c(t)) is also differentiable.*0794

*It is differentiable as a function of t, as a single variable t and the derivative of that composite function is equal to the gradient of f at c(t) · c'(t).*0812

*This is very, very, very important.*0825

*Now, for computations, when we actually do specific problems, we of course are going to be working with components, which is always the case.*0831

*With vectors we can go ahead and write out the definitions and the theorems using a shorter, more elegant notation, but when we actually do the computations with vectors we have to work with components.*0841

*x,y,z, whatever it is that we happen to be working with. So, let us go ahead and just sort of write out the component form of this so you see what is happening.*0852

*Again, the dot product is the same dot product that you know. There is nothing new here, trust what you know.*0860

*This is a vector, this is a vector, when you take the dot product of 2 vectors you get a number. That is what this says. You are getting a derivative.*0868

*A function of t, if you evaluated a specific point of t, it is actually just a number. You are still just doing a derivative. The same thing you have been doing for years.*0877

*Let us just see here. So, if c(t) = (c_{1}(t), c_{2}(t)), so c(t) is a curve whose component functions are functions of t, just like the first example, and we have f(x_{1},x_{2}).*0886

*This time I did not write it as x and y, I wrote it as x_{1} and x_{2}, these are variables.*0920

*The first variable, the second variable. Then the derivative with respect to t of the f(c(t)) = well, we said it equals the gradient of f evaluated at c(t) · c'(t).*0924

*Okay, the gradient -- I should probably write this out -- the gradient of this function is going to be... tell you what, let me go ahead and before I write that, let me write out the gradient because I know it has been a couple of lessons since we did that.*0949

*So, let me write grad f = (df/dx_{1}, df/dx_{2}).*0968

*This is a vector, the first component of which is the derivative with respect to the first variable.*0982

*The second component is the derivative of the function with respect to the second variable.*0988

*Now, c'(t) is also a vector. It is the derivative of this, c_{1}'(t), and the derivative of this, c_{2}'(t).*0991

*That is it, these are just functions, so now what we have is the derivative with respect to t of f(c(t)), in other words this thing right here.*1006

*We said it is the gradient of f · c', this is the gradient of f, this is c'.*1018

*So let us see what this looks like in component form. It is... oh, you know what, I have a capital F, don't I?*1025

*I keep forgetting that, that small f is just so ubiquitous in most scientific literature.*1032

*So, we have df/dx_{1} × dc_{1}/dt, that is all this is, c_{1}' is just dc_{1}/dt.*1039

*It is just notation, dc_{2}/dt. The dot product is this × that + this × that, so + df/dx_{2} × dc_{2}/dt.*1062

*So that is it, that is all we are doing here. We are just doing it in component form.*1090

*Now personally, I think that what I have just written here is actually a little bit more confusing than just the statement of the theorem.*1096

*If you look at it as just the statement of the theorem, the gradient of f dotted with c', and if you know what the gradient is, you know what c' is, you know how to take derivatives.*1105

*You just do the dot product. This is sort of the component representation of it.*1114

*I personally do not like seeing all of these things because again it is notationally intensive.*1121

*The idea is to understand what this is, and then you can do the rest.*1125

*So, personally, my favorite, I still think it is great to learn it this way. Gradient · c'. Gradient of f · c'. Just keep telling yourself that about 5 or 6 times, and you will know what to do.*1130
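For reference, the statement and its two-variable component form described above can be written out as a single display. The notation here (partial-derivative symbols, $x_1, x_2$ for the variables of $F$, $c_1, c_2$ for the components of the curve) is one conventional way to typeset what the lecture says in words, not a transcription of the slide.

```latex
\frac{d}{dt}\, F(c(t))
  = \nabla F(c(t)) \cdot c'(t)
  = \frac{\partial F}{\partial x_1}\bigl(c(t)\bigr)\,\frac{dc_1}{dt}
  + \frac{\partial F}{\partial x_2}\bigl(c(t)\bigr)\,\frac{dc_2}{dt}
```

Each partial derivative is evaluated at the point $c(t)$, and the result is a single function of $t$.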

*So, let us go ahead and just do an example, that is the best way to make sense of this.*1145

*So, let me go back to my black ink here. Actually you know what, let me go ahead and go to blue.*1151

*So example 2. Now, we will let our curve c(t) be (t, e^{t}, t^{2}), and we will let our function f(x,y,z), so we are definitely talking about a curve in 3-space and a function of 3 variables, equal xy^{2}z.*1157

*First of all, let us talk about what is going on here. We are going to form the composite function. We are going to be forming f(c(t)).*1193

*That is what we are going to be doing. Well, f(c(t)), x,y,z, in this case x is this thing, t, y is this thing, e^{t}, and z is this t^{2}.*1201

*We want to write everything out. Now the gradient of f, that is it, we are just going to build this step by step by step, that is all we are doing here.*1220

*The gradient of f is equal to, well it is the first component is the first partial, the second component is the second partial, the third component is the third partial.*1231

*If you like the other notation it is going to be df/dx, df/dy, and it is going to be df/dz.*1244

*Now, let us go ahead and actually compute that. The first partial, the derivative of this function with respect to x is y^{2}z.*1254

*The derivative with respect to y is 2xyz.*1264

*And, the derivative with respect to z is going to be xy^{2}, so this is my gradient of f.*1271

*Now, my gradient of f, evaluated at c(t), so now we will take the next step, now we will do the grad of f evaluated at c(t) which is the actual expression that is in the definition for the chain rule.*1284

*All that says is that take my gradient f, this thing, and I just put in the values c(t) in here.*1300

*Well, x is t, y is e^{t}, and z is t^{2}.*1308

*So when I put these things into here, here is what I get.*1312

*y^{2}z, so it is going to end up being t^{2}e^{2t}, right?*1323

*y^{2} is just e^{2t}, z is t^{2}, so that is t^{2}e^{2t}.*1329

*2 × x, which is t, y which is e^{t}, z which is t^{2}, I end up with 2t^{3}e^{t}.*1336

*xy^{2} is t × e^{2t}, I get t × e of... wait, e^{2t}, yes, there we go...*1346

*Okay, so that takes care of this one. That is the grad of f evaluated at c(t), now I just have to find c'.*1362

*That is really, really simple. C'(t). Well here is my c right here, I will just take the derivative of each one, that is it.*1369

*The derivative of t is 1, the derivative of e^{t} is e^{t}, and the derivative of t^{2} is 2t.*1378

*Well, now I just form my dot product, so the gradient of f evaluated at c(t) dotted with c'(t), it equals this vector dotted with this vector.*1390

*Well the dot product is this × that, so it is t^{2}, the dot product is not a vector, this × that so it is t^{2}e^{2t} + this × that, 2t^{3}e^{2t}, + this × that, 2t^{2}e^{2t}, and that is it.*1411

*Let us see if there is anything that I can combine here, t^{2}e^{2t}, 2t^{2}e^{2t}, yes, there is.*1446

*So it is going to be 3t^{2}e^{2t} + 2 × t^{3}e^{2t}, and that is my final answer. That is it. Let me go back.*1453

*I was given some, you know, a curve, and I was given a function, and I just hammered it out.*1469

*I took the gradient as a function, as a vector in x, y, z, I evaluated it at c(t), in other words I put these values in for x, y, z, and I got this.*1480

*Now it is the gradient vector expressed in t. I took the derivative of c which is c', that is easy enough to do.*1490

*Then I just took the dot product of those vectors. That is it, that is all that is going on here.*1498

*So this happens to be the derivative of f(c(t)). That is what this is equal to. This is equal to f(c(t))... no, we want to definitely, f'(c(t)), that is it. That is all that is going on here.*1501

*It is just a way of differentiating a composite function.*1523
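The steps of Example 2 (gradient, evaluation at c(t), c', dot product) can be mirrored in a short Python sketch. It assumes f(x,y,z) = xy²z and c(t) = (t, eᵗ, t²) as in the example; the helper names and the sample point are illustrative only.

```python
import math

def grad_f(x, y, z):
    # Gradient of f(x, y, z) = x*y^2*z: (y^2 z, 2xyz, x y^2)
    return (y**2 * z, 2 * x * y * z, x * y**2)

def c(t):
    # Curve c(t) = (t, e^t, t^2)
    return (t, math.exp(t), t**2)

def c_prime(t):
    # Componentwise derivative c'(t) = (1, e^t, 2t)
    return (1.0, math.exp(t), 2 * t)

def chain_rule(t):
    # grad f evaluated at c(t), dotted with c'(t)
    g = grad_f(*c(t))
    return sum(gi * vi for gi, vi in zip(g, c_prime(t)))

def hand_result(t):
    # The combined answer from the lecture: 3t^2 e^{2t} + 2t^3 e^{2t}
    return 3 * t**2 * math.exp(2 * t) + 2 * t**3 * math.exp(2 * t)

t0 = 0.5  # arbitrary sample point
print(chain_rule(t0), hand_result(t0))
```

The dot product is computed numerically at one value of t, so this checks the hand computation at sample points rather than proving the identity.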

*Now, you are probably asking yourself, okay, well if I start with a function of t, c goes from R to R3, and then I take the function f from R3 to R so that the composite is a function from R to R, what I actually have is a function of t, right?*1526

*Yes, it is just a function of t that you are differentiating. You are saying, well wait a minute, if I found f(t) up here, couldn't I just do this directly? Do I have to use the chain rule?*1542

*The answer is no, you do not have to use the chain rule, you can do it directly. Which one is better?*1552

*Well, actually it depends on the situation. It depends on the function, it depends on what you are doing, that is all it is.*1556

*So, let us go ahead and actually do it directly just to confirm that you can do it directly.*1562

*I think it will shed a little bit more light on this relationship between the curve and the function.*1566

*Let us do this in blue again... so we said that f(x,y,z) is equal to xy^{2}z.*1574

*We said that c(t), let us just rewrite them over again.*1589

*Let us see, what did we say c(t) was... t, e^{t}, and t^{2}, so now let us just form f(c(t)).*1595

*So f(c(t)) = well, xy^{2}z, x is t, so that is t, y^{2} is e^{2t}, and z is going to be t^{2}, I am just plugging those in, and I end up with t^{3}e^{2t}.*1606

*Well, now if I just take... this is just a function of t, so if I just take df with respect to t, I end up getting, so it is going to be this × the derivative of that, which is going to be 2t^{3}e^{2t} + that × the derivative of that, 3t^{2}e^{2t}.*1633

*What do you know, you end up with the same exact answer.*1656
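The direct route can also be checked numerically. The sketch below assumes the composite from the example, f(c(t)) = t³e²ᵗ, and compares its product-rule derivative with a central-difference estimate; `central_difference` and the step size `h` are hypothetical helpers chosen for illustration.

```python
import math

def composite(t):
    # f(c(t)) = t * (e^t)^2 * t^2 = t^3 e^{2t}
    return t**3 * math.exp(2 * t)

def direct_derivative(t):
    # Product rule on t^3 e^{2t}: 3t^2 e^{2t} + 2t^3 e^{2t}
    return 3 * t**2 * math.exp(2 * t) + 2 * t**3 * math.exp(2 * t)

def central_difference(g, t, h=1e-6):
    # Symmetric finite-difference estimate of g'(t)
    return (g(t + h) - g(t - h)) / (2 * h)

t0 = 0.5  # arbitrary sample point
print(direct_derivative(t0), central_difference(composite, t0))
```

The finite-difference value is only an approximation, so agreement to several decimal places is the expected outcome rather than exact equality.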

*Which is better? Again it just depends on your particular situation.*1660

*Sometimes you want to do it directly if it makes more sense, sometimes you want to use the chain rule if it makes more sense.*1667

*The problem at hand will actually decide which one is better.*1672

*Ok, so that is the chain rule, thank you very much for joining us here at educator.com. We will see you next time. Take care, bye-bye.*1676

3 answers

Last reply by: Professor Hovasapian

Wed Mar 13, 2013 5:34 AM

Post by Yujin Jung on March 11, 2013

Hello!

Do you by any chance teach the 'Jacobian matrix' of a function?

Our lecturer referred to it as he was teaching the chain rule and was wondering how they relate to each other..

Thank you :)

0 answers

Post by Caleb Lear on October 11, 2012

I like your method for using the chain rule, I wonder if there's a way to adapt it for implicit differentiation? Right now I'm just deriving as before and multiplying by a partial where I need to.