
Chain rule
Let's take an arbitrary function f that takes variables x and y as input, and suppose there is some small change in each variable, so that x becomes $x + \Delta x$ and y becomes $y + \Delta y$. Using this, we can find the change in f using the following:
$$\Delta f = f(x + \Delta x,\, y + \Delta y) - f(x, y)$$
Approximating the contribution of each variable to first order with its partial derivative, this leads us to the following equation:
$$\Delta f \approx \frac{\partial f}{\partial x}\,\Delta x + \frac{\partial f}{\partial y}\,\Delta y$$
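A quick numerical comparison can make this concrete. The following is a minimal sketch, assuming the hypothetical example function f(x, y) = x²y (not from the original text); it compares the exact change in f against the first-order approximation:

```python
# Numerical sanity check of the first-order approximation
# Δf ≈ (∂f/∂x)·Δx + (∂f/∂y)·Δy,
# using the hypothetical example f(x, y) = x**2 * y.

def f(x, y):
    return x**2 * y

x, y = 2.0, 3.0          # point at which we perturb the inputs
dx, dy = 1e-4, 1e-4      # small changes in x and y

# Analytical partial derivatives of f at (x, y):
# ∂f/∂x = 2xy and ∂f/∂y = x**2
df_dx = 2 * x * y
df_dy = x**2

exact_change = f(x + dx, y + dy) - f(x, y)
approx_change = df_dx * dx + df_dy * dy

print(exact_change)   # ≈ 0.00160007 (exact change in f)
print(approx_change)  # 0.0016      (first-order approximation)
```

The two printed values agree to several decimal places, and the gap between them shrinks as the perturbations are made smaller.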
Then, by taking the limit as $\Delta x \to 0$ and $\Delta y \to 0$, we can derive the chain rule for partial derivatives.
We express this as follows:
$$df = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy$$
We now divide this equation by the differential $dt$ of an additional variable t on which x and y both depend, to find the gradient along the curve $(x(t), y(t))$. The preceding equation then becomes this one:
$$\frac{df}{dt} = \frac{\partial f}{\partial x}\,\frac{dx}{dt} + \frac{\partial f}{\partial y}\,\frac{dy}{dt}$$
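As a sketch of how this can be checked numerically, the code below evaluates both sides of the chain rule; the path x(t) = cos(t), y(t) = sin(t) and the function f(x, y) = x²y are illustrative assumptions, not taken from the original text:

```python
# Numerical check of the chain rule
# df/dt = (∂f/∂x)(dx/dt) + (∂f/∂y)(dy/dt),
# with the illustrative choices f(x, y) = x**2 * y,
# x(t) = cos(t), y(t) = sin(t).

import math

def f(x, y):
    return x**2 * y

def g(t):
    # f evaluated along the curve (x(t), y(t))
    return f(math.cos(t), math.sin(t))

t = 0.7
x, y = math.cos(t), math.sin(t)

# Chain rule: ∂f/∂x = 2xy, ∂f/∂y = x**2,
# dx/dt = -sin(t), dy/dt = cos(t)
chain_rule = 2 * x * y * (-math.sin(t)) + x**2 * math.cos(t)

# Central-difference estimate of dg/dt as an independent check
h = 1e-6
numeric = (g(t + h) - g(t - h)) / (2 * h)

print(chain_rule, numeric)  # the two values should agree closely
```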
The differentiation rules that we came across earlier still apply here and can be extended to the multivariable case.
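For instance, the product rule carries over directly: for functions $u(x, y)$ and $v(x, y)$,

$$\frac{\partial}{\partial x}\big(uv\big) = u\,\frac{\partial v}{\partial x} + v\,\frac{\partial u}{\partial x}$$

with y held constant throughout, just as the single-variable product rule holds with x as the only variable.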