/sci/ - Science & Math » Thread #13293621

112KiB, 1144x750, ctwut.png

Anonymous Fri 18 Jun 19:17:38 2021 No.13293621

Quoted By: >>13294290

Does anyone know how he got this for $\frac{\partial C_t}{\partial C_{t-1}}$?
Shouldn't it just be f?

https://weberna.github.io/blog/2017/11/15/LSTM-Vanishing-Gradients.html#fn:3

Anonymous Fri 18 Jun 2021 22:32:05 No.13294290

Quoted By: >>13294315 >>13295601

>>13293621
It's not just f, it has to be a Jacobian because C_t and C_{t-1} are both vectors.
But you're close, it's a matrix that has the entries of the forget gate on its diagonal. So when applying the chain rule and backpropagating, you multiply this diagonal matrix with the previous gradient, which is equivalent to element-wise multiplication with the current forget gate.
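To make the equivalence above concrete, here is a minimal sketch in plain Python. The numbers are made up for illustration: `f` stands in for the forget gate and `g` for the gradient arriving from the later time step. The helpers `diag` and `matvec` are hypothetical names, not from any library.

```python
# Hypothetical values: f plays the role of the forget gate f_t,
# g the gradient flowing back from the later time step.
f = [0.9, 0.5, 0.1]
g = [1.0, -2.0, 4.0]

def diag(v):
    """Square matrix with v on the diagonal and zeros elsewhere."""
    n = len(v)
    return [[v[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

def matvec(m, v):
    """Plain matrix-vector product."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]

# Multiplying by the diagonal Jacobian ...
via_matrix = matvec(diag(f), g)
# ... gives the same vector as element-wise multiplication by f.
via_elementwise = [fi * gi for fi, gi in zip(f, g)]
```

Since the matrix is diagonal it equals its own transpose, so it doesn't matter here whether you multiply by the Jacobian or by its transpose.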

Anonymous Fri 18 Jun 2021 22:36:40 No.13294315

>>13294290
cont.
>it's a matrix that has the entries of the forget gate on its diagonal
Why? Well, row i of the Jacobian is the gradient of the i-th component of C_t with respect to C_{t-1}. The i-th component of C_t is $(C_t)_i = (f_t)_i (C_{t-1})_i + (i_t)_i (\tilde{C}_t)_i$, where $\tilde{C}_t$ is the candidate cell state. Holding the gate values fixed, the partial derivatives with respect to the components of C_{t-1} are all zero except for the i-th component, where the partial derivative is the i-th forget gate component. Hence a diagonal matrix.
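You can sanity-check the diagonal structure numerically. The sketch below is plain Python with made-up values. It assumes the last term of the update is the candidate state (called `c_tilde` here) and that the gates are treated as constants, i.e. it ignores any dependence of the gates on C_{t-1} through h_{t-1}. The function names are hypothetical.

```python
def cell_update(c_prev, f, i, c_tilde):
    """Element-wise cell update: C_t = f * C_{t-1} + i * c_tilde."""
    return [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, c_tilde)]

def numerical_jacobian(fn, x, eps=1e-6):
    """Forward-difference Jacobian; entry [r][j] approximates d fn(x)[r] / d x[j]."""
    base = fn(x)
    jac = [[0.0] * len(x) for _ in range(len(base))]
    for j in range(len(x)):
        bumped = list(x)
        bumped[j] += eps
        out = fn(bumped)
        for r in range(len(base)):
            jac[r][j] = (out[r] - base[r]) / eps
    return jac

# Made-up gate values, held fixed while C_{t-1} is perturbed (an assumption).
f = [0.9, 0.5, 0.1]
i = [0.2, 0.7, 0.4]
c_tilde = [0.3, -0.6, 0.8]
c_prev = [1.0, 2.0, -1.5]

jac = numerical_jacobian(lambda c: cell_update(c, f, i, c_tilde), c_prev)
for row in jac:
    print([round(v, 4) for v in row])
```

The printed matrix should have the entries of `f` on the diagonal (up to finite-difference error) and zeros everywhere else.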

Btw, I'd recommend using the weighted gradient sum variant of the chain rule, it makes things a lot less fucky, for example when you're differentiating with respect to matrices.

Anonymous Sat 19 Jun 2021 03:48:45 No.13295601

>>13294290
Thank you for the response. I just realized the equation should be:
$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t$
