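$$\begin{equation}S_t = f(UX_t+WS_{t-1})\tag{1}\end{equation}$$$$\begin{equation}O_t=g(VS_t)\tag{2}\end{equation}$$Here, $f$ and $g$ are activation functions, and $U,W,V$ are the parameters of the RNN. If the loss at time step $T$ is $L_T$, then during backpropagation the gradient with respect to $W$ that is propagated back to time step $t$ is $$\begin{equation}[\frac{\partial L_T}{\partial W}]_t^T=\frac{\par…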