GRU Neural Network
2022-07-28 18:18:00 【李峻枫】
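Limitations of RNNs
An RNN is a very simple way to handle time-series data, but it is not perfect: it remembers too much information.
Recall the hidden state $H_t$: it records every incoming $X_t$. In practice not all of this needs to be remembered. A function word such as "的" (roughly "of") in a sentence, for example, carries little meaning, and recording such useless information can mislead the prediction.
There is also a special case, concept drift: the data distribution changes over time.
When this happens, the previously recorded state may no longer be useful and can even mislead, so it needs to be forgotten.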
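The GRU neural network
The GRU neural network was introduced to address these two shortcomings. It does so by adding two "gates".
Reset gate
This gate deals effectively with the concept drift problem.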
$$R_t=\Theta\left( X_t\cdot W_{xr} + H_{t-1}\cdot W_{hr} + b_r \right)$$
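$R_t$ is multiplied element-wise with $H_{t-1}$, and the result is combined with $X_t$ to give the candidate state $H'_t$. In the standard GRU formulation this is $H'_t = \tanh\left( X_t\cdot W_{xh} + (R_t \odot H_{t-1})\cdot W_{hh} + b_h \right)$, where $\odot$ is the element-wise product.
Consider the two extreme cases:
- $R_t$ is all $1$: all of the history $H_{t-1}$ is remembered.
- $R_t$ is all $0$: all of the history $H_{t-1}$ is forgotten, i.e. the state is reset.
$W$ and $b$ are the parameters to be learned; they determine in which states the history should be forgotten (reset).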
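The reset gate formula and its two extreme cases can be sketched in NumPy as follows. This is only an illustrative sketch: the sizes, the random weights and the helper names `sigmoid` and `reset_gate` are assumptions, and $\Theta$ is taken to be the sigmoid function, as in the standard GRU.

```python
import numpy as np

def sigmoid(x):
    # Assumed form of the gate activation Theta: squashes values into (0, 1).
    return 1.0 / (1.0 + np.exp(-x))

def reset_gate(X_t, H_prev, W_xr, W_hr, b_r):
    # R_t = Theta(X_t . W_xr + H_{t-1} . W_hr + b_r)
    return sigmoid(X_t @ W_xr + H_prev @ W_hr + b_r)

# Hypothetical sizes and random values, for illustration only.
rng = np.random.default_rng(0)
batch, n_in, n_hid = 2, 3, 4
X_t = rng.normal(size=(batch, n_in))
H_prev = rng.normal(size=(batch, n_hid))
W_xr = rng.normal(size=(n_in, n_hid))
W_hr = rng.normal(size=(n_hid, n_hid))
b_r = np.zeros(n_hid)

R_t = reset_gate(X_t, H_prev, W_xr, W_hr, b_r)
gated_history = R_t * H_prev  # element-wise product with the old state

# The two extreme cases from the text:
keep_all = np.ones_like(H_prev) * H_prev     # R_t all ones: history kept
forget_all = np.zeros_like(H_prev) * H_prev  # R_t all zeros: history reset
```

Every entry of `R_t` lies between 0 and 1, so in general each component of the history is partially kept rather than fully kept or fully dropped.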
Update gate
The reset gate yields the candidate state $H'_t$, which still has to pass through the update gate to become the actual state $H_t$.
$$Z_t=\Theta\left( X_t\cdot W_{xz} + H_{t-1}\cdot W_{hz} + b_z \right)$$
$$H_t =Z_t*H_{t-1}+\left( 1 - Z_t\right)*H'_t$$
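- Here `*` denotes element-wise multiplication.
This formula shows that $Z_t$ decides how the new state is mixed: $Z_t$ is the share of the old state $H_{t-1}$ that is carried over, and $1 - Z_t$ is the share of the candidate state $H'_t$ that is written into $H_t$.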
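Putting both gates together, a single GRU time step might look like the sketch below. The gate and state-update lines follow the formulas in the text. The candidate-state line uses the standard tanh formulation, which is an assumption here, as are the parameter names `W_xh`, `W_hh`, `b_h`, the helper names, and the sizes.

```python
import numpy as np

def sigmoid(x):
    # Assumed form of the gate activation Theta.
    return 1.0 / (1.0 + np.exp(-x))

def init_params(n_in, n_hid, rng):
    # Hypothetical helper: small random weights and zero biases.
    p = {}
    for g in ("r", "z", "h"):
        p["W_x" + g] = 0.1 * rng.normal(size=(n_in, n_hid))
        p["W_h" + g] = 0.1 * rng.normal(size=(n_hid, n_hid))
        p["b_" + g] = np.zeros(n_hid)
    return p

def gru_step(X_t, H_prev, p):
    # Reset and update gates, as in the formulas above.
    R_t = sigmoid(X_t @ p["W_xr"] + H_prev @ p["W_hr"] + p["b_r"])
    Z_t = sigmoid(X_t @ p["W_xz"] + H_prev @ p["W_hz"] + p["b_z"])
    # Candidate state (assumed standard form, not spelled out in the text).
    H_cand = np.tanh(X_t @ p["W_xh"] + (R_t * H_prev) @ p["W_hh"] + p["b_h"])
    # H_t = Z_t * H_{t-1} + (1 - Z_t) * H'_t, all element-wise.
    return Z_t * H_prev + (1.0 - Z_t) * H_cand

rng = np.random.default_rng(0)
params = init_params(n_in=3, n_hid=4, rng=rng)
X_t = rng.normal(size=(2, 3))
H_prev = rng.normal(size=(2, 4))
H_t = gru_step(X_t, H_prev, params)

# Forcing the update gate towards 1 keeps the old state almost unchanged.
frozen = dict(params, b_z=np.full(4, 50.0))
H_frozen = gru_step(X_t, H_prev, frozen)

# Forcing it towards 0 replaces the state with the candidate state.
replaced = dict(params, b_z=np.full(4, -50.0))
H_replaced = gru_step(X_t, H_prev, replaced)
```

With a large positive `b_z` the step returns approximately `H_prev`, and with a large negative `b_z` it returns approximately the candidate state, which matches the two roles of $Z_t$ and $1 - Z_t$ described above.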
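Code implementation
PyTorch provides a GRU layer that can be called directly. The first argument is the input feature size and the second is the hidden state size:
nn.GRU(vocab_size, hidden_size)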
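A minimal usage sketch of this layer is shown below. The concrete sizes and the one-hot encoding of tokens are assumptions made for illustration (using `vocab_size` as the input size suggests one-hot inputs, but the text does not say so).

```python
import torch
from torch import nn

# Hypothetical sizes, for illustration only.
vocab_size, hidden_size = 28, 16
seq_len, batch_size = 5, 2

gru = nn.GRU(vocab_size, hidden_size)  # input_size=vocab_size

# Default layout (batch_first=False): (seq_len, batch, input_size).
tokens = torch.randint(0, vocab_size, (seq_len, batch_size))
X = nn.functional.one_hot(tokens, vocab_size).float()

# If no initial hidden state is passed, it defaults to zeros.
output, h_n = gru(X)

print(output.shape)  # torch.Size([5, 2, 16]): hidden state at every step
print(h_n.shape)     # torch.Size([1, 2, 16]): hidden state after the last step
```

For a single-layer, unidirectional GRU such as this one, `output[-1]` and `h_n[0]` hold the same values, since both are the hidden state after the final time step.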