Deep Learning: LSTM Basics
2022-07-05 03:33:00 【冠long馨】
1. RNN
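An RNN tries to remember all past information.
(1) Definition and properties
An RNN performs well on sequential data because, at time step t, it takes the hidden state from time step t-1 as an additional input alongside the current input.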
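The recurrence described above can be sketched in a few lines of plain Python. The post does not give the RNN update formula, so the common form $h_t=\tanh(W_{xh}x_t+W_{hh}h_{t-1}+b_h)$ is assumed here; the function name `rnn_step` and the toy weights are hypothetical and chosen only for illustration.

```python
import math

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    """One vanilla RNN step (assumed form): h_t = tanh(W_xh x_t + W_hh h_prev + b_h)."""
    h_t = []
    for j in range(len(b_h)):
        pre = b_h[j]
        pre += sum(W_xh[j][k] * x_t[k] for k in range(len(x_t)))
        pre += sum(W_hh[j][k] * h_prev[k] for k in range(len(h_prev)))
        h_t.append(math.tanh(pre))
    return h_t

# Hypothetical toy weights: 1 input feature, 2 hidden units.
W_xh = [[0.5], [-0.3]]
W_hh = [[0.1, 0.2], [0.0, 0.4]]
b_h = [0.0, 0.0]

h = [0.0, 0.0]  # initial hidden state
for x in [[1.0], [0.0], [0.0]]:
    # The previous hidden state is fed back in at every step.
    h = rnn_step(x, h, W_xh, W_hh, b_h)
```

Only the first input is non-zero, yet the final `h` is still non-zero: the information reaches later steps solely through the hidden state passed from one step to the next.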
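(2) Problems
- Long-term dependencies: as the number of time steps grows, an RNN loses the ability to learn connections between pieces of information that are far apart.
- Vanishing gradients: vanishing and exploding gradients both arise because the RNN's weight matrix is multiplied by itself repeatedly across time steps.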
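The effect of repeated multiplication is easiest to see in a scalar toy case. The sketch below assumes a linear recurrence $h_t = w\,h_{t-1}$ (a simplification, not the full RNN), so the gradient of $h_T$ with respect to $h_0$ is just $w^T$; the function name `backprop_factor` is hypothetical.

```python
def backprop_factor(w, steps):
    """Gradient scale after backpropagating through `steps` steps of h_t = w * h_{t-1}."""
    factor = 1.0
    for _ in range(steps):
        factor *= w  # each step contributes dh_t/dh_{t-1} = w
    return factor

vanishing = backprop_factor(0.9, 50)  # |w| < 1: the factor shrinks toward 0
exploding = backprop_factor(1.1, 50)  # |w| > 1: the factor grows without bound
```

With a weight matrix instead of a scalar, the same behaviour is governed roughly by the size of its largest singular values, and the derivative of the activation function adds a further shrinking factor.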
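An LSTM can address the RNN's long-term dependency problem because it introduces a gate mechanism that controls how much of each feature passes through and how much is discarded.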
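As a rough illustration of what a gate does, the sketch below squashes some scores into (0, 1) with a sigmoid and multiplies them element-wise with a feature vector. The helper `apply_gate` and the numbers are hypothetical; in an LSTM the scores come from learned weights rather than being set by hand.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def apply_gate(gate_logits, features):
    """Scale each feature by a gate value in (0, 1)."""
    gate = [sigmoid(z) for z in gate_logits]
    return [g * v for g, v in zip(gate, features)]

features = [2.0, -3.0, 4.0]
# Large positive score: pass through; large negative: block; zero: keep half.
gated = apply_gate([10.0, -10.0, 0.0], features)
```

Here the first feature passes almost unchanged, the second is almost erased, and the third is halved.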
2. LSTM
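(1) Definition and properties
An LSTM adds a memory cell and remembers selectively.
- Three gates: forget gate, input gate, output gate
- Two states: the cell state $c_t$ and the hidden state $h_t$
(2) Forward propagation
Selectively keep historical memory and absorb new knowledge.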
- Forget gate $f_t$:
① $f_t=\sigma(W_{xf}x_t+W_{hf}h_{t-1}+b_f)$
② Interpretation: through the sigmoid function, $f_t$ chooses how much of the historical information $c_{t-1}$ to remember (or forget).
Think of it as limited brain capacity: when new information comes in, some of the less useful historical memories have to be selectively forgotten.
- Input gate $i_t$:
① $i_t=\sigma(W_{xi}x_t+W_{hi}h_{t-1}+b_i)$
Interpretation: through the sigmoid, $i_t$ selects how much of the new information $g_t$ to learn.
② $g_t=\tanh(W_{xg}x_t+W_{hg}h_{t-1}+b_g)$
Not all of the new input is useful; only the relevant parts need to be remembered.
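- Cell state (historical memory) $c_t$:
① $c_t=f_t \odot c_{t-1}+g_t \odot i_t$
Interpretation: the new memory is made up of the previous memory and the newly learned information, where $f_t$ and $i_t$ filter the historical memory and the new information respectively.
Selectively combining historical memory with new information forms the new memory.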
- Output gate $o_t$:
① $o_t=\sigma(W_{xo}x_t+W_{ho}h_{t-1}+b_o)$
Interpretation: through the sigmoid, $o_t$ selects how much of the memory $\tanh(c_t)$ to use.
② $m_t=\tanh(c_t)$
Interpretation: $c_t$ is passed through tanh before the memory is used.
③ $h_t=o_t \odot m_t$
The resulting $h_t$ is emitted as output and is also fed into the next time step t+1.
- Output $y_t$:
① $y_t = W_{yh}h_t+b_y$
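The equations in this list can be put together into a single forward step. What follows is a minimal sketch in plain Python, not a reference implementation: the parameter dictionary `p`, the helpers `affine` and `zeros`, and the all-zero toy weights are assumptions made for illustration, while the key names mirror the symbols $W_{xf}, W_{hf}, b_f$ and so on.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W_x, x, W_h, h, b):
    """Return W_x x + W_h h + b for list-of-lists matrices."""
    return [
        sum(wx * xv for wx, xv in zip(W_x[j], x))
        + sum(wh * hv for wh, hv in zip(W_h[j], h))
        + b[j]
        for j in range(len(b))
    ]

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM forward step; `p` maps parameter names to weights."""
    f = [sigmoid(z) for z in affine(p["W_xf"], x_t, p["W_hf"], h_prev, p["b_f"])]
    i = [sigmoid(z) for z in affine(p["W_xi"], x_t, p["W_hi"], h_prev, p["b_i"])]
    g = [math.tanh(z) for z in affine(p["W_xg"], x_t, p["W_hg"], h_prev, p["b_g"])]
    o = [sigmoid(z) for z in affine(p["W_xo"], x_t, p["W_ho"], h_prev, p["b_o"])]
    # c_t = f_t * c_{t-1} + g_t * i_t (element-wise)
    c = [fj * cj + gj * ij for fj, cj, gj, ij in zip(f, c_prev, g, i)]
    m = [math.tanh(cj) for cj in c]              # m_t = tanh(c_t)
    h = [oj * mj for oj, mj in zip(o, m)]        # h_t = o_t * m_t
    y = [sum(w * hj for w, hj in zip(row, h)) + by
         for row, by in zip(p["W_yh"], p["b_y"])]  # y_t = W_yh h_t + b_y
    return h, c, y

def zeros(rows, cols):
    return [[0.0] * cols for _ in range(rows)]

# Hypothetical toy setup: all-zero weights, so every gate equals sigmoid(0) = 0.5.
n_in, n_hid, n_out = 1, 2, 1
p = {}
for gate in "figo":
    p["W_x" + gate] = zeros(n_hid, n_in)
    p["W_h" + gate] = zeros(n_hid, n_hid)
    p["b_" + gate] = [0.0] * n_hid
p["W_yh"] = zeros(n_out, n_hid)
p["b_y"] = [0.0] * n_out

h, c, y = lstm_step([1.0], [0.0, 0.0], [2.0, -2.0], p)
```

With all-zero weights the forget gate is 0.5 and the candidate $g_t$ is 0, so the new cell state is simply half of the previous one. Real implementations typically pack the four gate matrices into one and use a tensor library, but the arithmetic is the same.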
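(3) Understanding
① The $\sigma$-activated gates $f_t$ and $i_t$ selectively remember the historical information $c_{t-1}$ and learn the new knowledge $g_t$:
$c_t=f_t \odot c_{t-1}+g_t \odot i_t$
② The $\sigma$-activated gate $o_t$ filters the memory $c_t$ to produce the short-term memory $h_t$:
$h_t=o_t \odot m_t$
The forward-propagation process:
An LSTM implements long short-term memory with three gates and two states. First, the forget gate $f_t$ selects which parts of the historical information $c_{t-1}$ to keep; then the input gate $i_t$ selectively learns the new information $g_t$. The two filtered parts are added to obtain the new memory $c_t$. Finally, the output gate $o_t$ selectively reads from this memory to produce the short-term memory $h_t$, which is fed to the output layer to obtain the output $y_t$.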
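To see the "long" part of long short-term memory, the sketch below runs a one-unit LSTM over a sequence in which only the first input is non-zero. It is a toy under stated assumptions: the weights are hand-picked rather than learned, the recurrent weights $W_{h\cdot}$ are set to 0 so the gates depend only on $x_t$, and the function name `run_scalar_lstm` is hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def run_scalar_lstm(xs):
    """Run a 1-unit LSTM over xs with hand-picked weights (recurrent weights are 0)."""
    c, h = 0.0, 0.0
    for x in xs:
        f = sigmoid(10.0)             # forget gate close to 1: keep the old memory
        i = sigmoid(20.0 * x - 10.0)  # input gate: near 1 when x = 1, near 0 when x = 0
        g = math.tanh(2.0 * x)        # candidate new information
        o = sigmoid(10.0)             # output gate close to 1
        c = f * c + g * i             # c_t = f_t * c_{t-1} + g_t * i_t
        h = o * math.tanh(c)          # h_t = o_t * tanh(c_t)
    return c, h

# One informative input followed by 50 empty time steps.
c_final, h_final = run_scalar_lstm([1.0] + [0.0] * 50)
```

Because the forget gate stays close to 1 and the input gate closes when nothing new arrives, the cell state written at the first step is still largely intact 50 steps later. The update to $c_t$ is additive rather than a repeated multiplication by a weight matrix, which is what lets the memory survive.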