Deep learning - LSTM
2022-06-30 07:45:00 【Hair will grow again without it】
LSTM (long short-term memory)
The LSTM (long short-term memory) network lets you learn very long-range dependencies in a sequence, and it is even more effective than the GRU.
GRU and LSTM
Like the GRU, the LSTM has a memory cell 𝑐, whose candidate value 𝑐̃<𝑡> is computed as
𝑐̃<𝑡> = 𝑡𝑎𝑛ℎ(𝑊𝑐[𝑎<𝑡−1>, 𝑥<𝑡>] + 𝑏𝑐)
Note that in the LSTM we no longer have 𝑎<𝑡> = 𝑐<𝑡>. The formulas are similar to the GRU ones, with some changes: the candidate and the gates are computed from 𝑎<𝑡−1> rather than 𝑐<𝑡−1>, and there is no relevance gate 𝛤𝑟.
As before, there is an update gate 𝛤𝑢 with its own parameters 𝑊𝑢:
𝛤𝑢 = 𝜎(𝑊𝑢[𝑎<𝑡−1>, 𝑥<𝑡>] + 𝑏𝑢)
There is a new forget gate, which we call 𝛤𝑓:
𝛤𝑓 = 𝜎(𝑊𝑓[𝑎<𝑡−1>, 𝑥<𝑡>] + 𝑏𝑓)
And there is a new output gate 𝛤𝑜:
𝛤𝑜 = 𝜎(𝑊𝑜[𝑎<𝑡−1>, 𝑥<𝑡>] + 𝑏𝑜)
The memory cell is then updated as
𝑐<𝑡> = 𝛤𝑢 ∗ 𝑐̃<𝑡> + 𝛤𝑓 ∗ 𝑐<𝑡−1>
This gives the memory cell the option to keep the old value 𝑐<𝑡−1> or to add the new value 𝑐̃<𝑡>, which is why a separate update gate 𝛤𝑢 and forget gate 𝛤𝑓 are used here (the GRU uses 1 − 𝛤𝑢 where the LSTM uses 𝛤𝑓). Finally, the formula 𝑎<𝑡> = 𝑐<𝑡> becomes
𝑎<𝑡> = 𝛤𝑜 ∗ 𝑐<𝑡>
(Many formulations apply tanh to the cell state first, giving 𝑎<𝑡> = 𝛤𝑜 ∗ 𝑡𝑎𝑛ℎ(𝑐<𝑡>).)
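The equations above can be sketched as a single time step in NumPy. This is a minimal illustration rather than a reference implementation: the names `lstm_step`, `init_params` and the `params` dictionary layout are assumptions made for this example, and the output line follows the formula as written in the text (a common variant applies tanh to the cell state before the output gate).

```python
import numpy as np

def sigmoid(z):
    """Logistic function used by the three gates."""
    return 1.0 / (1.0 + np.exp(-z))

def init_params(n_a, n_x, seed=0):
    """Hypothetical helper: small random weights and zero biases for the
    candidate (c), update (u), forget (f) and output (o) transforms."""
    rng = np.random.default_rng(seed)
    params = {}
    for name in "cufo":
        params["W" + name] = rng.standard_normal((n_a, n_a + n_x)) * 0.1
        params["b" + name] = np.zeros(n_a)
    return params

def lstm_step(a_prev, c_prev, x_t, params):
    """One LSTM time step for a single example (1-D vectors)."""
    concat = np.concatenate([a_prev, x_t])                    # [a<t-1>, x<t>]
    c_tilde = np.tanh(params["Wc"] @ concat + params["bc"])   # candidate value
    gamma_u = sigmoid(params["Wu"] @ concat + params["bu"])   # update gate
    gamma_f = sigmoid(params["Wf"] @ concat + params["bf"])   # forget gate
    gamma_o = sigmoid(params["Wo"] @ concat + params["bo"])   # output gate
    c_t = gamma_u * c_tilde + gamma_f * c_prev                # new memory cell
    a_t = gamma_o * c_t   # as in the text; a common variant uses np.tanh(c_t)
    return a_t, c_t

# Usage: hidden size 4, input size 3, starting from zero state.
params = init_params(n_a=4, n_x=3)
a1, c1 = lstm_step(np.zeros(4), np.zeros(4), np.ones(3), params)
```

All products between gates and states are element-wise, so each component of the memory cell has its own update, forget and output decisions.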
LSTM
The diagram shows 𝑎<𝑡−1> and 𝑥<𝑡> being used to compute the forget gate 𝛤𝑓, the update gate 𝛤𝑢 and the output gate 𝛤𝑜. They also go through a tanh function to compute 𝑐̃<𝑡>. These values are then combined in a fairly intricate way, for example through element-wise products, to get 𝑐<𝑡> from the previous 𝑐<𝑡−1>.
Connecting several of these units in chronological order, with input 𝑥<1>, then 𝑥<2>, 𝑥<3>, gives the full network: the 𝑎 output at one time step is used as input to the next time step, and the same goes for 𝑐. You will notice a line running along the top of the diagram. It shows that as long as the forget gate and the update gate are set appropriately, it is quite easy for the LSTM to pass the value of 𝑐<0> all the way to the right, for example so that 𝑐<3> = 𝑐<0>. That is why LSTMs and GRUs are very good at remembering a value stored in the memory cell, even over many time steps.
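The point about carrying 𝑐<0> to the right can be checked numerically with the cell update alone. The sketch below is only an illustration: it fixes the gate values by hand instead of computing them from 𝑎<𝑡−1> and 𝑥<𝑡>, and the helper name `run_cell_state` and the sample numbers are made up for this example.

```python
import numpy as np

def run_cell_state(c0, candidates, gamma_u, gamma_f):
    """Repeatedly apply c<t> = gamma_u * c_tilde<t> + gamma_f * c<t-1>
    with fixed (hand-set) gate values."""
    c = np.asarray(c0, dtype=float)
    for c_tilde in candidates:
        c = gamma_u * np.asarray(c_tilde, dtype=float) + gamma_f * c
    return c

c0 = np.array([0.5, -1.0, 2.0])
candidates = [np.array([0.9, 0.9, 0.9])] * 3   # arbitrary candidate values

# Forget gate at 1 and update gate at 0: the cell keeps its old value,
# so after three steps c<3> is still c<0>.
c3_kept = run_cell_state(c0, candidates, gamma_u=0.0, gamma_f=1.0)

# Forget gate at 0 and update gate at 1: the old value is overwritten
# by the candidate at every step.
c3_overwritten = run_cell_state(c0, candidates, gamma_u=1.0, gamma_f=0.0)
```

In a trained network the gates come out of a sigmoid, so they only approach 0 and 1, and the stored value is preserved approximately rather than exactly.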
Q: Which should you choose, GRU or LSTM?
A: The advantage of the GRU is that it is a simpler model, so it is easier to build a bigger network with it. It has only two gates, so it also runs faster computationally, which makes it easier to scale up the model.
However, the LSTM is more powerful and flexible, because it has three gates instead of two. If you have to pick one, the LSTM has historically been the more proven choice, so I think most people today still try the LSTM as the default. That said, GRUs have gained a lot of support in recent years, and I feel more and more teams are using them, because they are simpler, work well, and scale more easily to bigger problems.
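To put a rough number on "simpler", one can count parameters under the formulation used in these notes, where every gate and the candidate has one weight matrix acting on [𝑎<𝑡−1>, 𝑥<𝑡>] plus one bias vector. The function names below are hypothetical, and real libraries may count differently (some keep separate input and recurrent bias vectors, for example), so treat this as a back-of-the-envelope sketch.

```python
def gated_rnn_param_count(n_a, n_x, n_weight_matrices):
    """Parameters for a cell where each transform has a weight matrix of
    shape (n_a, n_a + n_x) and a bias vector of length n_a."""
    per_transform = n_a * (n_a + n_x) + n_a
    return n_weight_matrices * per_transform

def gru_param_count(n_a, n_x):
    # Candidate + update gate + relevance gate: three transforms.
    return gated_rnn_param_count(n_a, n_x, 3)

def lstm_param_count(n_a, n_x):
    # Candidate + update, forget and output gates: four transforms.
    return gated_rnn_param_count(n_a, n_x, 4)

# Example with hidden size 128 and input size 64.
gru_params = gru_param_count(128, 64)     # 74112
lstm_params = lstm_param_count(128, 64)   # 98816
```

Under these assumptions a GRU layer has three quarters of the parameters of an LSTM layer of the same size, which is consistent with it being cheaper to compute and easier to scale.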