Deep learning - LSTM
2022-06-30 07:45:00 【Hair will grow again without it】
LSTM (Long Short-Term Memory)
The LSTM (long short-term memory network) lets you learn very long-range connections in a sequence, and is even more powerful than the GRU.
GRU and LSTM
In the GRU we had the memory cell c, with the candidate value c̃<t> = tanh(Wc[a<t−1>, x<t>] + bc) used to update it. Note that in the LSTM we no longer have the case a<t> = c<t>. The formulas we use now are similar to the GRU's, but with some changes: we specifically use a<t> or a<t−1> rather than c<t−1>, and we no longer use the relevance gate Γr. As before, there is an update gate Γu with its parameters Wu:

Γu = σ(Wu[a<t−1>, x<t>] + bu)

The LSTM also uses a forget gate, which we call Γf:

Γf = σ(Wf[a<t−1>, x<t>] + bf)

and a new output gate:

Γo = σ(Wo[a<t−1>, x<t>] + bo)

The update of the memory cell therefore becomes

c<t> = Γu ∗ c̃<t> + Γf ∗ c<t−1>

This gives the memory cell the option to keep the old value c<t−1> or to add the new value c̃<t>, using a separate update gate Γu and forget gate Γf. Finally, the formula a<t> = c<t> becomes a<t> = Γo ∗ c<t>.
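The update equations above can be sketched as a single time step in NumPy. This is a minimal illustration, not the course's code; the function name lstm_step, the parameter dictionary layout, and the sizes n_a and n_x are my own choices:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(a_prev, c_prev, x_t, params):
    """One LSTM time step following the formulas above."""
    # Stack a<t-1> and x<t> into one vector, as in W[a<t-1>, x<t>]
    concat = np.concatenate([a_prev, x_t])
    c_tilde = np.tanh(params["Wc"] @ concat + params["bc"])   # candidate value
    gamma_u = sigmoid(params["Wu"] @ concat + params["bu"])   # update gate
    gamma_f = sigmoid(params["Wf"] @ concat + params["bf"])   # forget gate
    gamma_o = sigmoid(params["Wo"] @ concat + params["bo"])   # output gate
    c_t = gamma_u * c_tilde + gamma_f * c_prev                # memory cell update
    a_t = gamma_o * c_t   # note: many LSTM variants use gamma_o * tanh(c_t) here
    return a_t, c_t

# Tiny example: n_a = 4 hidden units, n_x = 3 input features
rng = np.random.default_rng(0)
n_a, n_x = 4, 3
params = {g: rng.standard_normal((n_a, n_a + n_x)) * 0.1
          for g in ("Wc", "Wu", "Wf", "Wo")}
params.update({b: np.zeros(n_a) for b in ("bc", "bu", "bf", "bo")})
a, c = lstm_step(np.zeros(n_a), np.zeros(n_a), rng.standard_normal(n_x), params)
```

Each gate is a vector the same size as the memory cell, so the gating is elementwise: every unit of c<t> decides independently how much to keep and how much to overwrite.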
LSTM
In this picture, a<t−1> and x<t> are used to compute the values of the forget gate Γf, the update gate Γu, and the output gate Γo. They also pass through a tanh function to compute c̃<t>. These values are then combined in a particular way (elementwise products and sums) to obtain c<t> from the previous c<t−1>.
What you see in this picture is a set of these units connected together in temporal order: input x<1>, then x<2>, x<3>, and so on, with the units chained one after another. The output a of each time step is fed as input to the next time step, and likewise for c. You will notice a horizontal line running along the top: it shows that as long as you set the forget gate and update gate appropriately, the LSTM can quite easily pass the value of c<0> all the way to the right, so that, for example, c<3> = c<0>. This is why the LSTM, like the GRU, is very good at memorizing a certain value for a long time: a value stored in the memory cell can survive even across many, many time steps.
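The pass-through claim can be checked numerically. In this sketch (my own illustration, not from the course) large gate biases force Γf ≈ 1 and Γu ≈ 0, so the cell update c<t> = Γu ∗ c̃<t> + Γf ∗ c<t−1> leaves c almost unchanged across 50 steps:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

n_a, n_x = 4, 3
rng = np.random.default_rng(1)
W = {g: rng.standard_normal((n_a, n_a + n_x)) * 0.1
     for g in ("Wc", "Wu", "Wf", "Wo")}
# Large biases drive the sigmoids to ~0 (update) and ~1 (forget),
# so c<t> = Γu*c̃<t> + Γf*c<t-1> ≈ c<t-1> at every step.
b = {"bc": np.zeros(n_a), "bu": -30.0 * np.ones(n_a),
     "bf": +30.0 * np.ones(n_a), "bo": np.zeros(n_a)}

c = c0 = np.array([1.0, -2.0, 0.5, 3.0])   # the value stored in c<0>
a = np.zeros(n_a)
for t in range(50):                         # 50 time steps, chained in order
    concat = np.concatenate([a, rng.standard_normal(n_x)])
    gamma_u = sigmoid(W["Wu"] @ concat + b["bu"])
    gamma_f = sigmoid(W["Wf"] @ concat + b["bf"])
    gamma_o = sigmoid(W["Wo"] @ concat + b["bo"])
    c_tilde = np.tanh(W["Wc"] @ concat + b["bc"])
    c = gamma_u * c_tilde + gamma_f * c
    a = gamma_o * c

drift = np.max(np.abs(c - c0))  # essentially zero: c<50> is still c<0>
```

In a trained LSTM the gates are of course input-dependent rather than pinned like this; the point is only that the architecture makes such long-range carrying easy to represent.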
Q: GRU or LSTM, which should you choose?
A: The advantage of the GRU is that it is a simpler model, so it is easier to build into a bigger network. It also has only two gates, so it runs somewhat faster computationally, which helps you scale up the model.
However, the LSTM is more powerful and more flexible, because it has three gates instead of two. If you have to pick one, the LSTM has historically been the preferred choice, and I think most people today would still try the LSTM as the default. That said, in recent years the GRU has gained a lot of support, and I feel more and more teams are using it, because it is simpler, works well, and is easier to scale to bigger problems.
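For comparison, a GRU step needs only two gates and three weight matrices. This is a minimal sketch in the same style as before (the name gru_step and the parameter layout are my own):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(a_prev, x_t, p):
    """One GRU time step: two gates (update Γu, relevance Γr), and a<t> = c<t>."""
    concat = np.concatenate([a_prev, x_t])
    gamma_u = sigmoid(p["Wu"] @ concat + p["bu"])   # update gate
    gamma_r = sigmoid(p["Wr"] @ concat + p["br"])   # relevance gate
    c_tilde = np.tanh(p["Wc"] @ np.concatenate([gamma_r * a_prev, x_t]) + p["bc"])
    # One gate does both jobs: Γu updates, (1 - Γu) plays the role of the forget gate
    c_t = gamma_u * c_tilde + (1.0 - gamma_u) * a_prev
    return c_t                                      # in the GRU, a<t> = c<t>

n_a, n_x = 4, 3
rng = np.random.default_rng(2)
p = {g: rng.standard_normal((n_a, n_a + n_x)) * 0.1 for g in ("Wu", "Wr", "Wc")}
p.update({b: np.zeros(n_a) for b in ("bu", "br", "bc")})
a = gru_step(np.zeros(n_a), rng.standard_normal(n_x), p)
```

The coupling (1 − Γu) is what makes the GRU cheaper: the LSTM's independent Γf and Γu buy extra flexibility at the cost of one more gate's worth of parameters and computation.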