Deep learning - LSTM
2022-06-30 07:45:00 【Hair will grow again without it】
LSTM (long short-term memory)
The LSTM, or long short-term memory network, lets a model learn very long-range connections in a sequence, and it is in some ways even more effective than the GRU.
GRU and LSTM
The memory cell c has the candidate value c̃&lt;t&gt; = tanh(Wc[a&lt;t−1&gt;, x&lt;t&gt;] + bc). Note that in the LSTM we no longer have a&lt;t&gt; = c&lt;t&gt;: the gates are now computed from a&lt;t−1&gt; rather than c&lt;t−1&gt;, and we no longer use the relevance gate Γr. As before, there is an update gate Γu with parameters Wu: Γu = σ(Wu[a&lt;t−1&gt;, x&lt;t&gt;] + bu). In place of the (1 − Γu) term the LSTM uses a separate forget gate (the forget gate) Γf = σ(Wf[a&lt;t−1&gt;, x&lt;t&gt;] + bf), and it adds a new output gate Γo = σ(Wo[a&lt;t−1&gt;, x&lt;t&gt;] + bo). The memory cell is then updated as c&lt;t&gt; = Γu ∗ c̃&lt;t&gt; + Γf ∗ c&lt;t−1&gt;, which gives the memory cell the option to keep the old value c&lt;t−1&gt;, add the new value c̃&lt;t&gt;, or mix the two, since the update gate Γu and the forget gate Γf are controlled separately. Finally, the formula a&lt;t&gt; = c&lt;t&gt; becomes a&lt;t&gt; = Γo ∗ tanh(c&lt;t&gt;).
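The equations above can be sketched as a single LSTM time step. This is a minimal NumPy sketch, not a production implementation; the function name `lstm_step` and the parameter dictionary layout are assumptions for illustration, and each weight matrix acts on the concatenation [a&lt;t−1&gt;, x&lt;t&gt;] as in the formulas.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(a_prev, c_prev, x_t, params):
    """One LSTM forward step.

    params holds Wc, bc, Wu, bu, Wf, bf, Wo, bo; each W has shape
    (n_a, n_a + n_x) and acts on the concatenation [a<t-1>, x<t>].
    (Names and layout are illustrative, not from any library.)"""
    concat = np.concatenate([a_prev, x_t])                   # [a<t-1>, x<t>]
    c_tilde = np.tanh(params["Wc"] @ concat + params["bc"])  # candidate c~<t>
    gamma_u = sigmoid(params["Wu"] @ concat + params["bu"])  # update gate
    gamma_f = sigmoid(params["Wf"] @ concat + params["bf"])  # forget gate
    gamma_o = sigmoid(params["Wo"] @ concat + params["bo"])  # output gate
    c_t = gamma_u * c_tilde + gamma_f * c_prev               # cell update
    a_t = gamma_o * np.tanh(c_t)                             # hidden state
    return a_t, c_t
```

Because Γo lies in (0, 1) and tanh in (−1, 1), every component of a&lt;t&gt; stays strictly inside (−1, 1), while c&lt;t&gt; itself is unbounded.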
LSTM
In this picture, a&lt;t−1&gt; and x&lt;t&gt; are used to compute the value of the forget gate Γf, as well as the update gate Γu and the output gate Γo. They are also passed through a tanh function to compute c̃&lt;t&gt;. These values are then combined, via element-wise products and sums, to obtain c&lt;t&gt; from the previous c&lt;t−1&gt;.
What you see in this pile of pictures is these units connected together in chronological order: input x&lt;1&gt;, then x&lt;2&gt;, x&lt;3&gt;, with the units wired one after another. The output a of each time step is used as input to the next time step, and c likewise. You will notice a line running along the top: it shows that as long as the forget gate and update gate are set appropriately, it is quite easy for the LSTM to pass the value c&lt;0&gt; all the way down to the right, so that, for example, c&lt;3&gt; = c&lt;0&gt;. That is why the LSTM and the GRU are so good at long-term memory of a value: a value stored in the memory cell can survive even over very many time steps.
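The pass-through along the top of the diagram can be demonstrated with a toy sketch. Here the gate values are forced by hand to Γf = 1 and Γu = 0 rather than computed from learned weights, which is an assumption made purely to isolate the effect: the update c&lt;t&gt; = Γu ∗ c̃&lt;t&gt; + Γf ∗ c&lt;t−1&gt; then copies the cell state forward unchanged.

```python
import numpy as np

# If the forget gate saturates at 1 and the update gate at 0, then
# c<t> = 0 * c~<t> + 1 * c<t-1> = c<t-1>, so c<0> survives unchanged.
c = np.array([0.5, -1.2, 3.0])               # c<0>
c0 = c.copy()
for t in range(3):                           # time steps x<1>, x<2>, x<3>
    gamma_u = np.zeros_like(c)               # update gate ~ 0: ignore candidate
    gamma_f = np.ones_like(c)                # forget gate ~ 1: keep old cell
    c_tilde = np.tanh(np.random.randn(3))    # candidate value (irrelevant here)
    c = gamma_u * c_tilde + gamma_f * c      # c<t> = Γu*c~<t> + Γf*c<t-1>
assert np.allclose(c, c0)                    # c<3> == c<0>
```

In a trained LSTM the gates are of course functions of a&lt;t−1&gt; and x&lt;t&gt;, so the network learns when to hold a value and when to overwrite it.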
Q: GRU or LSTM, which should you choose?
A: The advantage of the GRU is that it is a simpler model, so it is easier to build a bigger network; and since it has only two gates, it also runs faster computationally, which makes it easier to scale the model up.
The LSTM, however, is more powerful and flexible, because it has three gates instead of two. If you have to choose one, the LSTM has historically been the prior choice, and I feel most people today still try the LSTM as the default. That said, I think the GRU has gained a lot of support in recent years, and I feel more and more teams are using the GRU, because it is simpler, it works well, and it adapts more easily to bigger problems.