Deep learning - LSTM
2022-06-30 07:45:00 【Hair will grow again without it】
LSTM (long short-term memory)
The LSTM, or long short-term memory network, lets a model learn very long-range connections in a sequence, and it is in some ways even more effective than the GRU.
GRU and LSTM
The memory cell c has the candidate value c̃&lt;t&gt; = tanh(Wc[a&lt;t−1&gt;, x&lt;t&gt;] + bc). Note that in the LSTM we no longer have a&lt;t&gt; = c&lt;t&gt;: the gates are now computed from a&lt;t−1&gt; rather than c&lt;t−1&gt;, and we no longer use the relevance gate Γr. As before, there is an update gate Γu with parameters Wu: Γu = σ(Wu[a&lt;t−1&gt;, x&lt;t&gt;] + bu). In place of the (1 − Γu) term the LSTM uses a separate forget gate (the forget gate) Γf = σ(Wf[a&lt;t−1&gt;, x&lt;t&gt;] + bf), and it adds a new output gate Γo = σ(Wo[a&lt;t−1&gt;, x&lt;t&gt;] + bo). The memory cell is then updated as c&lt;t&gt; = Γu ∗ c̃&lt;t&gt; + Γf ∗ c&lt;t−1&gt;, which gives the memory cell the option to keep the old value c&lt;t−1&gt;, add the new value c̃&lt;t&gt;, or mix the two, since the update gate Γu and the forget gate Γf are controlled separately. Finally, the formula a&lt;t&gt; = c&lt;t&gt; becomes a&lt;t&gt; = Γo ∗ tanh(c&lt;t&gt;).
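The equations above can be sketched as a single LSTM time step. This is a minimal NumPy sketch, not a production implementation; the function name `lstm_step` and the parameter dictionary layout are assumptions for illustration, and each weight matrix acts on the concatenation [a&lt;t−1&gt;, x&lt;t&gt;] as in the formulas.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(a_prev, c_prev, x_t, params):
    """One LSTM forward step.

    params holds Wc, bc, Wu, bu, Wf, bf, Wo, bo; each W has shape
    (n_a, n_a + n_x) and acts on the concatenation [a<t-1>, x<t>].
    (Names and layout are illustrative, not from any library.)"""
    concat = np.concatenate([a_prev, x_t])                   # [a<t-1>, x<t>]
    c_tilde = np.tanh(params["Wc"] @ concat + params["bc"])  # candidate c~<t>
    gamma_u = sigmoid(params["Wu"] @ concat + params["bu"])  # update gate
    gamma_f = sigmoid(params["Wf"] @ concat + params["bf"])  # forget gate
    gamma_o = sigmoid(params["Wo"] @ concat + params["bo"])  # output gate
    c_t = gamma_u * c_tilde + gamma_f * c_prev               # cell update
    a_t = gamma_o * np.tanh(c_t)                             # hidden state
    return a_t, c_t
```

Because Γo lies in (0, 1) and tanh in (−1, 1), every component of a&lt;t&gt; stays strictly inside (−1, 1), while c&lt;t&gt; itself is unbounded.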
LSTM
In this picture, a&lt;t−1&gt; and x&lt;t&gt; are used to compute the value of the forget gate Γf, as well as the update gate Γu and the output gate Γo. They are also passed through a tanh function to compute c̃&lt;t&gt;. These values are then combined, via element-wise products and sums, to obtain c&lt;t&gt; from the previous c&lt;t−1&gt;.
What you see in this pile of pictures is these units connected together in chronological order: input x&lt;1&gt;, then x&lt;2&gt;, x&lt;3&gt;, with the units wired one after another. The output a of each time step is used as input to the next time step, and c likewise. You will notice a line running along the top: it shows that as long as the forget gate and update gate are set appropriately, it is quite easy for the LSTM to pass the value c&lt;0&gt; all the way down to the right, so that, for example, c&lt;3&gt; = c&lt;0&gt;. That is why the LSTM and the GRU are so good at long-term memory of a value: a value stored in the memory cell can survive even over very many time steps.
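The pass-through along the top of the diagram can be demonstrated with a toy sketch. Here the gate values are forced by hand to Γf = 1 and Γu = 0 rather than computed from learned weights, which is an assumption made purely to isolate the effect: the update c&lt;t&gt; = Γu ∗ c̃&lt;t&gt; + Γf ∗ c&lt;t−1&gt; then copies the cell state forward unchanged.

```python
import numpy as np

# If the forget gate saturates at 1 and the update gate at 0, then
# c<t> = 0 * c~<t> + 1 * c<t-1> = c<t-1>, so c<0> survives unchanged.
c = np.array([0.5, -1.2, 3.0])               # c<0>
c0 = c.copy()
for t in range(3):                           # time steps x<1>, x<2>, x<3>
    gamma_u = np.zeros_like(c)               # update gate ~ 0: ignore candidate
    gamma_f = np.ones_like(c)                # forget gate ~ 1: keep old cell
    c_tilde = np.tanh(np.random.randn(3))    # candidate value (irrelevant here)
    c = gamma_u * c_tilde + gamma_f * c      # c<t> = Γu*c~<t> + Γf*c<t-1>
assert np.allclose(c, c0)                    # c<3> == c<0>
```

In a trained LSTM the gates are of course functions of a&lt;t−1&gt; and x&lt;t&gt;, so the network learns when to hold a value and when to overwrite it.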
Q: GRU or LSTM, which should you choose?
A: The advantage of the GRU is that it is a simpler model, so it is easier to build a bigger network; and since it has only two gates, it also runs faster computationally, which makes it easier to scale the model up.
The LSTM, however, is more powerful and flexible, because it has three gates instead of two. If you have to choose one, the LSTM has historically been the prior choice, and I feel most people today still try the LSTM as the default. That said, I think the GRU has gained a lot of support in recent years, and I feel more and more teams are using the GRU, because it is simpler, it works well, and it adapts more easily to bigger problems.