Deep learning - LSTM Foundation
2022-07-05 03:44:00 【Guan Longxin】
1. RNN
Remembers all information.
(1) Definition and characteristics
RNNs perform well on time-series data because, at time step t, an RNN takes the hidden state from time step t-1 as part of the input to the current time step.
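The recurrence described above can be sketched in a few lines of NumPy. This is a minimal sketch, not a reference implementation: the names `rnn_step`, `W_xh`, `W_hh`, `b_h`, the tanh activation, and the layer sizes are assumptions chosen for illustration.

```python
import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    """One vanilla RNN step: the hidden state from t-1 is an input at t."""
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

# Illustrative sizes and random weights (assumptions, not from the article).
rng = np.random.default_rng(0)
input_size, hidden_size, seq_len = 3, 4, 5
W_xh = rng.normal(scale=0.1, size=(hidden_size, input_size))
W_hh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
b_h = np.zeros(hidden_size)

xs = rng.normal(size=(seq_len, input_size))  # one input vector per time step
h = np.zeros(hidden_size)                    # initial hidden state
for x_t in xs:
    h = rnn_step(x_t, h, W_xh, W_hh, b_h)    # h carries information forward
```

After the loop, `h` depends on every input in the sequence, which is what "remembers all information" refers to.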
(2) Problems
- Long-term dependence: as the number of time steps grows, an RNN loses the ability to connect information from far back in the sequence.
- Vanishing gradients: vanishing and exploding gradients are both caused by the repeated multiplication by the recurrent weight matrix in an RNN.
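The effect of the repeated multiplication can be illustrated with a simplified linear model. This sketch makes several assumptions: it ignores the activation function's derivative, and it uses a scaled orthogonal matrix for the recurrent weights so that every multiplication changes the gradient norm by exactly the factor `scale`. The function name `backprop_norm` is hypothetical.

```python
import numpy as np

def backprop_norm(scale, steps=50, hidden_size=4, seed=0):
    """Norm of a gradient after being multiplied by W_hh^T `steps` times."""
    rng = np.random.default_rng(seed)
    # Q is orthogonal, so W_hh = scale * Q stretches every vector by `scale`.
    Q, _ = np.linalg.qr(rng.normal(size=(hidden_size, hidden_size)))
    W_hh = scale * Q
    grad = np.ones(hidden_size)
    for _ in range(steps):
        grad = W_hh.T @ grad  # one step of backpropagation through time
    return np.linalg.norm(grad)

vanishing = backprop_norm(scale=0.5)  # shrinks like 0.5 ** 50
exploding = backprop_norm(scale=1.5)  # grows like 1.5 ** 50
```

With a scale below 1 the gradient shrinks geometrically toward zero, and with a scale above 1 it grows geometrically, which is the vanishing and exploding behaviour named in the list above.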
LSTM can handle the long-term dependence problem of RNNs because it introduces a gate mechanism that controls which features are passed on and which are discarded.
2. LSTM
(1) Definition and characteristics
Sets up memory cells for selective memory.
- Three gates: forget gate, input gate, output gate
- Two states: cell state $c_t$ and hidden state $h_t$
(2) Forward propagation
Selectively retain historical memory and absorb new knowledge.
- Forget gate $f_t$:
① $f_t=\sigma(W_{xf}x_t+W_{hf}h_{t-1}+b_f)$
② Interpretation: $f_t$ uses the sigmoid function to choose how much of the historical information $c_{t-1}$ to remember (or forget).
The brain's capacity is limited, so when new information arrives we need to selectively forget some weak historical memories.
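A minimal NumPy sketch of the forget gate follows. The weight values, sizes, and the extreme bias values of +10 and -10 are assumptions, chosen only to push the gate close to 1 (keep everything) or close to 0 (forget everything).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forget_gate(x_t, h_prev, W_xf, W_hf, b_f):
    """f_t = sigmoid(W_xf x_t + W_hf h_{t-1} + b_f), each entry in (0, 1)."""
    return sigmoid(W_xf @ x_t + W_hf @ h_prev + b_f)

# Illustrative sizes and random weights (assumptions).
rng = np.random.default_rng(0)
input_size, hidden_size = 3, 4
x_t = rng.normal(size=input_size)
h_prev = rng.normal(size=hidden_size)
W_xf = rng.normal(scale=0.1, size=(hidden_size, input_size))
W_hf = rng.normal(scale=0.1, size=(hidden_size, hidden_size))

c_prev = np.array([2.0, -1.0, 0.5, 3.0])  # previous cell state

# A large positive bias pushes f_t toward 1, a large negative bias toward 0.
f_keep = forget_gate(x_t, h_prev, W_xf, W_hf, np.full(hidden_size, 10.0))
f_drop = forget_gate(x_t, h_prev, W_xf, W_hf, np.full(hidden_size, -10.0))

kept = f_keep * c_prev     # almost identical to c_prev
dropped = f_drop * c_prev  # almost all zeros
```

In a trained network the gate values usually lie somewhere between these two extremes, so each memory cell is partially retained.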
- Input gate $i_t$:
① $i_t=\sigma(W_{xi}x_t+W_{hi}h_{t-1}+b_i)$
Interpretation: $i_t$ uses the sigmoid function to choose how much of the new information $g_t$ to learn.
② $g_t=\tanh(W_{xg}x_t+W_{hg}h_{t-1}+b_g)$
Not all new input information is useful; we only need to remember the relevant part.
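The input gate and the candidate information can be sketched together. The function name `input_gate_and_candidate`, the parameter dictionary layout, and the random weights are assumptions for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def input_gate_and_candidate(x_t, h_prev, p):
    """Return the input gate i_t in (0, 1) and the candidate g_t in (-1, 1)."""
    i_t = sigmoid(p["W_xi"] @ x_t + p["W_hi"] @ h_prev + p["b_i"])
    g_t = np.tanh(p["W_xg"] @ x_t + p["W_hg"] @ h_prev + p["b_g"])
    return i_t, g_t

# Illustrative sizes and random weights (assumptions).
rng = np.random.default_rng(0)
input_size, hidden_size = 3, 4
p = {
    "W_xi": rng.normal(scale=0.5, size=(hidden_size, input_size)),
    "W_hi": rng.normal(scale=0.5, size=(hidden_size, hidden_size)),
    "b_i": np.zeros(hidden_size),
    "W_xg": rng.normal(scale=0.5, size=(hidden_size, input_size)),
    "W_hg": rng.normal(scale=0.5, size=(hidden_size, hidden_size)),
    "b_g": np.zeros(hidden_size),
}
x_t = rng.normal(size=input_size)
h_prev = rng.normal(size=hidden_size)

i_t, g_t = input_gate_and_candidate(x_t, h_prev, p)
learned = i_t * g_t  # the portion of the new information that is written
```

Because every entry of `i_t` is below 1, `learned` is never larger in magnitude than `g_t`: the gate can only scale the new information down.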
- Cell state (historical memory) $c_t$:
① $c_t=f_t \odot c_{t-1}+g_t \odot i_t$
Interpretation: the new memory is composed of the previous memory and the newly learned information, where $f_t$ and $i_t$ filter the historical memory and the new information respectively.
Selectively combining historical memory with new information forms the new memory.
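A small worked example makes the update concrete. The gate and state values below are hand-picked assumptions, not outputs of a trained network.

```python
import numpy as np

c_prev = np.array([2.0, -1.0, 4.0])  # previous cell state c_{t-1}
f_t = np.array([1.0, 0.0, 0.5])      # forget gate: keep, drop, keep half
i_t = np.array([0.0, 1.0, 0.5])      # input gate: ignore, accept, accept half
g_t = np.array([0.3, 0.8, -1.0])     # candidate new information

# c_t = f_t * c_{t-1} + g_t * i_t, all products element-wise
c_t = f_t * c_prev + g_t * i_t
# cell 0: 1.0 * 2.0 + 0.3 * 0.0   = 2.0  (old memory kept, new input ignored)
# cell 1: 0.0 * -1.0 + 0.8 * 1.0  = 0.8  (old memory dropped, new input stored)
# cell 2: 0.5 * 4.0 + -1.0 * 0.5  = 1.5  (a blend of both)
```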
- Output gate $o_t$:
① $o_t=\sigma(W_{xo}x_t+W_{ho}h_{t-1}+b_o)$
Interpretation: $o_t$ uses the sigmoid function to choose how much of the memory $\tanh(c_t)$ to use.
② $m_t=\tanh(c_t)$
Interpretation: $\tanh$ squashes the cell state $c_t$ into $(-1,1)$ so that the historical memory can be used.
③ $h_t=o_t \odot m_t$
The resulting $h_t$ is output and is also passed on to the next time step $t+1$.
- Output $y_t$:
① $y_t=W_{yh}h_t+b_y$
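The output gate and the readout can also be checked with hand-picked numbers. All values below, including the readout weights `W_yh` and bias `b_y`, are assumptions for illustration.

```python
import numpy as np

c_t = np.array([2.0, 0.8, 1.5])  # current cell state
o_t = np.array([1.0, 0.0, 0.5])  # output gate: expose, hide, expose half

m_t = np.tanh(c_t)               # squash the memory into (-1, 1)
h_t = o_t * m_t                  # hidden state (short-term memory)

# Hypothetical readout layer mapping 3 hidden units to 2 outputs.
W_yh = np.array([[1.0, 0.0, 0.0],
                 [0.0, 1.0, 2.0]])
b_y = np.array([0.0, 0.1])
y_t = W_yh @ h_t + b_y
```

The second memory cell holds 0.8, but its gate value is 0, so it contributes nothing to `h_t` or `y_t` at this step while remaining stored in `c_t` for later steps.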
(3) Understanding
① The sigmoid gates $f_t$ and $i_t$ selectively remember the historical information $c_{t-1}$ and learn the new knowledge $g_t$:
$c_t=f_t \odot c_{t-1}+g_t \odot i_t$
② The sigmoid gate $o_t$ filters the historical memory $c_t$ to produce the short-term memory $h_t$:
$h_t=o_t \odot m_t$
The forward propagation process:
LSTM realizes long-term and short-term memory through three gates and two states. First, the forget gate $f_t$ selects which historical information $c_{t-1}$ to remember; then the input gate $i_t$ selects which new information $g_t$ to learn. Adding the filtered old and new memories gives the new historical memory $c_t$. Finally, the output gate $o_t$ selectively reads the historical memory to obtain the short-term memory $h_t$, which is fed into the output layer to obtain the output value $y_t$.
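Putting the formulas together, the whole forward pass can be sketched as follows. This is a minimal sketch under stated assumptions: the helper names `init_params`, `lstm_step`, and `lstm_forward`, the zero initial states, the random initialization, and the sizes are all chosen for illustration, and training is not shown.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_params(input_size, hidden_size, output_size, seed=0):
    """Random weights for the f, i, g, o transforms plus the output layer."""
    rng = np.random.default_rng(seed)
    p = {}
    for k in "figo":
        p[f"W_x{k}"] = rng.normal(scale=0.1, size=(hidden_size, input_size))
        p[f"W_h{k}"] = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
        p[f"b_{k}"] = np.zeros(hidden_size)
    p["W_yh"] = rng.normal(scale=0.1, size=(output_size, hidden_size))
    p["b_y"] = np.zeros(output_size)
    return p

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM time step following the equations in section (2)."""
    f_t = sigmoid(p["W_xf"] @ x_t + p["W_hf"] @ h_prev + p["b_f"])  # forget gate
    i_t = sigmoid(p["W_xi"] @ x_t + p["W_hi"] @ h_prev + p["b_i"])  # input gate
    g_t = np.tanh(p["W_xg"] @ x_t + p["W_hg"] @ h_prev + p["b_g"])  # new information
    c_t = f_t * c_prev + g_t * i_t                                  # new cell state
    o_t = sigmoid(p["W_xo"] @ x_t + p["W_ho"] @ h_prev + p["b_o"])  # output gate
    h_t = o_t * np.tanh(c_t)                                        # hidden state
    y_t = p["W_yh"] @ h_t + p["b_y"]                                # output value
    return h_t, c_t, y_t

def lstm_forward(xs, p, hidden_size):
    """Run the cell over a sequence, starting from zero states."""
    h = np.zeros(hidden_size)
    c = np.zeros(hidden_size)
    ys = []
    for x_t in xs:
        h, c, y_t = lstm_step(x_t, h, c, p)
        ys.append(y_t)
    return np.stack(ys), h, c

xs = np.random.default_rng(1).normal(size=(6, 3))  # 6 time steps, 3 features
params = init_params(input_size=3, hidden_size=4, output_size=2)
ys, h_last, c_last = lstm_forward(xs, params, hidden_size=4)
```

Both `h` and `c` are carried from one step to the next: `c` is the long-term memory that the gates edit, and `h` is the short-term memory that produces the output.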