Deep Learning: LSTM Basics
2022-07-05 03:44:00 【Guan Longxin】
1. RNN
An RNN remembers all information.
(1) Definition and characteristics
An RNN performs well on time-series data because, at time step $t$, it takes the hidden state from time step $t-1$ as part of the input for the current step.
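The recurrence described above can be sketched in a few lines of plain Python. The update rule $h_t=\tanh(W_{xh}x_t+W_{hh}h_{t-1}+b_h)$ is the common textbook form and is an assumption here, since the post does not write it out. The names `matvec`, `rnn_step`, `W_xh`, `W_hh`, and `b_h`, and the toy weights, are illustrative only.

```python
import math

def matvec(W, v):
    """Multiply a matrix W (list of rows) by a vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    """One vanilla RNN step (assumed form): h_t = tanh(W_xh x_t + W_hh h_{t-1} + b_h)."""
    from_input = matvec(W_xh, x_t)      # contribution of the current input x_t
    from_hidden = matvec(W_hh, h_prev)  # contribution of the previous hidden state h_{t-1}
    return [math.tanh(a + r + b) for a, r, b in zip(from_input, from_hidden, b_h)]

# Toy, hand-picked weights (hypothetical): 2 input features, 2 hidden units.
W_xh = [[0.5, -0.3], [0.1, 0.8]]
W_hh = [[0.2, 0.0], [0.0, 0.2]]
b_h = [0.0, 0.0]

h = [0.0, 0.0]  # initial hidden state
sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
states = []
for x_t in sequence:
    h = rnn_step(x_t, h, W_xh, W_hh, b_h)  # the h from step t-1 feeds step t
    states.append(h)
```

Each call receives the previous `h`, which is the only channel through which earlier inputs can influence later steps.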

(2) Problems
- Long-term dependencies: as the number of time steps grows, an RNN loses the ability to connect information from far back in the sequence.
- Vanishing gradients: vanishing and exploding gradients are both caused by the repeated multiplication by the recurrent weight matrix in the RNN.
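The effect of repeated multiplication can be illustrated with a deliberately simplified sketch. It assumes a scalar recurrent weight `w` standing in for the weight matrix and ignores the derivative of the activation function, so it shows only the trend, not a real RNN gradient. The function name `backprop_factor` is hypothetical.

```python
def backprop_factor(w, steps):
    """Multiply the same recurrent factor w once per time step.

    Simplifying assumption: w is a scalar stand-in for the recurrent
    weight matrix, and the activation derivative is ignored.
    """
    factor = 1.0
    for _ in range(steps):
        factor *= w
    return factor

vanishing = backprop_factor(0.5, 50)  # |w| < 1: the factor shrinks toward zero
exploding = backprop_factor(1.5, 50)  # |w| > 1: the factor grows very large
```

With 50 steps, a factor below 1 leaves almost no signal from early time steps, while a factor above 1 blows up. This is the scalar analogue of the two gradient problems listed above.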
LSTM can handle the long-term dependencies that RNNs struggle with because it introduces a gate mechanism that controls which features flow through and which are dropped.
2. LSTM
(1) Definition and characteristics
LSTM adds memory cells and remembers selectively.
- Three gates: forget gate, input gate, output gate
- Two states: $c_t$, $h_t$
(2) Forward propagation
Selectively retain historical memory and absorb new knowledge.
- Forget gate $f_t$:
① $f_t=\sigma(W_{xf}x_t+W_{hf}h_{t-1}+b_f)$
② Interpretation: $f_t$ uses the sigmoid function to choose how much of the historical information $c_{t-1}$ to remember (or forget).
As an analogy, the brain has limited capacity. When new information arrives, some weak historical memories have to be selectively forgotten.
- Input gate $i_t$:
① $i_t=\sigma(W_{xi}x_t+W_{hi}h_{t-1}+b_i)$
Interpretation: $i_t$ uses the sigmoid function to selectively learn the new information $g_t$.
② $g_t=\tanh(W_{xg}x_t+W_{hg}h_{t-1}+b_g)$
Not all new input information is useful; only the relevant part needs to be remembered.
- Historical memory (cell state) $c_t$:
① $c_t=f_t \odot c_{t-1}+i_t \odot g_t$
Interpretation: the new memory consists of the previous memory and the newly learned information, where $f_t$ and $i_t$ filter the historical memory and the new information respectively.
Selectively combining historical memory with new information forms the new memory.
- Output gate $o_t$:
① $o_t=\sigma(W_{xo}x_t+W_{ho}h_{t-1}+b_o)$
Interpretation: $o_t$ uses the sigmoid function to selectively use the memory $\tanh(c_t)$.
② $m_t=\tanh(c_t)$
Interpretation: $c_t$ is passed through $\tanh$ before the historical memory is used.
③ $h_t=o_t \odot m_t$
The resulting $h_t$ is output and is also used in the next time step $t+1$.
- Output $y_t$:
① $y_t=W_{yh}h_t+b_y$
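The forward-propagation equations above can be sketched as a single time step in plain Python. This is a minimal illustration under stated assumptions, not a reference implementation. The helper names (`sigmoid`, `affine`, `gate`, `lstm_step`, `random_params`), the parameter dictionary layout, and the random toy weights are all hypothetical. Only the gate equations themselves come from the text.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W_x, x, W_h, h, b):
    """Compute W_x x + W_h h + b, with matrices stored as lists of rows."""
    return [
        sum(wx * xi for wx, xi in zip(row_x, x))
        + sum(wh * hi for wh, hi in zip(row_h, h))
        + bi
        for row_x, row_h, bi in zip(W_x, W_h, b)
    ]

def gate(p, name, x_t, h_prev, act):
    """Apply activation `act` to W_x{name} x_t + W_h{name} h_{t-1} + b_{name}."""
    z = affine(p["W_x" + name], x_t, p["W_h" + name], h_prev, p["b_" + name])
    return [act(v) for v in z]

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM time step following the equations in the text."""
    f = gate(p, "f", x_t, h_prev, sigmoid)    # forget gate f_t
    i = gate(p, "i", x_t, h_prev, sigmoid)    # input gate i_t
    g = gate(p, "g", x_t, h_prev, math.tanh)  # candidate new information g_t
    o = gate(p, "o", x_t, h_prev, sigmoid)    # output gate o_t
    c = [ft * cp + it * gt for ft, cp, it, gt in zip(f, c_prev, i, g)]  # c_t
    m = [math.tanh(ct) for ct in c]           # m_t = tanh(c_t)
    h = [ot * mt for ot, mt in zip(o, m)]     # h_t = o_t ⊙ m_t
    return h, c

def random_params(n_in, n_hidden, rng):
    """Toy initialisation (an assumption): small random weights, zero biases."""
    p = {}
    for name in "figo":
        p["W_x" + name] = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                           for _ in range(n_hidden)]
        p["W_h" + name] = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
                           for _ in range(n_hidden)]
        p["b_" + name] = [0.0] * n_hidden
    return p

# Run a short sequence: 3 input features, 2 hidden units.
params = random_params(3, 2, random.Random(0))
h, c = [0.0, 0.0], [0.0, 0.0]
for x_t in ([1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]):
    h, c = lstm_step(x_t, h, c, params)
```

Both states are carried from one step to the next: `c` holds the long-term memory and `h` the short-term memory. The output $y_t=W_{yh}h_t+b_y$ would be one more affine map applied to `h`, omitted here for brevity.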
(3) Interpretation
① The $\sigma$ gates $f_t$ and $i_t$ selectively remember the historical information $c_{t-1}$ and learn the new knowledge $g_t$:
$c_t=f_t \odot c_{t-1}+i_t \odot g_t$
② The $\sigma$ gate $o_t$ filters the historical memory $c_t$ to produce the short-term memory $h_t$:
$h_t=o_t \odot m_t$
The forward propagation process:
LSTM realizes long- and short-term memory through three gates and two states. First, the forget gate $f_t$ chooses which historical information $c_{t-1}$ to remember. Then the input gate $i_t$ selectively learns the new information $g_t$. Adding the filtered old memory and the filtered new information gives the new historical memory $c_t$. Finally, the output gate $o_t$ selectively reads the historical memory to obtain the short-term memory $h_t$, which is passed to the output layer to obtain the output value $y_t$.