[pytorch] LSTM neural network
2022-07-28 20:29:00 【Li Junfeng】
For processing time-series data, there are already two classic networks: the recurrent neural network (RNN) and the GRU network. There is also a third one, the LSTM, or long short-term memory neural network.
Historically, LSTM came first and GRU came later, but in terms of complexity LSTM is more complicated than GRU.
Recall that GRU has two gates (the update gate and the reset gate) and one vector $H_t$ that records historical information.
LSTM is more complicated, both in the number of gates and in the vectors that record historical information.
LSTM neural network
It has 3 gates and 2 states.
Gates
Forget gate
This is very similar to the reset gate in GRU, and its meaning is roughly the same.
$$F_t = \Theta\left(X_t\cdot W_{xf} + H_{t-1}\cdot W_{hf} + b_f\right)$$
Here $\Theta$ is the gate's activation function (the sigmoid in the standard LSTM), so every entry of $F_t$ lies between 0 and 1. $F_t$ is then applied to the memory $C_{t-1}$, so that part of the historical information is forgotten.
Input gate
It is somewhat similar to the update gate in GRU, but not exactly the same.
Because LSTM has two states, the input gate can be understood as the update gate of $C_t$.
$$I_t = \Theta\left(X_t\cdot W_{xi} + H_{t-1}\cdot W_{hi} + b_i\right)$$
It is then applied to the candidate memory $C'_t$, and the result is used to update the memory and obtain the new $C_t$.
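The memory update described so far can be written out explicitly. The block below follows the standard LSTM formulation; the parameter names $W_{xc}$, $W_{hc}$ and $b_c$ are assumed here by analogy with the gate parameters, since the article does not name them.

```latex
\begin{aligned}
C'_t &= \tanh\left(X_t\cdot W_{xc} + H_{t-1}\cdot W_{hc} + b_c\right) \\
C_t  &= F_t \odot C_{t-1} + I_t \odot C'_t
\end{aligned}
```

Here $\odot$ is the element-wise product: $F_t$ decides how much of the old memory is kept, and $I_t$ decides how much of the candidate memory is written in.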
Output gate
It is also somewhat similar to the update gate in GRU, but not exactly the same.
Because LSTM has two states, the output gate can be understood as the update gate of $H_t$.
$$O_t = \Theta\left(X_t\cdot W_{xo} + H_{t-1}\cdot W_{ho} + b_o\right)$$
It is then applied to the memory $C_t$ (the newly updated memory, not the candidate memory) to obtain the new $H_t$.
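In the standard LSTM formulation, which this article appears to follow, the action of the output gate on the memory is usually written as:

```latex
H_t = O_t \odot \tanh\left(C_t\right)
```

The $\tanh$ squashes the memory into $(-1, 1)$ before the gate scales it, so every entry of $H_t$ also stays in that range.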
States
Memory state
The whole update process shows that $C_{t-1}$ first forgets part of its information and is then merged with the part of the candidate memory (generated from $X_t$) that the input gate lets through, which gives $C_t$.
It changes relatively slowly, so it is also called long-term memory.
Hidden state
$H_t$ is the result of the current output gate (computed from $X_t$ and $H_{t-1}$) acting on the current memory $C_t$. Compared with $C_t$, $H_t$ is tied less closely to its previous value $H_{t-1}$, so $H_t$ changes faster. It is therefore also called short-term memory.
Combining the two states, long-term memory and short-term memory, gives the network its name: the long short-term memory neural network.
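To make the interplay of the three gates and two states concrete, here is a minimal sketch of a single LSTM time step written directly from the equations. It assumes that $\Theta$ is the sigmoid and that the candidate memory uses tanh, as in the standard LSTM; the helper names `lstm_step` and `init_params`, the candidate parameters `W_xc`, `W_hc`, `b_c`, and all sizes are hypothetical and chosen only for illustration.

```python
import torch


def lstm_step(X_t, H_prev, C_prev, params):
    """One LSTM time step (assumes sigmoid gates and a tanh candidate)."""
    (W_xf, W_hf, b_f,
     W_xi, W_hi, b_i,
     W_xo, W_ho, b_o,
     W_xc, W_hc, b_c) = params
    F_t = torch.sigmoid(X_t @ W_xf + H_prev @ W_hf + b_f)  # forget gate
    I_t = torch.sigmoid(X_t @ W_xi + H_prev @ W_hi + b_i)  # input gate
    O_t = torch.sigmoid(X_t @ W_xo + H_prev @ W_ho + b_o)  # output gate
    C_cand = torch.tanh(X_t @ W_xc + H_prev @ W_hc + b_c)  # candidate memory
    C_t = F_t * C_prev + I_t * C_cand   # long-term memory update
    H_t = O_t * torch.tanh(C_t)         # short-term memory (hidden state)
    return H_t, C_t


def init_params(input_size, hidden_size):
    """Hypothetical initializer: one (W_x, W_h, b) triple per gate and candidate."""
    params = []
    for _ in range(4):
        params.append(torch.randn(input_size, hidden_size) * 0.01)
        params.append(torch.randn(hidden_size, hidden_size) * 0.01)
        params.append(torch.zeros(hidden_size))
    return params


torch.manual_seed(0)
batch_size, input_size, hidden_size = 2, 5, 3
params = init_params(input_size, hidden_size)
X_t = torch.randn(batch_size, input_size)
H_prev = torch.zeros(batch_size, hidden_size)
C_prev = torch.zeros(batch_size, hidden_size)

H_t, C_t = lstm_step(X_t, H_prev, C_prev, params)
print(H_t.shape, C_t.shape)
```

Both returned tensors have shape `(batch_size, hidden_size)`. Feeding `H_t` and `C_t` back in as `H_prev` and `C_prev` for the next input gives the recurrence over a sequence.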
Code implementation
PyTorch also provides an LSTM layer, nn.LSTM, which is very convenient to call.
In the class below, the initial state is defined explicitly as a 2-tuple: the hidden state and the memory state.
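import torch
from torch import nn
from torch.nn import functional as F


class LSMT_Net(nn.Module):
    def __init__(self, vocab_size, hidden_size, **kwargs):
        super(LSMT_Net, self).__init__(**kwargs)
        self.vocab_size = vocab_size
        self.hidden_size = hidden_size
        # Two stacked LSTM layers; each input is a one-hot vector of length vocab_size
        self.LSMTlayer = nn.LSTM(vocab_size, hidden_size, num_layers=2)
        # Map every hidden state back to scores over the vocabulary
        self.L1 = nn.Linear(hidden_size, vocab_size)

    def forward(self, inputs, state):
        # inputs: (batch_size, num_steps) token indices.
        # After the transpose and one-hot encoding: (num_steps, batch_size, vocab_size),
        # where each vector has exactly one 1 and the rest are 0.
        X = F.one_hot(inputs.T.long(), self.vocab_size)
        X = X.to(torch.float32)
        Y, state = self.LSMTlayer(X, state)
        # Flatten to (num_steps * batch_size, hidden_size) for the linear layer
        Y = Y.reshape((-1, Y.shape[-1]))
        Y = self.L1(Y)
        return Y, state

    def begin_state(self, batch_size):
        # Initial (H, C), each of shape (num_layers, batch_size, hidden_size)
        return (torch.zeros(self.LSMTlayer.num_layers, batch_size, self.hidden_size),
                torch.zeros(self.LSMTlayer.num_layers, batch_size, self.hidden_size))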
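As a quick check of the shapes involved, the sketch below calls nn.LSTM directly with a zero-initialized state tuple, in the same way as the class above. The concrete sizes (`vocab_size = 28`, `hidden_size = 16`, and so on) are arbitrary values assumed for illustration only.

```python
import torch
from torch import nn

# Arbitrary illustrative sizes (not from the article)
vocab_size, hidden_size, num_layers = 28, 16, 2
num_steps, batch_size = 5, 4

lstm = nn.LSTM(vocab_size, hidden_size, num_layers=num_layers)

# Default layout (batch_first=False): (num_steps, batch_size, input_size)
X = torch.zeros(num_steps, batch_size, vocab_size)

# The state is a 2-tuple (H, C); each part has shape
# (num_layers, batch_size, hidden_size)
state = (torch.zeros(num_layers, batch_size, hidden_size),
         torch.zeros(num_layers, batch_size, hidden_size))

Y, (H_n, C_n) = lstm(X, state)
print(Y.shape)               # output of the last layer at every time step
print(H_n.shape, C_n.shape)  # final hidden and memory state of every layer
```

`Y` has shape `(num_steps, batch_size, hidden_size)`, which is why the class above reshapes it to two dimensions before the linear layer. The returned state has the same shapes as the initial one, so it can be passed straight into the next call.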