2022-07-05 03:33:00 【冠long馨】
1. RNN
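- Long-term dependency problem: as the number of time steps grows, an RNN loses the ability to learn connections to information that far back.
- Vanishing gradients: both vanishing and exploding gradients are caused by the RNN's weight matrix being multiplied by itself repeatedly across time steps.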
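The effect of repeated multiplication can be seen in a minimal sketch. It assumes a purely linear recurrence (activation derivatives are ignored) and a hypothetical weight matrix built as `scale` times a random orthogonal matrix, so each multiplication rescales the vector norm by exactly `scale`. The function name and all parameters are illustrative and not from the original notes.

```python
import numpy as np

def repeated_product_norm(scale, steps=50, size=4, seed=0):
    """Norm of a vector after `steps` multiplications by W^T, with W = scale * Q.

    Q is a random orthogonal matrix (an assumption made for illustration),
    so every multiplication changes the norm by a factor of `scale`.
    """
    rng = np.random.default_rng(seed)
    q, _ = np.linalg.qr(rng.standard_normal((size, size)))
    w = scale * q
    v = np.ones(size)  # stands in for a gradient flowing back through time
    for _ in range(steps):
        v = w.T @ v
    return np.linalg.norm(v)

shrinking = repeated_product_norm(0.9)  # scale < 1: the norm decays toward 0
growing = repeated_product_norm(1.1)    # scale > 1: the norm blows up
print(shrinking, growing)
```

With a scale below 1 the norm decays geometrically over the time steps (vanishing), and with a scale above 1 it grows geometrically (exploding).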
2. LSTM
- Three gates: forget gate, input gate, output gate
- Two states: the cell state $c_t$ and the hidden state $h_t$
- Forget gate $f_t$:
① $f_t=\sigma(W_{xf}x_t+W_{hf}h_{t-1}+b_f)$
② Intuition: through the sigmoid function, $f_t$ chooses how much of the history $c_{t-1}$ to remember (or forget).
- Input gate $i_t$:
① $i_t=\sigma(W_{xi}x_t+W_{hi}h_{t-1}+b_i)$
Intuition: through the sigmoid, $i_t$ selectively learns the new information $g_t$.
② $g_t=\tanh(W_{xg}x_t+W_{hg}h_{t-1}+b_g)$
- History $c_t$:
① $c_t=f_t \odot c_{t-1}+i_t \odot g_t$
Intuition: the new memory combines the previous memory with the newly acquired information, where $f_t$ and $i_t$ filter the old memory and the new information respectively.
- Output gate $o_t$:
① $o_t=\sigma(W_{xo}x_t+W_{ho}h_{t-1}+b_o)$
Intuition: through the sigmoid, $o_t$ selectively uses the memory $\tanh(c_t)$.
② $m_t=\tanh(c_t)$
Intuition: $c_t$ is passed through tanh so that the stored history can be used.
③ $h_t=o_t \odot m_t$
The resulting $h_t$ is emitted as output and is also carried into the next time step $t+1$.
- Output $y_t$:
① $y_t=W_{yh}h_t+b_y$
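The element-wise gating in the $c_t$ and $h_t$ updates above can be illustrated with hand-picked values. This is an idealized sketch: real sigmoid gates lie strictly between 0 and 1 and never reach exactly 0 or 1, and all numbers below are made up for illustration.

```python
import numpy as np

c_prev = np.array([2.0, -1.0])  # previous cell state c_{t-1}
g_t = np.array([0.5, 0.8])      # candidate information g_t (a tanh output)
f_t = np.array([1.0, 0.0])      # forget gate: keep unit 0, erase unit 1
i_t = np.array([0.0, 1.0])      # input gate: ignore new info in unit 0, accept it in unit 1
o_t = np.array([1.0, 0.5])      # output gate: expose unit 0 fully, unit 1 by half

c_t = f_t * c_prev + i_t * g_t  # unit 0 keeps its old value 2.0, unit 1 is overwritten with 0.8
m_t = np.tanh(c_t)              # squash the memory into (-1, 1)
h_t = o_t * m_t                 # short-term memory passed to the output and to step t+1

print(c_t)
print(h_t)
```

Unit 0 shows pure remembering (old value kept, new information ignored), while unit 1 shows pure overwriting (old value erased, new information stored).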
① The sigmoid gates $f_t$ and $i_t$ selectively remember the history $c_{t-1}$ and learn the new information $g_t$:
$c_t=f_t \odot c_{t-1}+i_t \odot g_t$
② The sigmoid gate $o_t$ filters the long-term memory $c_t$ into the short-term memory $h_t$:
$h_t=o_t \odot m_t$
- Forward propagation:
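An LSTM implements long short-term memory with three gates and two states. First, the forget gate $f_t$ selects which parts of the history $c_{t-1}$ to keep; then the input gate $i_t$ selectively learns the new information $g_t$. The filtered old and new memories are added to give the new history $c_t$. Finally, the output gate $o_t$ selectively reads this history to produce the short-term memory $h_t$, which is fed to the output layer to obtain the output value $y_t$.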
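The forward pass described above can be sketched in NumPy as follows. This is a minimal illustration, not a reference implementation: the helper names `init_params` and `lstm_forward`, the parameter dictionary layout, the small random initialization, and the zero initial states are all assumptions. The weight names mirror the symbols in the equations ($W_{xf}$, $W_{hf}$, $b_f$, and so on).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_params(n_in, n_hidden, n_out, seed=0):
    """Small random weights for the gates f, i, o, the candidate g, and the output layer."""
    rng = np.random.default_rng(seed)
    p = {}
    for name in "figo":
        p["W_x" + name] = 0.1 * rng.standard_normal((n_hidden, n_in))
        p["W_h" + name] = 0.1 * rng.standard_normal((n_hidden, n_hidden))
        p["b_" + name] = np.zeros(n_hidden)
    p["W_yh"] = 0.1 * rng.standard_normal((n_out, n_hidden))
    p["b_y"] = np.zeros(n_out)
    return p

def lstm_forward(xs, p):
    """Run the LSTM over a sequence xs of shape (T, n_in); return outputs and final states."""
    n_hidden = p["b_f"].shape[0]
    h = np.zeros(n_hidden)  # short-term memory h_0 (assumed zero)
    c = np.zeros(n_hidden)  # long-term memory c_0 (assumed zero)
    ys = []
    for x in xs:
        f = sigmoid(p["W_xf"] @ x + p["W_hf"] @ h + p["b_f"])  # forget gate f_t
        i = sigmoid(p["W_xi"] @ x + p["W_hi"] @ h + p["b_i"])  # input gate i_t
        g = np.tanh(p["W_xg"] @ x + p["W_hg"] @ h + p["b_g"])  # new information g_t
        o = sigmoid(p["W_xo"] @ x + p["W_ho"] @ h + p["b_o"])  # output gate o_t
        c = f * c + i * g                                      # c_t
        m = np.tanh(c)                                         # m_t
        h = o * m                                              # h_t
        ys.append(p["W_yh"] @ h + p["b_y"])                    # y_t
    return np.stack(ys), h, c

params = init_params(n_in=3, n_hidden=4, n_out=2)
xs = np.ones((5, 3))  # a toy sequence of 5 time steps
ys, h_last, c_last = lstm_forward(xs, params)
print(ys.shape)
```

Both states are carried from one iteration to the next: `c` accumulates additively through the forget and input gates, while `h` is recomputed at every step from the gated cell state.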