PyTorch learning notes (6): Recurrent neural networks, RNN & LSTM
2022-07-30 13:25:00 【狸狸Arina】
Contents
1. Representing time series
1.1 word embedding
- PyTorch only supports numeric types, not strings, so a string must first be represented numerically; this mapping is called a representation, or word embedding;

1.2 One-hot encoding
- Each word is encoded as a vector whose length equals the vocabulary size, with a 1 at that word's index and 0 everywhere else;
1.3 word2vec
- One-hot encoding is sparse;
- Word embeddings take the similarity between words into account; common methods are word2vec and GloVe (a minimal lookup sketch follows this list);
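As a minimal PyTorch sketch of an embedding lookup (the vocabulary size 10 and dimension 4 are arbitrary toy values, and the vectors here are randomly initialized rather than trained with word2vec/GloVe):

import torch
import torch.nn as nn

embed = nn.Embedding(10, 4)   # toy vocabulary of 10 words, 4-dim vectors
idx = torch.tensor([1, 5])    # indices of two words
vectors = embed(idx)          # look up their embedding vectors
print(vectors.shape)          # torch.Size([2, 4])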

1.4 Batch representation of sequences
- [word num, b, word vec]: organized by time step;
- [b, word num, word vec]: organized by sentence (see the sketch below for converting between the two);
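A small sketch (shapes chosen arbitrarily) showing that the two layouts differ only by a transpose of the first two dimensions:

import torch

x = torch.randn(5, 3, 100)   # [word num=5, b=3, word vec=100], time-step first
x_b = x.transpose(0, 1)      # [b=3, word num=5, word vec=100], sentence first
print(x_b.shape)             # torch.Size([3, 5, 100])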

2. Recurrent neural networks
2.1 The form of an RNN
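In equation form (this is the update rule documented for PyTorch's nn.RNN), the hidden state at each step is computed from the current input and the previous hidden state:

h_t = \tanh(W_{ih} x_t + b_{ih} + W_{hh} h_{t-1} + b_{hh})

The same weights W_{ih} and W_{hh} are shared across all time steps.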


2.2 RNN Layer

2.2.1 nn.RNN
- nn.RNN(input_size, hidden_size) builds a complete RNN layer; weight_ih_l0 has shape [hidden_size, input_size] and weight_hh_l0 has shape [hidden_size, hidden_size], which matches the printout below:

import torch
import torch.nn as nn

rnn = nn.RNN(50, 10)            # input_size=50, hidden_size=10, one layer
print(rnn._parameters.keys())
print(rnn.weight_hh_l0.shape)   # [hidden_size, hidden_size]
print(rnn.weight_ih_l0.shape)   # [hidden_size, input_size]
print(rnn.bias_hh_l0.shape)     # [hidden_size]
print(rnn.bias_ih_l0.shape)     # [hidden_size]
'''
odict_keys(['weight_ih_l0', 'weight_hh_l0', 'bias_ih_l0', 'bias_hh_l0'])
torch.Size([10, 10])
torch.Size([10, 50])
torch.Size([10])
torch.Size([10])
'''
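For completeness, a minimal sketch of a forward pass through this layer (the sequence length 5 and batch size 3 are arbitrary): by default nn.RNN consumes input of shape [seq_len, batch, input_size] and returns the per-step outputs plus the final hidden state.

import torch
import torch.nn as nn

rnn = nn.RNN(50, 10)          # input_size=50, hidden_size=10
x = torch.randn(5, 3, 50)     # [seq_len=5, batch=3, input_size=50]
h0 = torch.zeros(1, 3, 10)    # [num_layers=1, batch=3, hidden_size=10]
out, h = rnn(x, h0)
print(out.shape)              # torch.Size([5, 3, 10]): output at every time step
print(h.shape)                # torch.Size([1, 3, 10]): hidden state of the last step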
2.2.2 nn.RNNCell
- nn.RNNCell performs one layer's computation for a single time step; it is functionally the same as nn.RNN, except that it neither stacks layers nor collects the outputs of all time steps for you;


import torch
import torch.nn as nn

cell1 = nn.RNNCell(100, 50)   # first layer: input_size=100, hidden_size=50
cell2 = nn.RNNCell(50, 30)    # second layer: input_size=50, hidden_size=30
input = torch.randn(3, 3, 100)   # [seq_len=3, batch=3, input_size=100]
ht1 = torch.zeros(3, 50)
ht2 = torch.zeros(3, 30)
for x_cell in input:             # step through the sequence manually
    ht1 = cell1(x_cell, ht1)
    ht2 = cell2(ht1, ht2)        # feed layer 1's hidden state into layer 2
print(ht1.shape)
print(ht2.shape)
'''
torch.Size([3, 50])
torch.Size([3, 30])
'''
2.3 Time series prediction
import torch
import torch.nn as nn
import numpy as np
from matplotlib import pyplot as plt

def generate_data():
    num_time_steps = 60
    # generate a sample that starts at a random time point
    start = np.random.randint(3, size=1)[0]   # randomly choose a starting point
    time_steps = np.linspace(start, start + 10, num_time_steps)
    data = np.sin(time_steps)
    data = data.reshape(num_time_steps, 1)    # feature dimension is 1
    # x is the sequence, y is the same sequence shifted by one step; add a batch dim of 1
    x = torch.tensor(data[:-1]).float().view(1, num_time_steps - 1, 1)
    y = torch.tensor(data[1:]).float().view(1, num_time_steps - 1, 1)
    return x, y, time_steps

class Net(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers, output_size):
        super().__init__()
        self.hidden_size = hidden_size
        self.output_size = output_size
        self.rnn = nn.RNN(input_size, hidden_size, num_layers, batch_first=True)
        self.linear = nn.Linear(hidden_size, output_size)

    def forward(self, x, hidden_pre):
        b = x.size(0)
        # out: [b, seq, hidden_size]; hidden_pre: [num_layers, b, hidden_size]
        out, hidden_pre = self.rnn(x, hidden_pre)
        out = out.view(-1, self.hidden_size)      # [b*seq, hidden_size]
        out = self.linear(out)                    # [b*seq, 1]
        out = out.view(b, -1, self.output_size)   # [b, seq, 1]
        return out, hidden_pre

if __name__ == '__main__':
    batch_size = 1
    num_layers = 2
    hidden_size = 10
    model = Net(1, hidden_size, num_layers, 1)
    criterion = nn.MSELoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=0.5e-2)
    hidden_pre = torch.zeros(num_layers, batch_size, hidden_size)
    for iter in range(6000):
        x, y, _ = generate_data()
        out, hidden_pre = model(x, hidden_pre)
        hidden_pre = hidden_pre.detach()   # cut the graph so backprop stops at this iteration
        loss = criterion(out, y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        if iter % 100 == 0:
            print('iteration:{} loss:{}'.format(iter, loss.item()))

    # predict step by step, feeding each prediction back in as the next input
    x, y, time_steps = generate_data()
    input = x[:, 0, :].unsqueeze(1)   # first element of every batch's sequence: [b, 1, word_dim]
    h = torch.zeros(num_layers, batch_size, hidden_size)
    predictions = []
    for _ in range(x.shape[1]):       # walk along the sequence
        pred, h = model(input, h)
        input = pred
        predictions.append(pred.detach().numpy().ravel()[0])

    figure = plt.figure(figsize=(20, 20), dpi=80)
    plt.scatter(time_steps[1:], predictions, label='pred')
    plt.scatter(time_steps[1:], y.view(-1).numpy(), label='sin')
    plt.legend()
    plt.show()
'''
iteration:0 loss:0.6097927689552307
iteration:100 loss:0.007763191591948271
iteration:200 loss:0.0011507287854328752
iteration:300 loss:0.00087575992802158
iteration:400 loss:0.0005032330518588424
iteration:500 loss:0.0004986028652638197
iteration:600 loss:0.0009817895479500294
iteration:700 loss:0.00040510689723305404
iteration:800 loss:0.0010686117457225919
'''
2.4 Difficulties in training RNNs
- On long sequences, gradients can explode or vanish (a one-line derivation of why follows);
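This falls out of the standard backpropagation-through-time analysis (stated here for context; a sketch, not from the original post). With h_i = \tanh(W_{ih} x_i + W_{hh} h_{i-1}), the gradient between distant hidden states is a product of per-step Jacobians:

\frac{\partial h_t}{\partial h_k} = \prod_{i=k+1}^{t} \mathrm{diag}\big(\tanh'(\cdot)\big)\, W_{hh} \;\approx\; W_{hh}^{\,t-k}

so it tends to explode when the largest singular value of W_{hh} is above 1 and to vanish when it is below 1.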

2.4.1 Gradient explosion
- Clipping: when the gradient norm exceeds a threshold, rescale the gradient (divide by its own norm, then multiply by the threshold) so its norm is pulled back down to the threshold; see the sketch below;
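A minimal sketch using torch.nn.utils.clip_grad_norm_ (the toy linear model and the threshold of 10 are arbitrary choices for illustration):

import torch
import torch.nn as nn

model = nn.Linear(10, 1)   # toy model, just to have some gradients
x, y = torch.randn(4, 10), torch.randn(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)

loss = nn.MSELoss()(model(x), y)
optimizer.zero_grad()
loss.backward()
# rescale all gradients in place so that their global norm is at most 10
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=10)
optimizer.step()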


2.4.2 Vanishing gradients

- LSTM mitigates vanishing gradients;
2.5 LSTM
- A plain RNN only retains context near the current word; words further back get forgotten (short-term memory);
- An LSTM can remember very long sequences, hence the name long short-term memory;

- RNN unrolled form

- LSTM unrolled form (the cell equations are given after this list)


- When computing the gradient, the W^k factor no longer appears; the gradient is a sum of several terms, and it is very unlikely that all of them are simultaneously tiny or huge, so the gradient rarely collapses to 0 and vanishing gradients are avoided;
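For reference, the standard LSTM cell equations (\sigma is the sigmoid, \odot the elementwise product); note that the cell-state update is additive, which is what makes the gradient a sum of terms as described above:

f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)          (forget gate)
i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)          (input gate)
\tilde{c}_t = \tanh(W_c \cdot [h_{t-1}, x_t] + b_c)   (candidate cell state)
c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t
o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)          (output gate)
h_t = o_t \odot \tanh(c_t)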

2.6 Using LSTM
2.6.1 nn.LSTM
- nn.LSTM is used like nn.RNN, except that the recurrent state is the pair (h, c), each of shape [num_layers, batch, hidden_size]:

import torch
import torch.nn as nn

lstm = nn.LSTM(100, 20, 4)    # input_size=100, hidden_size=20, num_layers=4
c = torch.zeros(4, 30, 20)    # [num_layers, batch, hidden_size]
h = torch.zeros(4, 30, 20)    # [num_layers, batch, hidden_size]
x = torch.rand(80, 30, 100)   # [seq_len=80, batch=30, input_size=100]
out, (h, c) = lstm(x, (h, c))
print(out.shape)              # last layer's output at every time step
print(h.shape)
print(c.shape)
'''
torch.Size([80, 30, 20])
torch.Size([4, 30, 20])
torch.Size([4, 30, 20])
'''
2.6.2 nn.LSTMCell
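A minimal sketch mirroring the nn.RNNCell example above: two stacked nn.LSTMCells driven manually over the time steps, each carrying an (h, c) pair instead of a single hidden state (shapes chosen to match the earlier example):

import torch
import torch.nn as nn

cell1 = nn.LSTMCell(100, 50)   # first layer: input_size=100, hidden_size=50
cell2 = nn.LSTMCell(50, 30)    # second layer: input_size=50, hidden_size=30
x = torch.randn(3, 3, 100)     # [seq_len=3, batch=3, input_size=100]
h1, c1 = torch.zeros(3, 50), torch.zeros(3, 50)
h2, c2 = torch.zeros(3, 30), torch.zeros(3, 30)
for x_t in x:                  # step through the sequence manually
    h1, c1 = cell1(x_t, (h1, c1))
    h2, c2 = cell2(h1, (h2, c2))
print(h1.shape)                # torch.Size([3, 50])
print(h2.shape)                # torch.Size([3, 30])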



