[Quietly outpace your peers: PyTorch in 20 days] - [Day 4] - [Example of the time series data modeling workflow]
2022-06-30 05:43:00 【aJupyter】
Systematic tutorial: master PyTorch in 20 days
Recently I have been doing a daily check-in challenge with Brother Zhong and Huige: 20 days of PyTorch. This is day 4. Likes, favorites, and shares are welcome.
The COVID-19 outbreak that began in 2020 has affected many aspects of people's lives around the world.
For some the impact was on income, for some on relationships, for some on mental health, and for some on body weight.
This article uses China's epidemic data from before March 2020 to build a time series RNN model and predict when China's COVID-19 outbreak will end.
import os
import datetime
import torchkeras

# Print a separator line followed by the current time
def printbar():
    nowtime = datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')
    print("\n"+"=========="*8 + "%s"%nowtime)

# On macOS, running PyTorch and matplotlib together in Jupyter requires setting this environment variable
os.environ["KMP_DUPLICATE_LIB_OK"]="TRUE"
1. Prepare the data
The dataset used in this article comes from tushare.
Dataset overview:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
%config InlineBackend.figure_format = 'svg'
df = pd.read_csv("/home/mw/input/data6936/eat_pytorch_data/data/covid-19.csv",sep = "\t")
df.plot(x = "date",y = ["confirmed_num","cured_num","dead_num"],figsize=(10,6))
plt.xticks(rotation=60)  # rotate the x-axis labels by 60°

dfdata = df.set_index("date")
dfdiff = dfdata.diff(periods=1).dropna()  # first-order difference; the only NaN row is the first one, which is dropped
dfdiff = dfdiff.reset_index("date")  # turn the "date" index back into a regular column
dfdiff.plot(x = "date",y = ["confirmed_num","cured_num","dead_num"],figsize=(10,6))
plt.xticks(rotation=60)
dfdiff = dfdiff.drop("date",axis = 1).astype("float32")  # drop the date column and convert to float32
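
To make the differencing step concrete, here is a minimal sketch on a made-up cumulative series. The numbers and the name `toy` are illustrative assumptions, not the real dataset. `diff(periods=1)` turns cumulative totals into daily new counts, and the first row becomes NaN because it has no predecessor.

```python
import pandas as pd

# Hypothetical cumulative counts for four days (not real data)
toy = pd.DataFrame({
    "date": ["2020-01-24", "2020-01-25", "2020-01-26", "2020-01-27"],
    "confirmed_num": [10, 15, 27, 30],
})

# First-order difference: each row minus the previous row
toy_diff = toy.set_index("date").diff(periods=1)
print(toy_diff)  # the first row is NaN

# Drop the NaN row, restore "date" as a column, and read the daily values
toy_diff = toy_diff.dropna().reset_index()
daily_new = toy_diff["confirmed_num"].astype("float32").tolist()
print(daily_new)
```

The model in this article is trained on these daily increments rather than on the cumulative totals.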

Tip: how set_index works
df = pd.DataFrame({
    'month': [1, 4, 7, 10],
    'year': [2012, 2014, 2013, 2014],
    'sale': [55, 40, 84, 31]})
# Set a single column as the index
df.set_index('month')
'''
       year  sale
month
1      2012    55
4      2014    40
7      2013    84
10     2014    31
'''

Next we subclass torch.utils.data.Dataset to implement a custom time series dataset.
torch.utils.data.Dataset is an abstract class. To load custom data, you only need to subclass it and override two methods:
__len__: implements len(dataset), returning the size of the whole dataset.
__getitem__: fetches the sample at a given index, so that dataset[i] returns the i-th sample in the dataset.
If these two methods are not overridden, using them raises an error.
import torch
from torch import nn
from torch.utils.data import Dataset,DataLoader,TensorDataset

# Use the previous 8 days of data as the input window to predict the current day's data
WINDOW_SIZE = 8

class Covid19Dataset(Dataset):
    def __len__(self):
        return len(dfdiff) - WINDOW_SIZE

    def __getitem__(self,i):
        # .loc slicing is inclusive at both ends, so this selects WINDOW_SIZE rows
        x = dfdiff.loc[i:i+WINDOW_SIZE-1,:]
        feature = torch.tensor(x.values)
        y = dfdiff.loc[i+WINDOW_SIZE,:]
        label = torch.tensor(y.values)
        return (feature,label)

ds_train = Covid19Dataset()

# The dataset is small, so all training data can go into a single batch to improve performance
dl_train = DataLoader(ds_train,batch_size = 38)
Data processing summary
- For time series data, earlier observations are used to predict later ones.
- Take the first-order difference of the data, drop the resulting NaN row, and build the dataset (each sample uses the previous eight days as input and the following day as the label).
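
The windowing summarized above can be sketched with synthetic data. This is a minimal illustration, not the article's dataset class: `ToyWindowDataset` and the synthetic `series` tensor are hypothetical names introduced here, and the data is stored in a tensor instead of the `dfdiff` DataFrame.

```python
import torch
from torch.utils.data import Dataset, DataLoader

WINDOW = 8  # same window length as WINDOW_SIZE in the article


class ToyWindowDataset(Dataset):
    """Hypothetical stand-in for Covid19Dataset, backed by a tensor."""

    def __init__(self, series: torch.Tensor, window: int):
        self.series = series  # shape: (num_days, num_features)
        self.window = window

    def __len__(self):
        # Each sample needs `window` input days plus one label day
        return len(self.series) - self.window

    def __getitem__(self, i):
        feature = self.series[i:i + self.window]  # days i .. i+window-1
        label = self.series[i + self.window]      # the following day
        return feature, label


# 20 synthetic days with 3 features each
series = torch.arange(60, dtype=torch.float32).reshape(20, 3)
ds = ToyWindowDataset(series, WINDOW)

# One batch holding the whole dataset, as in the article
features, labels = next(iter(DataLoader(ds, batch_size=len(ds))))
print(len(ds), features.shape, labels.shape)
```

With 20 days and a window of 8, the dataset holds 12 samples. Each feature has shape (8, 3) and each label has shape (3,), which is the (batch, window, features) layout that an LSTM with batch_first=True expects.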
2. Define the model
There are usually three ways to build a model in PyTorch:
- Use nn.Sequential to stack layers in order
- Subclass nn.Module to build a custom model
- Subclass nn.Module and use model containers to encapsulate parts of the model
Here we choose the second way.
Because the training loop below is class-based, we further wrap the model in the Model class from torchkeras, which provides a high-level model interface similar to Keras.
The Model class itself inherits from nn.Module.
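
As a rough comparison of the first two approaches, here is a minimal sketch. The layer sizes and the name `TinyNet` are arbitrary assumptions chosen for illustration and are unrelated to the article's network.

```python
import torch
from torch import nn

# Option 1: nn.Sequential stacks layers in order
seq_model = nn.Sequential(
    nn.Linear(3, 4),
    nn.ReLU(),
    nn.Linear(4, 3),
)


# Option 2: subclass nn.Module and write forward() by hand
class TinyNet(nn.Module):  # hypothetical name, for illustration only
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(3, 4)
        self.fc2 = nn.Linear(4, 3)

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))


x = torch.randn(2, 3)
out_seq = seq_model(x)
out_mod = TinyNet()(x)
print(out_seq.shape, out_mod.shape)
```

Subclassing is the natural fit when forward() needs more than a straight chain of layers, for example when an intermediate result has to be combined with the original input.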
import torch
from torch import nn
import importlib
import torchkeras

torch.random.seed()

class Block(nn.Module):
    def __init__(self):
        super(Block,self).__init__()

    def forward(self,x,x_input):
        # Scale the last day's values by (1 + x) and clamp the result at 0
        x_out = torch.max((1+x)*x_input[:,-1,:],torch.tensor(0.0))
        return x_out

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # 5-layer LSTM (num_layers = 5)
        self.lstm = nn.LSTM(input_size = 3,hidden_size = 3,num_layers = 5,batch_first = True)
        self.linear = nn.Linear(3,3)
        self.block = Block()

    def forward(self,x_input):
        x = self.lstm(x_input)[0][:,-1,:]  # keep only the last time step of the LSTM output
        x = self.linear(x)
        y = self.block(x,x_input)
        return y

net = Net()
model = torchkeras.Model(net)  # important detail: wrap the network in torchkeras.Model
print(model)
model.summary(input_shape=(8,3),input_dtype = torch.FloatTensor)
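
To see what flows through Net, the sketch below traces tensor shapes through an LSTM configured like the one above and then evaluates the Block formula on hand-picked numbers. The batch size of 4 and the values of `rate` and `last_day` are assumptions chosen only for illustration. Reading x as a relative change on the last day's values is an interpretation of the formula, not something the article states.

```python
import torch
from torch import nn

# Same LSTM configuration as in Net; the input batch is synthetic
lstm = nn.LSTM(input_size=3, hidden_size=3, num_layers=5, batch_first=True)
x_input = torch.rand(4, 8, 3)  # (batch, window, features)

output, (h_n, c_n) = lstm(x_input)
print(output.shape)  # (batch, window, hidden_size)
print(h_n.shape)     # (num_layers, batch, hidden_size)

# Net keeps only the last time step of the top layer's output
last_step = output[:, -1, :]
print(last_step.shape)  # (batch, hidden_size)

# The Block formula on hand-picked numbers:
# (1 + rate) scales the last day's values, and negatives are clamped to 0
rate = torch.tensor([[0.1, -0.5, -2.0]])
last_day = torch.tensor([[100.0, 50.0, 10.0]])
pred = torch.max((1 + rate) * last_day, torch.tensor(0.0))
print(pred)
```

In this example a rate of 0.1 gives roughly 110, a rate of -0.5 gives 25, and a rate of -2.0 would give -10 but is clamped to 0, so predicted daily counts can never be negative.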

3. Train the model
Training a PyTorch model usually requires the user to write a custom training loop, and the code style of that loop varies from person to person.
There are 3 typical styles: script-form, function-form, and class-form training loops.
Here we use the class form.
Modeled on Keras, a high-level model interface called Model is defined. It implements the fit, validate, predict, and summary methods, which amounts to a user-defined high-level API.
Note: recurrent neural networks are hard to tune. You need to try several different learning rates over multiple runs to get good results.
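
For comparison with the class form used in this article, a function-form training loop might look like the sketch below. The function name `train`, the synthetic regression data, and the use of nn.MSELoss are assumptions made for illustration; this is not how torchkeras implements fit.

```python
import torch
from torch import nn


def train(net, features, labels, epochs=50, lr=0.1):
    """Hypothetical function-form training loop; returns the loss per epoch."""
    optimizer = torch.optim.Adagrad(net.parameters(), lr=lr)
    loss_func = nn.MSELoss()
    history = []
    for _ in range(epochs):
        optimizer.zero_grad()                     # clear old gradients
        loss = loss_func(net(features), labels)   # forward pass
        loss.backward()                           # backpropagate
        optimizer.step()                          # update parameters
        history.append(loss.item())
    return history


# Synthetic regression problem: y = 2x
torch.manual_seed(0)
features = torch.linspace(-1, 1, 32).unsqueeze(1)
labels = 2 * features
history = train(nn.Linear(1, 1), features, labels)
print(history[0], history[-1])
```

Rerunning `train` with different `lr` values and comparing the returned histories is one simple way to do the learning-rate trials mentioned in the note above.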
def mspe(y_pred,y_true):
    err_percent = (y_true - y_pred)**2/(torch.max(y_true**2,torch.tensor(1e-7)))
    return torch.mean(err_percent)

model.compile(loss_func = mspe,optimizer = torch.optim.Adagrad(model.parameters(),lr = 0.1))
dfhistory = model.fit(100,dl_train,log_step_freq=10)
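
The mspe loss above is a mean squared percentage error: the squared error divided by the squared true value, averaged over all entries. The following minimal sketch evaluates it on hand-picked tensors; the numbers are assumptions chosen so the result is easy to check by hand.

```python
import torch


def mspe(y_pred, y_true):
    # Squared error relative to the squared true value; the 1e-7 floor
    # avoids division by zero when a true value is 0
    err_percent = (y_true - y_pred) ** 2 / torch.max(y_true ** 2, torch.tensor(1e-7))
    return torch.mean(err_percent)


# Both predictions are off by 10%, so each term is 0.1 ** 2
y_true = torch.tensor([[100.0, 50.0]])
y_pred = torch.tensor([[90.0, 55.0]])
loss = mspe(y_pred, y_true)
print(loss)

# A true value of 0 makes the term very large unless the prediction is exact
loss_zero = mspe(torch.tensor([[1.0]]), torch.tensor([[0.0]]))
print(loss_zero)
```

Because the error is relative, days with small true counts weigh as much as days with large counts, and days with a true count of 0 can dominate the loss.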
4. Evaluate the model
Evaluating a model normally requires a validation set or a test set. Because this example has so little data, we only visualize how the loss evolves on the training set.
%matplotlib inline
%config InlineBackend.figure_format = 'svg'
import matplotlib.pyplot as plt

def plot_metric(dfhistory, metric):
    train_metrics = dfhistory[metric]
    epochs = range(1, len(train_metrics) + 1)
    plt.plot(epochs, train_metrics, 'bo--')
    plt.title('Training '+ metric)
    plt.xlabel("Epochs")
    plt.ylabel(metric)
    plt.legend(["train_"+metric])
    plt.show()

plot_metric(dfhistory,"loss")

5. Use the model
Here we use the model to predict when the epidemic ends, that is, when the number of new confirmed cases reaches 0.
# Use dfresult to hold the existing data plus the epidemic data predicted afterwards
dfresult = dfdiff[["confirmed_num","cured_num","dead_num"]].copy()
dfresult.tail()

# Predict the daily new counts for the next 500 days and append the results to dfresult
for i in range(500):
    arr_input = torch.unsqueeze(torch.from_numpy(dfresult.values[-38:,:]),dim=0)
    arr_predict = model.forward(arr_input)
    dfpredict = pd.DataFrame(torch.floor(arr_predict).data.numpy(),
                             columns = dfresult.columns)
    # DataFrame.append was removed in pandas 2.0, so use pd.concat instead
    dfresult = pd.concat([dfresult,dfpredict],ignore_index=True)
Tips:
torch.unsqueeze(torch.from_numpy(dfresult.values[-38:,:]),axis=0) adds a new dimension at position 0 (the batch dimension).
torch.floor rounds down to the nearest integer.
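
A minimal sketch of these two calls on tiny tensors follows. The array `window` is a made-up stand-in for the last rows of dfresult.

```python
import numpy as np
import torch

# Hypothetical window of 2 days with 3 features each
window = np.arange(6, dtype=np.float32).reshape(2, 3)
t = torch.from_numpy(window)

# Add a batch dimension at position 0: (2, 3) -> (1, 2, 3)
batched = torch.unsqueeze(t, dim=0)
print(t.shape, batched.shape)

# floor rounds toward negative infinity, so -0.2 becomes -1
floored = torch.floor(torch.tensor([1.7, 0.2, -0.2]))
print(floored)
```

In this article the model output is clamped at 0 before torch.floor is applied, so the negative case does not occur in the predictions.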
dfresult.query("confirmed_num==0").head()
# New confirmed cases drop to 0 starting on day 50. Day 45 corresponds to March 10, so 5 days later, around March 15, new confirmed cases are predicted to reach 0.
# Note: this forecast is on the optimistic side.

dfresult.query("cured_num==0").head()
# New recoveries drop to 0 starting on day 186, that is, roughly 1 year later.
# Note: this forecast is on the pessimistic side, and it is also inconsistent: summing the daily new recoveries would exceed the cumulative number of confirmed cases.
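
The rolling forecast used in this section can be sketched without a trained model. In the example below, `toy_predict` is a hypothetical stand-in that simply halves the last day's values, and the starting numbers are made up. The loop structure (predict one day, floor it, append it, repeat) follows the article's approach.

```python
import pandas as pd
import torch


def toy_predict(window: torch.Tensor) -> torch.Tensor:
    """Hypothetical stand-in for the trained model: halve the last day's values."""
    return window[:, -1, :] * 0.5


# Made-up daily new counts for two days
result = pd.DataFrame(
    {"confirmed_num": [80.0, 40.0], "cured_num": [10.0, 20.0]},
    dtype="float32",
)

for _ in range(10):
    # Last 2 days as a batch of one: (1, days, features)
    window = torch.tensor(result.values[-2:, :]).unsqueeze(0)
    pred = torch.floor(toy_predict(window))
    row = pd.DataFrame(pred.numpy(), columns=result.columns)
    result = pd.concat([result, row], ignore_index=True)

# Index of the first day on which new confirmed cases reach 0
first_zero = result.query("confirmed_num == 0").index[0]
print(first_zero)
```

Each predicted row is fed back in as input for the next step, so errors accumulate over the horizon. That may be one reason long-range forecasts such as the ones above should be read with caution.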

6. Save the model
# Save model parameters
torch.save(model.net.state_dict(), "./data/model_parameter.pkl")
net_clone = Net()
net_clone.load_state_dict(torch.load("./data/model_parameter.pkl"))
model_clone = torchkeras.Model(net_clone)
model_clone.compile(loss_func = mspe)
# Evaluation model
model_clone.evaluate(dl_train)
Tip
Here is an important detail:
net_clone = Net()
net_clone.load_state_dict(torch.load("./data/model_parameter.pkl"))
model_clone = torchkeras.Model(net_clone)
The order of these steps cannot be changed, otherwise an error is raised. In fact, training and saving work just the same without torchkeras.
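
The save and load round trip can be checked in isolation with plain PyTorch. This is a minimal sketch under stated assumptions: a small nn.Linear stands in for Net, and a temporary directory replaces the ./data path used above.

```python
import os
import tempfile

import torch
from torch import nn

net = nn.Linear(3, 3)  # stand-in for the article's Net

# Save only the parameters (the state_dict), not the whole object
path = os.path.join(tempfile.mkdtemp(), "model_parameter.pkl")
torch.save(net.state_dict(), path)

# 1) rebuild the same architecture, 2) load the parameters into it
net_clone = nn.Linear(3, 3)
net_clone.load_state_dict(torch.load(path))

# The clone should now give the same outputs as the original
x = torch.rand(2, 3)
same = torch.allclose(net(x), net_clone(x))
print(same)
```

Any wrapper, such as torchkeras.Model in this article, is applied after the parameters have been loaded into the bare network.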
Summary
Data preprocessing is simple: use the previous 8 days of data to predict the next day's data.
When building the model, the input and output shapes of the LSTM are very important.

When using torchkeras, pay attention to the order in which the model is loaded.
There is one part here that I don't understand yet.

I don't understand why this layer is designed the way it is.
I'm leaving this as an open question to fill in later.