[Quietly Outpacing Your Peers: PyTorch in 20 Days] - [Day 4] - [Example of the Time Series Data Modeling Workflow]
2022-06-30 05:43:00 【aJupyter】
Systematic tutorial: master PyTorch in 20 days
Recently I have been doing a small daily check-in with Brother Zhong and Huige: 20 days of PyTorch. This is day 4. Likes, favorites, and shares are welcome.
The COVID-19 outbreak of 2020 has affected many aspects of life for people in every country.
For some students the impact was on income, for some on relationships, for some on mental health, and for others on body weight.
Based on China's epidemic data up to March 2020, this article builds a time-series RNN model to predict when the COVID-19 outbreak in China will end.
import os
import datetime
import torchkeras

# Print a separator bar followed by the current time
def printbar():
    nowtime = datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')
    print("\n"+"=========="*8 + "%s"%nowtime)

# On macOS, running pytorch and matplotlib together in Jupyter requires changing this environment variable
os.environ["KMP_DUPLICATE_LIB_OK"]="TRUE"
1. Prepare the data
The dataset used in this article comes from tushare.
Dataset overview:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
%config InlineBackend.figure_format = 'svg'
df = pd.read_csv("/home/mw/input/data6936/eat_pytorch_data/data/covid-19.csv",sep = "\t")
df.plot(x = "date",y = ["confirmed_num","cured_num","dead_num"],figsize=(10,6))
plt.xticks(rotation=60) # Rotate the x-axis tick labels by 60°

dfdata = df.set_index("date")
dfdiff = dfdata.diff(periods=1).dropna() # Take the first-order difference, then drop NaN rows (only the first row is NaN)
dfdiff = dfdiff.reset_index("date") # Turn the date index back into a regular column
dfdiff.plot(x = "date",y = ["confirmed_num","cured_num","dead_num"],figsize=(10,6))
plt.xticks(rotation=60)
dfdiff = dfdiff.drop("date",axis = 1).astype("float32") # Drop the date column and convert to float32
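
To make the differencing step concrete, here is a minimal sketch in plain Python of what diff(periods=1) followed by dropna() computes for a single column, and how the cumulative counts can be recovered from the differences. The numbers are made up for illustration and are not taken from the dataset.

```python
from itertools import accumulate

# Hypothetical cumulative confirmed counts for five consecutive days
cumulative = [10, 14, 21, 33, 40]

# First-order difference: each day's value minus the previous day's value.
# The first day has no predecessor, which is why pandas produces NaN there
# and dropna() removes that row.
daily_new = [today - yesterday for yesterday, today in zip(cumulative, cumulative[1:])]
print(daily_new)  # [4, 7, 12, 7]

# Inverting the difference: start from the first cumulative value and
# accumulate the daily changes.
restored = list(accumulate(daily_new, initial=cumulative[0]))
print(restored)  # [10, 14, 21, 33, 40]
```

The model in this article is trained on the differenced series, so its predictions are daily new counts rather than cumulative totals.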

tips:
df = pd.DataFrame({
'month': [1, 4, 7, 10],
'year': [2012, 2014, 2013, 2014],
'sale': [55, 40, 84, 31]})
# Set a single column as an index
df.set_index('month')
'''
       year  sale
month
1      2012    55
4      2014    40
7      2013    84
10     2014    31
'''

Next we subclass torch.utils.data.Dataset to implement a custom time series dataset.
torch.utils.data.Dataset is an abstract class. To load custom data, you only need to subclass it and override two methods:
__len__: implements len(dataset) and returns the size of the whole dataset.
__getitem__: fetches the data at a given index, so that dataset[i] returns the i-th sample in the dataset.
If these two methods are not overridden, using the dataset raises an error.
import torch
from torch import nn
from torch.utils.data import Dataset,DataLoader,TensorDataset

# Use the data from the 8 days before a given day as the input window to predict that day's data
WINDOW_SIZE = 8

class Covid19Dataset(Dataset):
    def __len__(self):
        return len(dfdiff) - WINDOW_SIZE

    def __getitem__(self,i):
        # .loc slicing is label-based and inclusive, so this selects 8 rows
        x = dfdiff.loc[i:i+WINDOW_SIZE-1,:]
        feature = torch.tensor(x.values)
        y = dfdiff.loc[i+WINDOW_SIZE,:]
        label = torch.tensor(y.values)
        return (feature,label)

ds_train = Covid19Dataset()

# The dataset is small, so all the training data can go into a single batch to improve performance
dl_train = DataLoader(ds_train,batch_size = 38)
Data processing summary
- Time series modeling uses data from earlier time steps to predict data at later time steps
- Take the first-order difference of the data, drop the NaN values, and build the dataset (each sample uses the previous eight days of data as input and the following day as the label)
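
The windowing logic can be checked on synthetic data. The sketch below is a hypothetical variant of the dataset class (the name WindowDataset and its constructor arguments are not from the original code) that receives the series as a tensor instead of reading the global dfdiff. The 46-row length is an assumption chosen so that a single batch of 38 covers all samples.

```python
import torch
from torch.utils.data import Dataset, DataLoader

class WindowDataset(Dataset):
    """Hypothetical variant: the series is passed in instead of read from a global."""

    def __init__(self, series, window_size):
        self.series = series            # tensor of shape (num_days, num_features)
        self.window_size = window_size

    def __len__(self):
        # Each sample needs window_size input rows plus one label row
        return len(self.series) - self.window_size

    def __getitem__(self, i):
        feature = self.series[i:i + self.window_size]   # rows i .. i+window_size-1
        label = self.series[i + self.window_size]       # the row right after the window
        return feature, label

# Synthetic stand-in for dfdiff: 46 days, 3 columns (assumed sizes)
series = torch.arange(46 * 3, dtype=torch.float32).reshape(46, 3)
ds = WindowDataset(series, window_size=8)

features, labels = next(iter(DataLoader(ds, batch_size=38)))
print(len(ds), features.shape, labels.shape)
```

With these sizes the loader yields one batch whose features have shape (38, 8, 3) and whose labels have shape (38, 3), which matches the (batch, sequence, feature) layout expected by an LSTM created with batch_first = True.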
2. Define the model
There are usually three ways to build a model in PyTorch:
- Use nn.Sequential to build the model layer by layer
- Subclass nn.Module to build a custom model
- Subclass nn.Module and use model containers to help encapsulate parts of the model.
Here we choose the second way.
Because the training loop below is written in class form, we further wrap the model in the Model class from torchkeras to get a Keras-like high-level model interface.
The Model class itself inherits from nn.Module.
import torch
from torch import nn
import importlib
import torchkeras

torch.random.seed()

class Block(nn.Module):
    def __init__(self):
        super(Block,self).__init__()

    def forward(self,x,x_input):
        # Scale the last day's values by (1 + x) and clamp the result at zero
        x_out = torch.max((1+x)*x_input[:,-1,:],torch.tensor(0.0))
        return x_out

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # Stacked LSTM (num_layers = 5 in this code)
        self.lstm = nn.LSTM(input_size = 3,hidden_size = 3,num_layers = 5,batch_first = True)
        self.linear = nn.Linear(3,3)
        self.block = Block()

    def forward(self,x_input):
        x = self.lstm(x_input)[0][:,-1,:] # Keep only the output at the last time step, dropping the sequence-length dimension
        x = self.linear(x)
        y = self.block(x,x_input)
        return y

net = Net()
model = torchkeras.Model(net) # Subtle detail: the network is wrapped in torchkeras.Model
print(model)
model.summary(input_shape=(8,3),input_dtype = torch.FloatTensor)
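
The tensor shapes inside Net.forward are easy to get wrong, so the following sketch prints them for a random batch and then evaluates the Block formula on hand-picked numbers. The input values are made up. Reading the linear layer's output as a predicted relative change applied to the last day's values is an interpretation of the code, not something the original article states.

```python
import torch
from torch import nn

lstm = nn.LSTM(input_size=3, hidden_size=3, num_layers=5, batch_first=True)
x_input = torch.rand(2, 8, 3)      # (batch, sequence length, features)

output, (h_n, c_n) = lstm(x_input)
print(output.shape)                # (2, 8, 3): top layer's hidden state at every time step
print(h_n.shape)                   # (5, 2, 3): final hidden state of each of the 5 layers

last_step = output[:, -1, :]       # what Net.forward passes on to the linear layer
print(last_step.shape)             # (2, 3)

# Block arithmetic with hand-picked numbers
x = torch.tensor([[0.10, -0.50, -2.00]])         # stand-in for the linear layer's output
last_day = torch.tensor([[100.0, 40.0, 5.0]])    # stand-in for x_input[:, -1, :]
y = torch.max((1 + x) * last_day, torch.tensor(0.0))
print(y)                           # approximately [[110., 20., 0.]]
```

Under this reading, a value of 0.10 means "10% more than yesterday", -0.50 means "half of yesterday", and anything below -1 would give a negative count, which the clamp replaces with 0.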

3. Train the model
Training in PyTorch usually requires the user to write a custom training loop, and the code style of the training loop varies from person to person.
There are three typical styles of training loop code: script form, function form, and class form.
Here we use the class form.
Following Keras, a high-level model interface Model is defined that implements the fit, validate, predict, and summary methods, which amounts to a user-defined high-level API.
Note: recurrent neural networks are hard to tune. You need to try several different learning rates, often many times, to get good results.
def mspe(y_pred,y_true):
    err_percent = (y_true - y_pred)**2/(torch.max(y_true**2,torch.tensor(1e-7)))
    return torch.mean(err_percent)

model.compile(loss_func = mspe,optimizer = torch.optim.Adagrad(model.parameters(),lr = 0.1))
dfhistory = model.fit(100,dl_train,log_step_freq=10)
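
As a quick sanity check of the loss, here is a small worked example of the mean squared percentage error on made-up tensors. The function body repeats the definition above so that the snippet runs on its own.

```python
import torch

def mspe(y_pred, y_true):
    # Squared error relative to the squared true value. The floor of 1e-7 on
    # the denominator avoids division by zero when y_true is 0.
    err_percent = (y_true - y_pred) ** 2 / torch.max(y_true ** 2, torch.tensor(1e-7))
    return torch.mean(err_percent)

y_true = torch.tensor([100.0, 50.0])
y_pred = torch.tensor([90.0, 55.0])

# Both predictions are off by 10%, so each term is 0.1 ** 2 = 0.01
loss = mspe(y_pred, y_true)
print(loss.item())

# When the true value is 0 the denominator falls back to 1e-7,
# so even a small absolute error produces a very large loss term
zero_case = mspe(torch.tensor([1.0]), torch.tensor([0.0]))
print(zero_case.item())
```

Because the error is measured relative to the true value, days with small true counts weigh as much as days with large ones, and days with a true count of 0 can dominate the loss.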
4. Evaluate the model
Normally a validation set or test set should be set aside to evaluate the model. Because this example has very little data, we only visualize how the loss evolves on the training set.
%matplotlib inline
%config InlineBackend.figure_format = 'svg'
import matplotlib.pyplot as plt

def plot_metric(dfhistory, metric):
    train_metrics = dfhistory[metric]
    epochs = range(1, len(train_metrics) + 1)
    plt.plot(epochs, train_metrics, 'bo--')
    plt.title('Training '+ metric)
    plt.xlabel("Epochs")
    plt.ylabel(metric)
    plt.legend(["train_"+metric])
    plt.show()

plot_metric(dfhistory,"loss")

5. Use the model
Here we use the model to predict when the epidemic will end, that is, the day on which the number of newly confirmed cases reaches 0.
# Use dfresult to record the existing data and the epidemic data predicted afterwards
dfresult = dfdiff[["confirmed_num","cured_num","dead_num"]].copy()
dfresult.tail()

# Predict the daily new counts for the next 500 days and append the results to dfresult
for i in range(500):
    arr_input = torch.unsqueeze(torch.from_numpy(dfresult.values[-38:,:]),axis=0)
    arr_predict = model.forward(arr_input)
    dfpredict = pd.DataFrame(torch.floor(arr_predict).data.numpy(),
                             columns = dfresult.columns)
    # DataFrame.append was removed in pandas 2.0; pd.concat does the same job
    dfresult = pd.concat([dfresult,dfpredict],ignore_index=True)
tips:
torch.unsqueeze(torch.from_numpy(dfresult.values[-38:,:]),axis=0) adds a new dimension at position 0 (the batch dimension)
torch.floor rounds down to the nearest integer
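
The prediction loop above is a recursive forecast: every predicted day is appended to the history and becomes part of the input for the next prediction. The sketch below shows the same pattern in plain Python. The predict_next function is a hypothetical toy stand-in for the trained model, and the starting numbers are made up.

```python
def predict_next(history):
    """Toy stand-in for the model: tomorrow is 80% of today, rounded down."""
    return history[-1] * 4 // 5

history = [50, 40, 32]     # made-up daily new cases observed so far

for _ in range(30):
    # The prediction is appended and is used as input in the next iteration
    history.append(predict_next(history))

# Position of the first day with 0 new cases
first_zero = history.index(0)
print(history[:15])  # [50, 40, 32, 25, 20, 16, 12, 9, 7, 5, 4, 3, 2, 1, 0]
print(first_zero)    # 14
```

Because each step feeds on earlier predictions, errors accumulate over the horizon, so long-range results from this kind of loop should be read with caution.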
dfresult.query("confirmed_num==0").head()
# Newly confirmed cases drop to 0 on day 50. Day 45 corresponds to March 10, so 5 days later, on March 15, newly confirmed cases are expected to reach 0
# Note: this forecast is on the optimistic side

dfresult.query("cured_num==0").head()
# Newly cured cases drop to 0 on day 186, that is, about 141 days (a little under five months) after March 10.
# Note: this forecast is on the pessimistic side and has a problem: adding up the predicted number of newly cured people per day exceeds the cumulative number of confirmed cases.

6. Save the model
# Save the model parameters
torch.save(model.net.state_dict(), "./data/model_parameter.pkl")
# Load the parameters into a fresh network, then wrap it
net_clone = Net()
net_clone.load_state_dict(torch.load("./data/model_parameter.pkl"))
model_clone = torchkeras.Model(net_clone)
model_clone.compile(loss_func = mspe)
# Evaluate the model
model_clone.evaluate(dl_train)
tips
There is one subtle detail here:
net_clone = Net()
net_clone.load_state_dict(torch.load("./data/model_parameter.pkl"))
model_clone = torchkeras.Model(net_clone)
The order cannot be reversed, otherwise an error is raised: load the parameters into the plain Net first, then wrap it in torchkeras.Model. In fact, you can train and save the model the same way without torchkeras.
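
A likely reason for the ordering constraint is the naming of state_dict keys: the parameters were saved from model.net, so their keys carry no "net." prefix, while a wrapper that stores the network as self.net exposes them with that prefix. The sketch below reproduces the effect with a hypothetical Wrapper class standing in for torchkeras.Model and a small linear layer standing in for Net.

```python
import torch
from torch import nn

class Wrapper(nn.Module):
    """Hypothetical stand-in for a wrapper that stores the network as self.net."""

    def __init__(self, net):
        super().__init__()
        self.net = net

net = nn.Linear(3, 3)
wrapped = Wrapper(net)

print(list(net.state_dict().keys()))      # ['weight', 'bias']
print(list(wrapped.state_dict().keys()))  # ['net.weight', 'net.bias']

# Loading the inner network's parameters into the wrapper fails on key names
try:
    wrapped.load_state_dict(net.state_dict())
    mismatch = False
except RuntimeError:
    mismatch = True
print(mismatch)  # True

# Loading into the bare network first and wrapping afterwards works
net_clone = nn.Linear(3, 3)
net_clone.load_state_dict(net.state_dict())
wrapped_clone = Wrapper(net_clone)
```

The alternative is to save and load the wrapper's own state_dict consistently, so that both sides use the prefixed keys.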
Summary
Data preprocessing is straightforward: use the previous 8 days of data to predict the next day's data.
When building the model, the input and output shapes of the LSTM are very important.

When using torchkeras, pay attention to the order in which the model is loaded.
There is one point here that I don't understand:

I don't understand why this layer is designed the way it is.
I am leaving this as an open question to fill in later.
